Commit 8b0544b

remove emoji identifiers
Some Unicode characters or character groups greatly increase the size of the generated parser. This change halves the size of the generated parser file.
1 parent 73d1539 commit 8b0544b

2 files changed: +2 -5 lines changed

grammar.js

Lines changed: 2 additions & 1 deletion
@@ -894,7 +894,8 @@ module.exports = grammar({
   // Some symbols in Sm and So unicode categories that are identifiers
   const validMathSymbols = '°∀-∇∎-∑∫-∳';

-  const start = `[_\\p{XID_Start}${validMathSymbols}\\p{Emoji}&&[^0-9#*]]`;
+  // Emojis are currently not supported because they double the parser size
+  const start = `[_\\p{XID_Start}${validMathSymbols}&&[^0-9#*]]`;
   const rest = `[^"'\`\\s\\.\\-\\[\\]${nonIdentifierCharacters}]*`;
   return new RegExp(start + rest);
 },
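
For illustration, the effect of dropping `\p{Emoji}` from the `start` class can be sketched in plain JavaScript. This is only an approximation under stated assumptions: the names `startWithEmoji`, `startWithoutEmoji`, and `emojiMatchesDigit` are hypothetical and do not appear in grammar.js, and a negative lookahead stands in for the `&&[^0-9#*]` intersection, which JavaScript regexes only accept under the `v` flag. The sketch also shows a likely reason for that exclusion: the Unicode `Emoji` property covers the ASCII digits, `#`, and `*`.

```javascript
// Sketch only: approximates the `start` class from grammar.js with plain JS regexes.
// All names below except `validMathSymbols` are illustrative, not from the grammar.
const validMathSymbols = '°∀-∇∎-∑∫-∳';

// Before the change: emoji allowed; the lookahead mimics `&&[^0-9#*]`.
const startWithEmoji = new RegExp(
  `^(?![0-9#*])[_\\p{XID_Start}${validMathSymbols}\\p{Emoji}]$`, 'u');

// After the change: \p{Emoji} is no longer part of the class.
const startWithoutEmoji = new RegExp(
  `^[_\\p{XID_Start}${validMathSymbols}]$`, 'u');

// The Emoji property includes keycap bases such as digits, '#', and '*'.
const emojiMatchesDigit = /\p{Emoji}/u.test('1');

const crab = '\u{1F980}'; // the crab emoji removed from the test corpus
const samples = ['x', '\u0177', '\u03F5', '\u2211', crab, '1', '#'];
for (const s of samples) {
  // Columns: character, accepted before the change, accepted after the change.
  console.log(JSON.stringify(s), startWithEmoji.test(s), startWithoutEmoji.test(s));
}
```

Letters and the listed math symbols are accepted by both variants; only the emoji sample changes from accepted to rejected, which matches the two entries removed from the test corpus.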

test/corpus/expressions.txt

Lines changed: 0 additions & 4 deletions
@@ -11,8 +11,6 @@ x′
 logŷ
 ϵ
 ŷ
-🙋
-🦀

 ---

@@ -27,8 +25,6 @@ logŷ
 (identifier)
 (identifier)
 (identifier)
-(identifier)
-(identifier)
 (identifier))
