@@ -13,45 +13,61 @@ double-colon `::`).
1313
1414## Source Text
1515
16- SourceCharacter :: "Any Unicode character"
16+ SourceCharacter :: /[\u0009\u000A\u000D\u0020-\uFFFF]/
1717
1818GraphQL documents are expressed as a sequence of
1919[Unicode](http://unicode.org/standard/standard.html) characters. However, with
20- few exceptions, most of GraphQL is expressed only in the original ASCII range
21- so as to be as widely compatible with as many existing tools, languages, and
22- serialization formats as possible. Other than within comments, Non-ASCII Unicode
23- characters are only found within {StringValue}.
20+ few exceptions, most of GraphQL is expressed only in the original non-control
21+ ASCII range, both to remain compatible with as many existing tools, languages,
22+ and serialization formats as possible and to avoid display issues in text
23+ editors and source control.
24+
25+
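As an illustration of the {SourceCharacter} pattern above, the following Python sketch finds the first character of a document that falls outside the allowed range. The name `first_invalid_index` is hypothetical and not part of the specification. The sketch assumes the document has already been decoded into a Python `str`, so it checks code points rather than UTF-16 code units; under that assumption, characters above U+FFFF are reported as invalid.

```python
import re

# Mirrors the SourceCharacter pattern: tab, new line, carriage return,
# and everything from U+0020 through U+FFFF.
SOURCE_CHARACTER = re.compile(r"[\u0009\u000A\u000D\u0020-\uFFFF]")


def first_invalid_index(source: str) -> int:
    """Return the index of the first non-SourceCharacter, or -1 if none."""
    for index, char in enumerate(source):
        if not SOURCE_CHARACTER.fullmatch(char):
            return index
    return -1
```

For example, `first_invalid_index("{ a }")` returns `-1`, while a document containing a vertical tab (U+000B) or a NUL character returns the index of that character.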
26+ ### Unicode
27+
28+ UnicodeBOM :: "Byte Order Mark (U+FEFF)"
29+
30+ Non-ASCII Unicode characters may freely appear within {StringValue} and
31+ {Comment} portions of GraphQL.
32+
33+ The "Byte Order Mark" is a special Unicode character which may appear at the
34+ beginning of a file containing Unicode. Programs may use it to determine that
35+ the text stream is Unicode, what endianness the text stream is in, and which of
36+ several Unicode encodings to use when interpreting it.
2437
2538
2639### White Space
2740
2841WhiteSpace ::
2942 - "Horizontal Tab (U+0009)"
30- - "Vertical Tab (U+000B)"
31- - "Form Feed (U+000C)"
3243 - "Space (U+0020)"
33- - "No-break Space (U+00A0)"
3444
3545White space is used to improve legibility of source text and act as separation
3646between tokens, and any amount of white space may appear before or after any
3747token. White space between tokens is not significant to the semantic meaning of
3848a GraphQL query document; however, white space characters may appear within a
3949{String} or {Comment} token.
4050
51+ Note: GraphQL intentionally does not consider Unicode "Zs" category characters
52+ as white-space, avoiding misinterpretation by text editors and source
53+ control tools.
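A minimal Python sketch of the {WhiteSpace} rule as it stands after this change, assuming a lexer that classifies one character at a time. The names `WHITE_SPACE` and `is_white_space` are hypothetical. The sketch also checks that two "Zs" category characters, no-break space and em space, are not treated as white space.

```python
import unicodedata

# WhiteSpace per the grammar above: only horizontal tab and space.
WHITE_SPACE = frozenset({"\u0009", "\u0020"})


def is_white_space(char: str) -> bool:
    """Return True if `char` is a GraphQL WhiteSpace character."""
    return char in WHITE_SPACE


# No-break space (U+00A0) and em space (U+2003) belong to the Unicode "Zs"
# category but are not GraphQL white space.
for candidate in ("\u00A0", "\u2003"):
    assert unicodedata.category(candidate) == "Zs"
    assert not is_white_space(candidate)
```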
4154
4255### Line Terminators
4356
4457LineTerminator ::
4558 - "New Line (U+000A)"
46- - "Carriage Return (U+000D)"
47- - "Line Separator (U+2028)"
48- - "Paragraph Separator (U+2029)"
59+ - "Carriage Return (U+000D)" [ lookahead ! "New Line (U+000A)" ]
60+ - "Carriage Return (U+000D)" "New Line (U+000A)"
4961
5062Like white space, line terminators are used to improve the legibility of source
5163text. Any number may appear before or after any other token, and they have no
5264significance to the semantic meaning of a GraphQL query document. Line
5365terminators are not found within any other token.
5466
67+ Note: Any error report which provides the line number of the offending syntax
68+ in the source should count the preceding {LineTerminator} tokens to produce
69+ the line number.
70+
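The line-number note above can be sketched in Python as follows. This assumes 1-based line numbers and a character offset into the already-decoded source; both are assumptions for illustration, and `line_number` is a hypothetical helper. The alternation order makes a carriage return followed by a new line count as a single {LineTerminator}, matching the grammar.

```python
import re

# LineTerminator per the grammar above. "\r\n" is tried first, so a
# carriage return followed by a new line is counted once, not twice.
LINE_TERMINATOR = re.compile(r"\r\n|\r|\n")


def line_number(source: str, offset: int) -> int:
    """Return the 1-based line number of the character at `offset`."""
    # Count only the line terminators that end before `offset`.
    return 1 + len(LINE_TERMINATOR.findall(source, 0, offset))
```

With `source = "a\r\nb\rc\nd"`, the characters `a`, `b`, `c`, and `d` are reported on lines 1, 2, 3, and 4. U+2028 and U+2029 no longer advance the count.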
5571
5672### Comments
5773
@@ -101,9 +117,11 @@ defined here in a lexical grammar by patterns of source Unicode characters.
101117Tokens are later used as terminal symbols in GraphQL query document syntactic
102118grammars.
103119
120+
104121### Ignored Tokens
105122
106123Ignored ::
124+ - UnicodeBOM
107125 - WhiteSpace
108126 - LineTerminator
109127 - Comment
@@ -639,17 +657,46 @@ StringValue ::
639657
640658StringCharacter ::
641659 - SourceCharacter but not `"` or \ or LineTerminator
642- - \ EscapedUnicode
660+ - \u EscapedUnicode
643661 - \ EscapedCharacter
644662
645- EscapedUnicode :: u /[0-9A-Fa-f]{4}/
663+ EscapedUnicode :: /[0-9A-Fa-f]{4}/
646664
647665EscapedCharacter :: one of `"` \ `/` b f n r t
648666
649- Strings are lists of characters wrapped in double-quotes `"`. (ex.
667+ Strings are sequences of characters wrapped in double-quotes (`"`). (ex.
650668`"Hello World"`). White space and other otherwise-ignored characters are
651669significant within a string value.
652670
671+ Note: Unicode characters are allowed within String value literals; however,
672+ GraphQL source must not contain some ASCII control characters, so escape
673+ sequences must be used to represent these characters.
674+
675+ **Semantics**
676+
677+ StringValue :: `""`
678+
679+ * Return an empty Unicode character sequence.
680+
681+ StringValue :: `"` StringCharacter+ `"`
682+
683+ * Return the Unicode character sequence of all {StringCharacter}
684+ Unicode character values.
685+
686+ StringCharacter :: SourceCharacter but not `"` or \ or LineTerminator
687+
688+ * Return the character value of {SourceCharacter}.
689+
690+ StringCharacter :: \u EscapedUnicode
691+
692+ * Return the character value represented by the UTF-16 hexadecimal
693+ identifier {EscapedUnicode}.
694+
695+ StringCharacter :: \ EscapedCharacter
696+
697+ * Return the character value of {EscapedCharacter}.
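The semantics above can be sketched as a small Python decoder. This is only an illustration under stated assumptions: the name `string_value` is hypothetical, the input is assumed to be a complete string literal including its quotes, and the mapping of `b f n r t` to backspace, form feed, new line, carriage return, and tab follows the JSON convention, which the text above does not spell out. A `\u` escape in the surrogate range yields a lone surrogate in the resulting Python `str`.

```python
# Assumed JSON-style meanings for the single-character escapes.
ESCAPED_CHARACTERS = {
    '"': '"', "\\": "\\", "/": "/",
    "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t",
}
HEX_DIGITS = frozenset("0123456789abcdefABCDEF")


def string_value(literal: str) -> str:
    """Return the character sequence denoted by a quoted string literal."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("string literal must be wrapped in double quotes")
    body = literal[1:-1]
    result = []
    i = 0
    while i < len(body):
        char = body[i]
        if char in '"\r\n':
            # StringCharacter excludes an unescaped quote and LineTerminator.
            raise ValueError(f"unexpected character at index {i}")
        if char != "\\":
            result.append(char)  # plain SourceCharacter
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        escape = body[i + 1]
        if escape == "u":
            digits = body[i + 2:i + 6]  # EscapedUnicode: four hex digits
            if len(digits) != 4 or not set(digits) <= HEX_DIGITS:
                raise ValueError(f"bad unicode escape at index {i}")
            result.append(chr(int(digits, 16)))
            i += 6
        elif escape in ESCAPED_CHARACTERS:
            result.append(ESCAPED_CHARACTERS[escape])
            i += 2
        else:
            raise ValueError(f"unknown escape at index {i}")
    return "".join(result)
```

For example, `string_value('""')` returns the empty string and `string_value('"Hello World"')` returns `Hello World`.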
698+
699+
653700#### Enum Value
654701
655702EnumValue : Name but not ` true ` , ` false ` or ` null `