Merged
10 changes: 8 additions & 2 deletions spec/fluent.ebnf
@@ -1,6 +1,6 @@

/* An FTL file defines a Resource consisting of Entries. */
-Resource ::= (Entry | blank_block | junk_line)*
+Resource ::= (Entry | blank_block | Junk)*

/* Entries are the main building blocks of Fluent. They define translations and
* contextual and semantic information about the translations. During the AST
Expand All @@ -18,7 +18,13 @@ Term ::= "-" Identifier blank_inline? "=" blank_inline? Value Att
* the AST construction. */
CommentLine ::= ("###" | "##" | "#") ("\u0020" /.*/)? line_end

-/* Adjacent junk_lines are joined into FTL.Junk during the AST construction. */
+/* Junk represents unparsed content.
+ *
+ * Junk is parsed line-by-line until a line is found which looks like it might
+ * be the beginning of a new message, term, or comment. Any whitespace
+ * following a broken Entry is also considered part of Junk.
+ */
+Junk ::= junk_line (junk_line - "#" - "-" - [a-zA-Z])*
Contributor:
This rendering to ebnf doesn't make any sense, right?

Contributor (author):
I think it does. We're using the EBNF syntax as defined in the XML spec, with an extension of allowing regexes in a few places. The XML one reads:

 A - B
    matches any string that matches A but does not match B.

So `junk_line - "#"` matches a line of junk which doesn't start with a `#`. I think that's exactly what we want to say here.

Contributor:

To me, `-` is usefully defined only on single-character productions in the XML spec. Or, taking "matches any string that matches A but does not match B" literally: `# foo\n` does match `junk_line`, but it doesn't match `"#"`, so it's a junk line.

Contributor (author) @stasm commented on Nov 8, 2018:
> `# foo\n` does match `junk_line`, but it doesn't match `"#"`, so it's a junk line.

Can you rephrase this please?

To me, the EBNF in this PR clearly expresses the intent. This is already a slippery slope because we're trying to define how to parse unparsed content. I don't want to overthink it. I'm also not sure how you'd like to write this differently. I could look at a PR if you'd like to prepare one :)

Contributor:

I just took the literal quote from the XML spec, replaced A and B with `junk_line` and `"#"`, respectively, and tested that against a candidate line like `# foo\n`.

I know that you don't believe in the value of the EBNF, and I don't care much about this one either.

Contributor (author):

> I just took the literal quote from the XML spec, replaced A and B with `junk_line` and `"#"`, respectively, and tested that against a candidate line like `# foo\n`.

Ah, I see what you mean, thanks. `"#"` matches `# foo\n` partially, and that's enough for a negative lookahead to work here. I guess we could try to refactor this into something like `sequence(and(not("#"), any_char), junk_line)`, but it would require special handling of blank lines inside of junk (`any_char` doesn't parse newlines). All in all, I favor the expressiveness of the approach I implemented in this PR.
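The lookahead subtlety discussed in this thread can be sketched in a few lines. This is a hypothetical illustration, not code from the PR; `junkLine` and both "reading" variables are assumed names.

```javascript
// Contrast the two readings of `junk_line - "#"` from the XML-style EBNF.
const junkLine = /^[^\n]*\n?$/;  // any single line, per the junk_line rule

const line = "# foo\n";

// Literal XML-spec reading: the candidate must match junk_line but must not
// (fully) match "#". "# foo\n" is not the string "#", so it would still
// count as junk under this reading.
const literalReading = junkLine.test(line) && line !== "#";

// Intended reading: "#" acts as a negative lookahead on the first character,
// so a line starting with "#" terminates the junk run.
const intendedReading = junkLine.test(line) && !line.startsWith("#");

console.log(literalReading);  // true
console.log(intendedReading); // false
```

The two readings disagree exactly on lines like `# foo\n`, which is the point of contention above.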

junk_line ::= /[^\n]*/ ("\u000A" | EOF)

/* Attributes of Messages and Terms. */
4 changes: 0 additions & 4 deletions syntax/abstract.mjs
@@ -57,7 +57,6 @@ export function list_into(Type) {
     always(new FTL.Resource(
         entries
         .reduce(join_adjacent(
-            FTL.Junk,
             FTL.Comment,
             FTL.GroupComment,
             FTL.ResourceComment), [])
@@ -155,9 +154,6 @@ function join_of_type(Type, ...elements) {
         case FTL.ResourceComment:
             return elements.reduce((a, b) =>
                 new Type(a.content + "\n" + b.content));
-        case FTL.Junk:
-            return elements.reduce((a, b) =>
-                new Type(a.content + b.content));
     }
 }

27 changes: 22 additions & 5 deletions syntax/grammar.mjs
@@ -16,7 +16,7 @@ let Resource = defer(() =>
     either(
         Entry,
         blank_block,
-        junk_line))
+        Junk))
     .chain(list_into(FTL.Resource)));

/* ------------------------------------------------------------------------- */
@@ -84,16 +84,33 @@ let CommentLine = defer(() =>
.map(keep_abstract)
.chain(list_into(FTL.Comment)));

-/* ------------------------------------------------------------------------- */
-/* Adjacent junk_lines are joined into FTL.Junk during the AST construction. */
+/* -------------------------------------------------------------------------- */
+/* Junk represents unparsed content.
+ *
+ * Junk is parsed line-by-line until a line is found which looks like it might
+ * be the beginning of a new message, term, or comment. Any whitespace
+ * following a broken Entry is also considered part of Junk.
+ */
+let Junk = defer(() =>
+    sequence(
+        junk_line,
+        repeat(
+            and(
+                not(charset("a-zA-Z")),
+                not(string("-")),
+                not(string("#")),
+                junk_line)))
+    .map(flatten(1))
+    .map(join)
+    .chain(into(FTL.Junk)));

 let junk_line =
     sequence(
         regex(/[^\n]*/),
         either(
             string("\u000A"),
             eof()))
-    .map(join)
-    .chain(into(FTL.Junk));
+    .map(join);

/* --------------------------------- */
/* Attributes of Messages and Terms. */
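The new Junk rule can be approximated with a plain line-by-line sketch. This is a simplification for illustration, not the actual parser-combinator implementation; `parseJunk` and `startsNewEntry` are assumed names.

```javascript
// Simplified sketch of the Junk rule: take one junk line, then absorb
// following lines that do not look like the start of a message (letter),
// term ("-"), or comment ("#"). Blank lines join the current Junk.
function parseJunk(lines, start) {
    const startsNewEntry = (line) => /^[A-Za-z#-]/.test(line);
    let content = lines[start];
    let i = start + 1;
    while (i < lines.length && !startsNewEntry(lines[i])) {
        content += lines[i];
        i += 1;
    }
    return { junk: { type: "Junk", content }, next: i };
}

const lines = ["err01 = {1x}\n", "err02 = {2x}\n", "\n"];
const first = parseJunk(lines, 0);
const second = parseJunk(lines, first.next);
console.log(first.junk.content);  // "err01 = {1x}\n"
console.log(second.junk.content); // "err02 = {2x}\n\n"
```

Note how `err02` starts a new Junk (two adjacent Junks are no longer joined), while the trailing blank line is absorbed into it, matching the updated junk.json fixture below.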
4 changes: 2 additions & 2 deletions test/fixtures/astral.json
@@ -154,7 +154,7 @@
{
"type": "Junk",
"annotations": [],
"content": "err-😂 = Value\n"
"content": "err-😂 = Value\n\n"
},
{
"type": "Comment",
@@ -163,7 +163,7 @@
{
"type": "Junk",
"annotations": [],
"content": "err-invalid-expression = { 😂 }\n"
"content": "err-invalid-expression = { 😂 }\n\n"
},
{
"type": "Comment",
20 changes: 15 additions & 5 deletions test/fixtures/call_expressions.json
@@ -70,7 +70,7 @@
{
"type": "Junk",
"annotations": [],
"content": "mixed-case-callee = {Function()}\n"
"content": "mixed-case-callee = {Function()}\n\n"
},
{
"type": "Comment",
@@ -88,7 +88,7 @@
{
"type": "Junk",
"annotations": [],
"content": "variable-callee = {$variable()}\n"
"content": "variable-callee = {$variable()}\n\n"
},
{
"type": "GroupComment",
@@ -323,7 +323,7 @@
{
"type": "Junk",
"annotations": [],
"content": "shuffled-args = {FUN(1, x: 1, \"a\", y: \"Y\", msg)}\n"
"content": "shuffled-args = {FUN(1, x: 1, \"a\", y: \"Y\", msg)}\n\n"
Contributor:

Asking about these ^^^

Contributor (author):

Great question! I'm not happy with the answer, but the good news is that this PR will help in making this right.

The reference fixtures in fluent-syntax are copied directly from the reference parser's tests. They're not generated by the tooling parser like the other kinds of fixtures. I've taken a note to document this in a README.

So what you're seeing in fluent-syntax/test/fixtures_reference are actually the reference parser's fixtures from Syntax 0.7.

The actual output of the tooling parser includes those trailing blank lines. You can verify that in the Playground. The fixtures_reference tests pass in fluent-syntax because the test runner explicitly ignores Junk due to its being parsed differently in Syntax 0.7. With this PR, we're getting much closer to being able to test junk too :)
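The Junk-ignoring comparison described above might look something like this sketch; the `withoutJunk` helper and the sample trees are assumptions for illustration, not the actual fluent-syntax test runner code.

```javascript
// Hypothetical sketch: strip Junk entries from a parsed Resource before
// comparing it against a reference fixture, so differences in how junk is
// segmented (Syntax 0.7 vs this PR) do not fail the test.
function withoutJunk(resource) {
    return {
        ...resource,
        body: resource.body.filter((entry) => entry.type !== "Junk"),
    };
}

const actual = { type: "Resource", body: [
    { type: "Message", id: "key01" },
    { type: "Junk", annotations: [], content: "err01 = {1x}\n" },
] };
const expected = { type: "Resource", body: [
    { type: "Message", id: "key01" },
] };

console.log(
    JSON.stringify(withoutJunk(actual)) === JSON.stringify(expected)); // true
```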

},
{
"type": "Comment",
@@ -332,7 +332,7 @@
{
"type": "Junk",
"annotations": [],
"content": "duplicate-named-args = {FUN(x: 1, x: \"X\")}\n"
"content": "duplicate-named-args = {FUN(x: 1, x: \"X\")}\n\n\n"
},
{
"type": "GroupComment",
@@ -1063,7 +1063,17 @@
         {
             "type": "Junk",
             "annotations": [],
-            "content": "one-argument = {FUN(1,,)}\nmissing-arg = {FUN(,)}\nmissing-sparse-arg = {FUN( , )}\n"
+            "content": "one-argument = {FUN(1,,)}\n"
+        },
+        {
+            "type": "Junk",
+            "annotations": [],
+            "content": "missing-arg = {FUN(,)}\n"
+        },
+        {
+            "type": "Junk",
+            "annotations": [],
+            "content": "missing-sparse-arg = {FUN( , )}\n\n\n"
         },
{
"type": "GroupComment",
2 changes: 1 addition & 1 deletion test/fixtures/crlf.json
@@ -61,7 +61,7 @@
{
"type": "Junk",
"annotations": [],
"content": "err03 = { \"str\r\n"
"content": "err03 = { \"str\r\n\r\n"
},
{
"type": "Comment",
2 changes: 1 addition & 1 deletion test/fixtures/escaped_characters.json
@@ -169,7 +169,7 @@
{
"type": "Junk",
"annotations": [],
"content": "unknown-escape = {\"\\x\"}\n"
"content": "unknown-escape = {\"\\x\"}\n\n"
},
{
"type": "GroupComment",
19 changes: 18 additions & 1 deletion test/fixtures/junk.ftl
@@ -1,4 +1,21 @@
+## Two adjacent Junks.
+err01 = {1x}
+err02 = {2x}
+
+# A single Junk.
+err03 = {1x
+2
+
+# A single Junk.
 ą=Invalid identifier
 ć=Another one
 
-key01 = {
+# The COMMENT ends this junk.
+err04 = {
+# COMMENT
+
+# The COMMENT ends this junk.
+# The closing brace is a separate Junk.
+err04 = {
+# COMMENT
+}
57 changes: 55 additions & 2 deletions test/fixtures/junk.json
@@ -1,15 +1,68 @@
 {
     "type": "Resource",
     "body": [
+        {
+            "type": "GroupComment",
+            "content": "Two adjacent Junks."
+        },
+        {
+            "type": "Junk",
+            "annotations": [],
+            "content": "err01 = {1x}\n"
+        },
+        {
+            "type": "Junk",
+            "annotations": [],
+            "content": "err02 = {2x}\n\n"
+        },
+        {
+            "type": "Comment",
+            "content": "A single Junk."
+        },
+        {
+            "type": "Junk",
+            "annotations": [],
+            "content": "err03 = {1x\n2\n\n"
+        },
+        {
+            "type": "Comment",
+            "content": "A single Junk."
+        },
         {
             "type": "Junk",
             "annotations": [],
-            "content": "ą=Invalid identifier\nć=Another one\n"
+            "content": "ą=Invalid identifier\nć=Another one\n\n"
         },
+        {
+            "type": "Comment",
+            "content": "The COMMENT ends this junk."
+        },
+        {
+            "type": "Junk",
+            "annotations": [],
+            "content": "err04 = {\n"
+        },
+        {
+            "type": "Comment",
+            "content": "COMMENT"
+        },
+        {
+            "type": "Comment",
+            "content": "The COMMENT ends this junk.\nThe closing brace is a separate Junk."
+        },
+        {
+            "type": "Junk",
+            "annotations": [],
+            "content": "err04 = {\n"
+        },
+        {
+            "type": "Comment",
+            "content": "COMMENT"
+        },
         {
             "type": "Junk",
             "annotations": [],
-            "content": "key01 = {\n"
+            "content": "}\n"
         }
     ]
 }
12 changes: 6 additions & 6 deletions test/fixtures/leading_dots.json
@@ -173,7 +173,7 @@
{
"type": "Junk",
"annotations": [],
"content": " .Continued\n"
"content": " .Continued\n\n"
},
{
"type": "Comment",
@@ -182,7 +182,7 @@
{
"type": "Junk",
"annotations": [],
"content": "key08 =\n .Value\n"
"content": "key08 =\n .Value\n\n"
},
{
"type": "Comment",
@@ -191,7 +191,7 @@
{
"type": "Junk",
"annotations": [],
"content": "key09 =\n .Value\n Continued\n"
"content": "key09 =\n .Value\n Continued\n\n"
},
{
"type": "Message",
@@ -410,7 +410,7 @@
{
"type": "Junk",
"annotations": [],
"content": "key16 =\n { 1 ->\n *[one]\n .Value\n }\n"
"content": "key16 =\n { 1 ->\n *[one]\n .Value\n }\n\n"
},
{
"type": "Comment",
@@ -419,7 +419,7 @@
{
"type": "Junk",
"annotations": [],
"content": "key17 =\n { 1 ->\n *[one] Value\n .Continued\n }\n"
"content": "key17 =\n { 1 ->\n *[one] Value\n .Continued\n }\n\n"
},
{
"type": "Comment",
@@ -428,7 +428,7 @@
{
"type": "Junk",
"annotations": [],
"content": "key18 =\n.Value\n"
"content": "key18 =\n.Value\n\n"
},
{
"type": "Message",
7 changes: 6 additions & 1 deletion test/fixtures/member_expressions.json
@@ -70,7 +70,12 @@
         {
             "type": "Junk",
             "annotations": [],
-            "content": "variant-expression = {msg[case]}\nattribute-expression = {-term.attr}\n"
+            "content": "variant-expression = {msg[case]}\n"
+        },
+        {
+            "type": "Junk",
+            "annotations": [],
+            "content": "attribute-expression = {-term.attr}\n"
         }
]
}
2 changes: 1 addition & 1 deletion test/fixtures/messages.json
@@ -234,7 +234,7 @@
{
"type": "Junk",
"annotations": [],
"content": "key07 =\n"
"content": "key07 =\n\n"
},
{
"type": "Comment",
4 changes: 2 additions & 2 deletions test/fixtures/mixed_entries.json
@@ -61,7 +61,7 @@
{
"type": "Junk",
"annotations": [],
"content": "ą=Invalid identifier\nć=Another one\n"
"content": "ą=Invalid identifier\nć=Another one\n\n"
},
{
"type": "Message",
@@ -91,7 +91,7 @@
{
"type": "Junk",
"annotations": [],
"content": " .attr = Dangling attribute\n"
"content": " .attr = Dangling attribute\n\n"
},
{
"type": "Message",
6 changes: 3 additions & 3 deletions test/fixtures/placeables.json
@@ -80,7 +80,7 @@
{
"type": "Junk",
"annotations": [],
"content": "unmatched-open1 = { 1\n"
"content": "unmatched-open1 = { 1\n\n"
},
{
"type": "Comment",
@@ -89,7 +89,7 @@
{
"type": "Junk",
"annotations": [],
"content": "unmatched-open2 = {{ 1 }\n"
"content": "unmatched-open2 = {{ 1 }\n\n"
},
{
"type": "Comment",
@@ -98,7 +98,7 @@
{
"type": "Junk",
"annotations": [],
"content": "unmatched-close1 = 1 }\n"
"content": "unmatched-close1 = 1 }\n\n"
},
{
"type": "Comment",