Parser improvements — including full Json compatibility! #72
Merged
Having the tests output the Json source under test when something fails makes these tests easier to work with.
Tackle a large group of untested `SKIP_NEEDS_INVESTIGATION` JsonSuite tests by verifying they can indeed be set to our tested state of `ACCEPT_N_FOR_SUPERSET`, enhancing coverage
Implement our rules for control characters in strings. These are based closely on Json's rule of disallowing all control characters (i.e. they may only be included in strings in escaped form). See https://www.rfc-editor.org/rfc/rfc8259.html#section-7 for more details. Our rules are similar, except we allow whitespace control characters to be embedded in our strings. Note that this also fixes a bug in the Lexer: we had hijacked the null byte to mean EOF, which caused strings that actually _contain_ the null byte to be parsed incorrectly.
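A minimal sketch of the control character rule described above, not Kson's actual lexer code. The helper name and the exact set of permitted whitespace controls (tab, line feed, carriage return) are assumptions made for illustration; RFC 8259 itself requires every character from U+0000 through U+001F to be escaped.

```python
# Hypothetical helper for illustration only; not part of Kson.
# Assumption: the whitespace controls allowed raw inside a string are tab, LF and CR.
ALLOWED_WHITESPACE_CONTROLS = {"\t", "\n", "\r"}


def find_illegal_control_chars(string_body: str) -> list[int]:
    """Return the offsets of raw control characters that would need escaping."""
    return [
        offset
        for offset, ch in enumerate(string_body)
        # U+0000..U+001F are the control characters RFC 8259 says must be escaped.
        if ord(ch) < 0x20 and ch not in ALLOWED_WHITESPACE_CONTROLS
    ]
```

Under these assumptions an embedded tab passes, while a raw null byte is reported as an ordinary illegal character rather than being mistaken for end of input.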
Inspired by b0c1f05, stop hijacking the null byte to mean EOF in our NumberParser. Note: because this was private to the scanner and not actually consulted to check for EOF, there wasn't a true bug here... yet. It's still a bad idea to overload the meaning of the null byte, so clean this up while we're thinking of it. Also rename `getCurrentChar` to `peek` while we're in there.
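One way to avoid overloading the null byte is to signal end of input out of band. The sketch below is illustrative only: the `Scanner` class, `advance`, and `scan_all` are hypothetical names, and only `peek` is taken from the commit message above.

```python
from typing import Optional


class Scanner:
    """Toy scanner that reports end of input as None instead of a sentinel character."""

    def __init__(self, source: str) -> None:
        self._source = source
        self._pos = 0

    def peek(self) -> Optional[str]:
        """Return the current character without consuming it, or None at end of input."""
        if self._pos >= len(self._source):
            return None
        return self._source[self._pos]

    def advance(self) -> Optional[str]:
        """Consume and return the current character, or None at end of input."""
        ch = self.peek()
        if ch is not None:
            self._pos += 1
        return ch


def scan_all(source: str) -> list[str]:
    """Collect every character, including any null bytes in the input."""
    scanner = Scanner(source)
    chars = []
    while (ch := scanner.advance()) is not None:
        chars.append(ch)
    return chars
```

Because end of input is represented by `None`, a source containing `"\x00"` is scanned to its real end instead of stopping early at the null byte.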
Add a configurable guard against excessive nesting in Kson to gracefully handle even the most egregious attempts to blow out our stack
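A nesting guard of this kind can be sketched as a depth counter threaded through a recursive-descent parser. This is a toy that only understands nested `[...]` lists and is lenient about commas; the names `parse_list`, `max_depth`, and `NestingLimitExceeded`, and the default limit of 128, are assumptions for illustration and do not reflect Kson's actual configuration.

```python
class NestingLimitExceeded(Exception):
    """Raised when the input nests deeper than the configured limit."""


def parse_list(source: str, pos: int = 0, depth: int = 1, max_depth: int = 128):
    """Parse a nested bracket list starting at pos; return (value, next_pos)."""
    # Check the limit before recursing further, so the failure is a normal
    # parse error rather than a stack overflow.
    if depth > max_depth:
        raise NestingLimitExceeded(f"nesting deeper than {max_depth} levels")
    if pos >= len(source) or source[pos] != "[":
        raise ValueError(f"expected '[' at offset {pos}")
    pos += 1
    items = []
    while pos < len(source) and source[pos] != "]":
        if source[pos] == ",":
            pos += 1
            continue
        item, pos = parse_list(source, pos, depth + 1, max_depth)
        items.append(item)
    if pos >= len(source):
        raise ValueError("unclosed '['")
    return items, pos + 1
```

With a small limit, pathological input such as a thousand opening brackets fails fast with `NestingLimitExceeded` long before the host language's own recursion limit is reached.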
This test is now enabled as `ACCEPT_FOR_KSON`, since b0c1f05 formalizes that we accept unescaped whitespace control characters.
Ensure that trying to parse an empty Kson file produces a helpful/appropriate error. Also ensure that the empty file error doesn't bubble up to our editor where it makes no sense to complain about an empty file while someone is editing it — it's only a problem once they try to use it.
Tighten up parsing to handle encountering a `}` or `]` when no object or list has been opened.
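The stray-delimiter case can be illustrated with a simple stack check. This is a sketch under simplifying assumptions, not Kson's parser: the function name is hypothetical, string contents are not skipped, and mismatched pairs such as `[}` are not diagnosed.

```python
def find_unmatched_closers(source: str) -> list[tuple[int, str]]:
    """Return (offset, char) for each '}' or ']' seen while nothing is open."""
    open_stack = []
    errors = []
    for offset, ch in enumerate(source):
        if ch in "{[":
            open_stack.append(ch)
        elif ch in "}]":
            if open_stack:
                open_stack.pop()
            else:
                # Nothing is open: record an error and keep going instead of crashing.
                errors.append((offset, ch))
    return errors
```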
We behave well on these, not erroring, so they no longer need to be skipped
We now parse and validate string escapes very closely to the rules outlined in [RFC8259](https://www.rfc-editor.org/rfc/rfc8259.html#section-7). This gets our Json compatibility (as measured by [JSONTestSuite](https://github.com/nst/JSONTestSuite)) to near completion. Side note: as of this commit, we no longer process any of these escapes as part of parsing. We simply validate them and store them in the resulting AST, ready to process escapes if/when needed in some future AST transform.
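The validate-but-do-not-process approach can be sketched as follows. The helper name is hypothetical, and the sketch assumes exactly the RFC 8259 escape set (`\"`, `\\`, `\/`, `\b`, `\f`, `\n`, `\r`, `\t`, and `\u` followed by four hex digits); Kson's own accepted set may differ.

```python
import re

# One valid RFC 8259 escape: a backslash followed by a simple escape
# character or by 'u' and exactly four hex digits.
_VALID_ESCAPE = re.compile(r'\\(?:["\\/bfnrt]|u[0-9a-fA-F]{4})')


def find_invalid_escapes(raw: str) -> list[int]:
    """Return offsets of backslashes that do not begin a valid escape.

    The raw text is only inspected, never rewritten, so it can be stored as-is.
    """
    bad = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\":
            match = _VALID_ESCAPE.match(raw, i)
            if match:
                i = match.end()
                continue
            bad.append(i)
            i += 2  # skip the backslash and the character it tried to escape
        else:
            i += 1
    return bad
```

Since nothing is decoded here, a later transform stays free to interpret the escapes however it needs.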
This is mostly a formality as the [JSONTestSuite](https://github.com/nst/JSONTestSuite) tests did not change between these commits, but at least now it's clear at a glance how recently we verified we had all the latest tests
We now properly detect (and report errors for) unexpected/illegal characters encountered during parsing. This change means two excellent things:
- our parsing should now be very robust in the face of nonsense input, parsing all tokens generated and not crashing on unrecognized chars
- we are an [RFC8259-compliant](https://www.rfc-editor.org/rfc/rfc8259.html) Json parser according to [JSONTestSuite](https://github.com/nst/JSONTestSuite) (modulo the cases noted in `JsonTestSuiteEditList` that we accept as superset of Json)
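One common way to stay robust on unrecognized characters is to have the lexer emit a dedicated error token and keep going. The toy tokenizer below sketches that idea; the token names and the tiny grammar (punctuation and ASCII digits only) are assumptions for illustration, not Kson's token set.

```python
def tokenize(source: str) -> list[tuple[str, str, int]]:
    """Return (kind, text, offset) tokens, never raising on unknown characters."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch in "{}[]:,":
            tokens.append(("PUNCT", ch, i))
            i += 1
        elif ch in "0123456789":
            start = i
            while i < len(source) and source[i] in "0123456789":
                i += 1
            tokens.append(("NUMBER", source[start:i], start))
        else:
            # Unknown character: record it as an error token and continue, so
            # the parser still sees every other token and can report the problem.
            tokens.append(("ILLEGAL_CHAR", ch, i))
            i += 1
    return tokens
```

A parser consuming these tokens can then turn each `ILLEGAL_CHAR` into a located error message instead of failing on the first surprise.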
A collection of parser improvements representing a great leap in the maturity and robustness of the Kson parser and summing up to an important milestone:
We are now an RFC8259-compliant Json parser according to JSONTestSuite (modulo the cases noted in `JsonTestSuiteEditList` that we accept as superset of Json).

See individual commits for more detail on these changes.