Skip to content

[ fix #349 ] Make parser reentrant by using reentrant lexer #351

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 2, 2021

Conversation

andreasabel
Copy link
Member

[ fix #349 ] Make parser reentrant by using reentrant lexer

Further:

  • let bison generate Bison.h, to slim down Parser.h

    Parser.h defined things that are only for internal use in the lexer.
    These can be automatically defined by bison using the %defines pragma.

  • C parser: return parse result in parameter to yyparse

    This is more robust than using global variables, prepares for making
    parser reentrant.

  • C parser: code segment after %union, entrypoints in the end

    Reorganized the input for bison a bit, to prepare for removal of
    global YY_RESULT variables.

    Got rid of the forward declaration for yyerror.

    Entrypoints can be at the end of the file, since nothing depends on them.

  • C lexer: use flex's predefined start state INITIAL

    Use INITIAL instead of defining our own YYINITIAL.

@andreasabel andreasabel linked an issue Mar 24, 2021 that may be closed by this pull request
@andreasabel andreasabel added C lexer Concerning the generated lexer parser Issues concerning parser generating labels Mar 24, 2021
@andreasabel andreasabel self-assigned this Mar 24, 2021
@wangjia184
Copy link

wangjia184 commented Mar 25, 2021

Tested in linux/x64/gcc and macos/arm64/clang
neither works

random errors when run my test cases

I tried cargo test -- --test-threads=1, this command should run test case one by one, the first one succeeds, then it fails.

Typical errors from MacOS/ARM64

test normalize::ppar_tests::ppar_should_accumulate_free_counts_from_both_branches ... ok
test normalize::ppar_tests::ppar_should_compile_both_branches_with_same_environment ... rholang_parser-83f0bbd31c1724da(62995,0x10503bd40) malloc: *** error for object 0x15b808200: pointer being freed was not allocated
rholang_parser-83f0bbd31c1724da(62995,0x10503bd40) malloc: *** set a breakpoint in malloc_error_break to debug

Typical errors from Linux/x64

test normalize::ppar_tests::ppar_should_accumulate_free_counts_from_both_branches ... ok
free(): invalid pointer
test normalize::ppar_tests::ppar_should_accumulate_free_counts_from_both_branches ... ok
test normalize::ppar_tests::ppar_should_compile_both_branches_with_same_environment ... free(): double free detected in tcache 2

From the errors, it seems double free issue, some pointer might be freed twice.
But in fact, my app does not call free() since I changed my code to run the c parser in a short-alive child process (but when I run test cases, it is in current process)

@andreasabel
Copy link
Member Author

I realized there is still a global variable in the lexer for lexing strings, it is literal_buffer. I have to make this local as well...

@andreasabel
Copy link
Member Author

@wangjia184: I updated the PR to remove a global variable in the lexer I had overlooked previously. Could you please test again?

@wangjia184
Copy link

@andreasabel still same error.
In generated files, I see change like this one.
image

Just make sure I updated my BNFC

@andreasabel
Copy link
Member Author

@wangjia184 : Thanks, yes this was the last change I did.

I am a bit out of ideas now how to proceed, since I don't have some infrastructure for running the parser in parallel.

  • Could you share a small test framework with me where I can debug the parallelism problem myself?
  • Alternatively, could you find out how to change the output of BNFC, in particular the generated .l and .y files, so that the parser works in parallel? I could then do the modification of BNFC accordingly.

@andreasabel andreasabel force-pushed the issue349 branch 2 times, most recently from d3e1c59 to f825515 Compare April 2, 2021 00:12
@andreasabel andreasabel added the C++ label Apr 2, 2021
@andreasabel
Copy link
Member Author

This project spilled over to the C++ backends. Since the work on the C backend broke the C++ backends, I took the forward route to (re)integrate the C++ lexer/parser generators into the C backend. This way, we also get reentrant parsers for C++.

Use INITIAL instead of defining our own YYINITIAL.
Saves us the initial BEGIN, works also for reentrant lexer.
Reorganized the input for bison a bit, to prepare for removal of
global YY_RESULT variables.

Got rid of the forward declaration for yyerror.

Entrypoints can be at the end of the file, since nothing depends on them.
This is more robust than using global variables, prepares for making
parser reentrant.
Parser.h defined things that are only for internal use in the lexer.
These can be automatically defined by bison using the %defines pragma.
Phew! That took me a whole day to figure out, but in the end, there
aren't too many changes.

UPDATE: also use yyextra argument in lexer instead of global variable
literal_buffer.
@andreasabel andreasabel added this to the 2.9.2 milestone Apr 2, 2021
@andreasabel andreasabel merged commit cd4c939 into master Apr 2, 2021
@andreasabel andreasabel deleted the issue349 branch August 26, 2021 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ C lexer Concerning the generated lexer parser Issues concerning parser generating
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make C parser reentrant
2 participants