Fix(async): Add support for truncate_prompt_tokens in AsyncLLM #23800
Conversation
cc: @DarkLight1337
Code Review

This pull request adds support for `truncate_prompt_tokens` in `AsyncLLM`. The implementation in the `generate` method can be simplified for better readability. More importantly, the implementation in the `encode` method contains a critical bug where `truncate_prompt_tokens` is ignored if `tokenization_kwargs` is also provided. I have included review comments with code suggestions to address these issues.
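The bug described above can be illustrated with a minimal sketch. The function name `merge_truncation` is hypothetical and the logic is a simplification of what the review describes, not the actual vLLM implementation; it only assumes that `truncate_prompt_tokens` is meant to be applied as truncation settings inside `tokenization_kwargs`:

```python
def merge_truncation(tokenization_kwargs, truncate_prompt_tokens):
    """Return tokenization kwargs with prompt truncation applied.

    Hypothetical sketch: the buggy pattern builds truncation kwargs only
    when the caller passed no tokenization_kwargs at all, so an explicit
    truncate_prompt_tokens is silently dropped whenever kwargs are given.
    The fix is to merge truncation into whatever kwargs were provided.
    """
    # Buggy version (for contrast):
    # if tokenization_kwargs is None and truncate_prompt_tokens is not None:
    #     tokenization_kwargs = {"truncation": True,
    #                            "max_length": truncate_prompt_tokens}

    # Fixed version: start from the caller's kwargs (if any) and merge.
    kwargs = dict(tokenization_kwargs or {})
    if truncate_prompt_tokens is not None:
        kwargs.setdefault("truncation", True)
        kwargs.setdefault("max_length", truncate_prompt_tokens)
    return kwargs
```

With the merge in place, truncation survives even when the caller also supplies other tokenization options, e.g. `merge_truncation({"padding": True}, 128)` yields kwargs containing both the padding flag and the truncation settings.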
@DarkLight1337 Do you think anything else needs to be done on this issue?
No, this should be good to go!
But please fix pre-commit.
Head branch was pushed to by a user without write access
Fix async mode to make use of the truncate_prompt_tokens param.
Fixes #23511