Add CB API tests on the correct use of max_tokens #339
Signed-off-by: Gabriel Marinho <[email protected]>
👋 Hi! Thank you for contributing to vLLM support on Spyre.
Or this can be done with
Now you are good to go 🚀
Just a quick review on code lint failures
@pytest.mark.parametrize("max_num_seqs", [2])
@pytest.mark.parametrize(
    "backend", [pytest.param("eager", marks=pytest.mark.cpu, id="eager")])
def test__api_cb_rejects_oversized_request(
Is that an extra _ between test and api?
Oh yes, does that prevent it from running?
Nope, it did run successfully - https://github.com/vllm-project/vllm-spyre/actions/runs/16602263437/job/46964891318?pr=339#step:11:47
Just a formatting issue
Description
Add tests to check that the API generates the correct number of tokens when continuous batching (CB) is enabled.
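For readers skimming the PR, the kind of property these tests exercise can be sketched roughly as follows. This is an illustrative sketch only, not the code added in this PR: the `Usage` class and the helpers `fits_context` and `respects_max_tokens` are hypothetical names, and the admission rule (prompt tokens plus `max_tokens` must fit within the model's maximum length) is an assumption about what "oversized" means here.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Token accounting, shaped like an OpenAI-style `usage` object."""
    prompt_tokens: int
    completion_tokens: int


def fits_context(prompt_tokens: int, max_tokens: int,
                 max_model_len: int) -> bool:
    """Hypothetical admission rule (an assumption, not the server's code):
    the prompt plus the requested output must fit in the context window."""
    return prompt_tokens + max_tokens <= max_model_len


def respects_max_tokens(usage: Usage, max_tokens: int) -> bool:
    """True if the response generated no more than `max_tokens` tokens."""
    return 0 <= usage.completion_tokens <= max_tokens
```

A test against a running server would presumably send a request, read the token counts from the response, and assert on checks like these; an oversized request would be expected to fail `fits_context` and be rejected.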
Related Issues