[Benchmark] Add --async-engine option to benchmark_throughput.py #7964
Uses the AsyncLLMEngine interface (rather than LLM). Will use the decoupled front-end depending on whether or not --disable-frontend-multiprocessing is also specified.
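For readers skimming the description, here is a minimal sketch of how the two flags might combine. Only the flag names come from this PR; the `argparse` setup, the `store_true` actions, and the helper `describe_mode` are illustrative assumptions, as is the reading that the decoupled front-end is used when --disable-frontend-multiprocessing is not given.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the two flag names are taken from the PR text.
    parser = argparse.ArgumentParser(description="throughput benchmark (sketch)")
    parser.add_argument("--async-engine", action="store_true", default=False,
                        help="Use the AsyncLLMEngine interface rather than LLM.")
    parser.add_argument("--disable-frontend-multiprocessing", action="store_true",
                        default=False,
                        help="Assumed to keep the async front-end in-process.")
    return parser


def describe_mode(args: argparse.Namespace) -> str:
    # Hypothetical helper that names the code path the flags would select.
    if not args.async_engine:
        return "LLM"
    if args.disable_frontend_multiprocessing:
        return "AsyncLLMEngine, in-process front-end"
    return "AsyncLLMEngine, decoupled front-end"


if __name__ == "__main__":
    print(describe_mode(build_parser().parse_args()))
```

Under these assumptions, passing --async-engine alone would select the decoupled front-end, and adding --disable-frontend-multiprocessing would keep everything in one process.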
a7a6e43 to bd5ba81
LGTM. Just a nit
    if args.async_engine:
        run_args.append(args.disable_frontend_multiprocessing)
        elapsed_time = uvloop.run(run_vllm_async(*run_args))
    else:
        elapsed_time = run_vllm(*run_args)
Nit: better to use kwargs
Agree, but the list was already passed as regular args so this involved minimal changes.
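For reference, the kwargs variant suggested in the nit could look roughly like the sketch below. It is not code from this PR: `run_vllm` and `run_vllm_async` here are stand-in stubs with made-up parameters (`model`, `n`) and placeholder return values, `dispatch` is a hypothetical wrapper, and `asyncio.run` replaces `uvloop.run` only so the example has no third-party dependency.

```python
import asyncio
from types import SimpleNamespace


def run_vllm(*, model: str, n: int) -> float:
    # Stand-in for the synchronous benchmark path; returns a placeholder time.
    return 1.0


async def run_vllm_async(*, model: str, n: int,
                         disable_frontend_multiprocessing: bool = False) -> float:
    # Stand-in for the async benchmark path; returns a placeholder time.
    await asyncio.sleep(0)
    return 2.0 if disable_frontend_multiprocessing else 3.0


def dispatch(args) -> float:
    # Shared arguments are collected by name instead of by position.
    run_kwargs = {"model": args.model, "n": args.n}
    if args.async_engine:
        # The async-only option is added by name, so argument order cannot drift.
        run_kwargs["disable_frontend_multiprocessing"] = (
            args.disable_frontend_multiprocessing)
        return asyncio.run(run_vllm_async(**run_kwargs))
    return run_vllm(**run_kwargs)


if __name__ == "__main__":
    demo = SimpleNamespace(model="m", n=1, async_engine=True,
                           disable_frontend_multiprocessing=False)
    print(dispatch(demo))
```

The trade-off is the one noted above: keyword passing makes the extra async-only argument explicit, at the cost of touching the existing positional call sites.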
…llm-project#7964) Signed-off-by: Alvant <[email protected]>
…llm-project#7964) Signed-off-by: LeiWang1999 <[email protected]>