[Patch] Remove spec_decode.metrics patch #1016
The PR number in the commit message seems wrong. Please update it and complete the commit message; in particular, note the existing or newly added tests.
The cuda hard code of |
73999f0 to 2e7dbc1
LGTM if CI passed
approve, good change.
Signed-off-by: shen-shanshan <[email protected]>
@wangxiyuan CI has finally passed and this PR can be merged.
Thanks for the cleanup.
What this PR does / why we need it?
Remove the spec_decode.metrics patch, as this has been resolved in vllm-project/vllm#16983 (included in vLLM v0.9.0).
Before: Returns a CUDA event recording when the copy is complete.
After: Returns a device event (NPU Event for vllm-ascend) recording when the copy is complete.
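To illustrate the idea behind the change from a CUDA event to a device event, here is a minimal sketch that picks the event type by platform instead of hard-coding one. It is not the code from vllm-project/vllm#16983 or from vllm-ascend: StubEvent, EVENT_FACTORIES, and copy_metrics_async are hypothetical names, and the list copy only stands in for an asynchronous device copy.

```python
from dataclasses import dataclass


@dataclass
class StubEvent:
    """Hypothetical stand-in for a device event (real backends supply their own)."""
    backend: str
    recorded: bool = False

    def record(self) -> None:
        # A real event would be enqueued on the device stream here.
        self.recorded = True


# Hypothetical registry: platform name -> event factory.
EVENT_FACTORIES = {
    "cuda": lambda: StubEvent("cuda"),
    "npu": lambda: StubEvent("npu"),
}


def copy_metrics_async(platform: str, src: list, dst: list) -> StubEvent:
    """Copy src into dst; return a device event recording when the copy is complete."""
    event = EVENT_FACTORIES[platform]()  # chosen per platform, not hard-coded to CUDA
    dst[:] = src  # stands in for a non-blocking device copy
    event.record()
    return event
```

With this shape, the caller only depends on the event's interface, so the same function can serve a CUDA or an NPU backend as long as a factory is registered for it.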
Does this PR introduce any user-facing change?
How was this patch tested?