[Core] More fixes to MultiModalEmbeddings type handling #19715
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Summary of Changes

Hello @russellb, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request is a follow-up to a previous change that modified `get_multimodal_embeddings()` to always return a `MultiModalEmbeddings` object, rather than an optional one. My changes update numerous model implementations to align with this new behavior, specifically by refining how the presence of multimodal embeddings is checked, ensuring robustness and consistency across the codebase.
Highlights
- **Multimodal Embeddings Type Handling**: This PR completes the necessary adjustments for `MultiModalEmbeddings` type handling across various model implementations. It addresses models that were not updated in a previous PR (#19446) due to insufficient CI coverage.
- **Code Consistency and Readability**: The changes standardize conditional checks for `multimodal_embeddings` by replacing explicit `is not None` and `is None` comparisons with more Pythonic truthiness checks (`if multimodal_embeddings:` and `if not multimodal_embeddings:`). This improves code readability and consistency.
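To make that second bullet concrete, here is a minimal standalone sketch (plain Python, not code from the PR) showing how the two check styles disagree on an empty collection:

```python
from typing import Optional


def needs_merge_old(mm: Optional[list]) -> bool:
    # Old style: explicit comparison, so an empty list still counts as "present"
    return mm is not None


def needs_merge_new(mm: Optional[list]) -> bool:
    # New style: truthiness, so an empty list means "no embeddings"
    return bool(mm)


print(needs_merge_old([]), needs_merge_new([]))      # True False
print(needs_merge_old(None), needs_merge_new(None))  # False False
```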
Code Review
This PR consistently updates the checks for `multimodal_embeddings` from `is not None` / `is None` to direct boolean evaluation (`if multimodal_embeddings` / `if not multimodal_embeddings`). This change is Pythonic and correctly handles the case where an empty collection (like an empty list `[]`) signifies the absence of embeddings, treating it as falsy (or truthy for `if not`). This aligns with the described change in PR #19446, where `get_multimodal_embeddings()` no longer returns `Optional`.

A key point for consideration across all modified files is the type hint for the `multimodal_embeddings` parameter in the `get_input_embeddings` (and similar) methods. It is currently `Optional[MultiModalEmbeddings] = None` (or `Optional[NestedTensors]`). If this parameter is now guaranteed to be a `MultiModalEmbeddings` (or `NestedTensors`) instance and never `None` (with empty lists/tuples representing no embeddings), the type hint should be updated to remove `Optional` (and the default `None` potentially removed or changed) to accurately reflect this contract. This would enhance code clarity and maintainability. I've added specific comments detailing this, which applies broadly to all affected files.
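As a rough illustration of the suggested contract (a hypothetical sketch, not the PR's code; the alias is copied from the definition quoted later in this thread):

```python
from typing import Union

import torch

# Alias as quoted later in this thread, repeated so the sketch stands alone.
MultiModalEmbeddings = Union[list[torch.Tensor], torch.Tensor,
                             tuple[torch.Tensor, ...]]


def get_input_embeddings(
    input_ids: torch.Tensor,
    # No Optional and no None default: an empty tuple means "no embeddings".
    multimodal_embeddings: MultiModalEmbeddings = (),
) -> torch.Tensor:
    inputs_embeds = torch.zeros(input_ids.shape[0], 8)  # stand-in for the real lookup
    if len(multimodal_embeddings) != 0:
        pass  # merge the multimodal embeddings into inputs_embeds here
    return inputs_embeds
```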
PTAL at the failing engine test
vllm/model_executor/models/blip2.py (Outdated)

```diff
@@ -641,7 +641,7 @@ def get_input_embeddings(
     multimodal_embeddings: Optional[MultiModalEmbeddings] = None,
 ) -> torch.Tensor:
     inputs_embeds = self.language_model.get_input_embeddings(input_ids)
-    if multimodal_embeddings is not None:
+    if multimodal_embeddings:
```
Isn't this going to potentially cause a `RuntimeError: Boolean value of Tensor with more than one value is ambiguous` error?

Since this type is defined as `MultiModalEmbeddings = Union[list[Tensor], Tensor, tuple[Tensor, ...]]`, the `if` clause could potentially test a `torch.Tensor` directly. For example, in blip2, `get_multimodal_embeddings()` calls `_process_image_input`, which in turn calls `language_projection.forward`, which will return a tensor.
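A minimal standalone reproduction of the failure mode described here (PyTorch only, not vLLM code):

```python
import torch

emb = torch.zeros(4, 8)  # multi-element tensor, like a projection layer's output

try:
    if emb:  # bool() of a multi-element tensor is undefined
        pass
except RuntimeError as err:
    print(err)  # Boolean value of Tensor with more than one value is ambiguous

if len(emb) != 0:  # safe: len() of a tensor is the size of its first dimension
    print("non-empty embeddings")
```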
Actually, one of the tests is failing for this reason:

`FAILED engine/test_short_mm_context.py::test_context_length_too_short[llava-hf/llava-1.5-7b-hf] - RuntimeError: Boolean value of Tensor with more than one value is ambiguous`
`if len(multimodal_embeddings) != 0:` should work for lists, tuples, and tensors.
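For illustration, `len()` behaves consistently across all three variants of the union (a standalone check, not vLLM code):

```python
import torch

t = torch.zeros(2, 8)
for mm in ([t], (t,), t, []):
    # len() works for list, tuple, and Tensor alike; bool() would raise on t.
    print(type(mm).__name__, len(mm) != 0)
# list True / tuple True / Tensor True / list False
```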
Thanks. I'm updating the checks to use `len()`.
Thanks!
```diff
@@ -845,7 +846,7 @@ def get_input_embeddings_v0(
     multimodal_embeddings: Optional[NestedTensors] = None,
 ) -> torch.Tensor:
     inputs_embeds = self.language_model.get_input_embeddings(input_ids)
-    if multimodal_embeddings is None:
+    if len(multimodal_embeddings) == 0:
```
I'm worried about this line in the case when we use the default argument - shouldn't this be like the other lines? (Edited; I had a typo where I copied the `is not None` condition when it should check `is None`.)

`if multimodal_embeddings is None or len(multimodal_embeddings) == 0:`

If it should be, maybe it makes sense to make a single utility function to make sure they're all the same (check `None` and `len == 0` in the function), or change the argument to the `NestedTensors` type here if we do not ever want to pass `None` here.
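A hedged sketch of the shared helper being floated here; the function name and the simplified alias are hypothetical, not part of the PR:

```python
from typing import Optional, Union

import torch

# Simplified stand-in for vLLM's NestedTensors alias, for this sketch only.
NestedTensors = Union[list[torch.Tensor], torch.Tensor, tuple[torch.Tensor, ...]]


def mm_embeddings_present(mm: Optional[NestedTensors]) -> bool:
    """Return True when there is at least one embedding to merge.

    Uses len() rather than truthiness because bool() on a multi-element
    torch.Tensor raises a RuntimeError.
    """
    return mm is not None and len(mm) != 0


assert not mm_embeddings_present(None)
assert not mm_embeddings_present([])
assert mm_embeddings_present(torch.zeros(2, 8))
```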
thanks - i'll fix these
done. i fixed all the spots that came up with `git grep 'if len(multimodal_embeddings'`
Looks good!
```diff
@@ -845,7 +846,7 @@ def get_input_embeddings_v0(
     multimodal_embeddings: Optional[NestedTensors] = None,
 ) -> torch.Tensor:
     inputs_embeds = self.language_model.get_input_embeddings(input_ids)
-    if multimodal_embeddings is None:
+    if len(multimodal_embeddings) == 0:
```
Suggested change:

```diff
-    if len(multimodal_embeddings) == 0:
+    if not multimodal_embeddings:
```
cc @russellb
@aarnphm, if `multimodal_embeddings` is a `torch.Tensor`, you get `RuntimeError: Boolean value of Tensor with more than one value is ambiguous` if you try to get the truth value directly.
That's what I used before, but it doesn't work if it's a tensor.

cc @maxdebayser, who pointed this out and submitted the change to use `len()` instead.
got it
FIX #19736
This is a follow-up to PR #19446. In that PR, `get_multimodal_embeddings()` was changed to return `MultiModalEmbeddings` instead of `Optional[MultiModalEmbeddings]` because code in the model runner was requiring that the result was not `None`.

Several models needed tweaks to account for this. Many were missed because they were not tested in CI. This should fix the rest of the common changes needed that weren't caught by CI.

Signed-off-by: Russell Bryant [email protected]