Conversation

chaunceyjiang
Collaborator

@chaunceyjiang chaunceyjiang commented Mar 10, 2025

FIX #14523

num_crops = hf_inputs.get("num_crops", torch.empty(0)).view(-1)
return dict(
    feat_is_patch=MultiModalFieldConfig.flat_from_sizes(
        "image", num_crops),
    embed_is_patch=MultiModalFieldConfig.flat_from_sizes(
        "image", num_crops),
    num_crops=MultiModalFieldConfig.batched("image"),
    pixel_values=MultiModalFieldConfig.batched("image"),
    image_embeds=MultiModalFieldConfig.batched("image"),
)
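
For readers unfamiliar with the two field kinds used above, here is a rough sketch in plain Python. It is not vLLM's implementation. It assumes that a "batched" field holds one entry per image, and that a "flat from sizes" field is one flat sequence cut into consecutive chunks whose lengths come from num_crops. The helper names split_batched and split_flat_from_sizes are hypothetical.

```python
from itertools import accumulate
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_batched(data: Sequence[T]) -> List[T]:
    """Assumed 'batched' layout: entry i belongs to image i."""
    return list(data)


def split_flat_from_sizes(data: Sequence[T],
                          sizes: Sequence[int]) -> List[Sequence[T]]:
    """Assumed 'flat from sizes' layout: cut one flat sequence into
    consecutive chunks, one chunk per image, with the given lengths."""
    if sum(sizes) != len(data):
        raise ValueError("sizes must add up to the length of the flat data")
    ends = list(accumulate(sizes))   # exclusive end index of each chunk
    starts = [0] + ends[:-1]         # start index of each chunk
    return [data[s:e] for s, e in zip(starts, ends)]


# Two images with 2 and 3 crops respectively (made-up values).
num_crops = [2, 3]
flat_flags = [True, False, True, True, False]
per_image_flags = split_flat_from_sizes(flat_flags, num_crops)
print(per_image_flags)
```

With an empty size list (the torch.empty(0) default in the snippet above), this sketch yields no chunks at all.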

The root cause is that PR #14275 introduced two additional parameters, feat_is_patch and embed_is_patch, for PixtralHFMultiModalProcessor, but LlavaMultiModalProcessor does not define them.

class LlavaMultiModalProcessor(
        BaseLlavaMultiModalProcessor[LlavaProcessingInfo]):
    def _get_mm_fields_config(
        self,
        hf_inputs: BatchFeature,
        hf_processor_mm_kwargs: Mapping[str, object],
    ) -> Mapping[str, MultiModalFieldConfig]:
        return dict(
            pixel_values=MultiModalFieldConfig.batched("image"),
            image_embeds=MultiModalFieldConfig.batched("image"),
        )
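
To make the mismatch concrete, here is a minimal sketch using plain Python sets rather than vLLM objects. The set contents are copied from the two snippets in this description. The REQUIRED_BY_CONSUMER set and the missing_fields helper are hypothetical, and the actual code path that fails in vLLM is not shown in this thread.

```python
from typing import Iterable, List

# Field names declared by each processor, as shown in the snippets above.
PIXTRAL_HF_FIELDS = {
    "feat_is_patch", "embed_is_patch", "num_crops",
    "pixel_values", "image_embeds",
}
LLAVA_FIELDS = {"pixel_values", "image_embeds"}

# Hypothetical: fields that shared downstream code is assumed to read.
REQUIRED_BY_CONSUMER = {"feat_is_patch", "embed_is_patch"}


def missing_fields(declared: Iterable[str],
                   required: Iterable[str]) -> List[str]:
    """Return the required field names that a processor does not declare."""
    return sorted(set(required) - set(declared))


print(missing_fields(PIXTRAL_HF_FIELDS, REQUIRED_BY_CONSUMER))  # []
print(missing_fields(LLAVA_FIELDS, REQUIRED_BY_CONSUMER))
# ['embed_is_patch', 'feat_is_patch']
```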

@chaunceyjiang
Collaborator Author

Test

# VLLM_USE_V1=1 python3 examples/offline_inference/vision_language.py -m llava
...
...
INFO 03-10 11:29:52 [gpu_model_runner.py:1416] Graph capturing finished in 20 secs, took 1.42 GiB
INFO 03-10 11:29:52 [core.py:120] init engine (profile, create kv cache, warmup model) took 42.62 seconds
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Processed prompts: 100%|████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00,  4.44it/s, est. speed input: 2646.97 toks/s, output: 284.24 toks/s]
 The image features a tall tower with a spire, surrounded by a beautiful flowering tree. The tree is filled with pink flowers, creating a picturesque scene. The tower stands tall in the background, with the tree's branches extending towards it. The combination of the tower and the flowering tree creates a
 The image features a tall tower with a spire, surrounded by a forest of pink flowers. The tower stands tall amidst the vibrant blossoms, creating a picturesque scene. The tower's height and the abundance of flowers create a sense of grandeur and beauty in the landscape.
 The image features a tall tower with a spire, surrounded by a beautiful cherry blossom tree. The tree is filled with pink flowers, creating a stunning contrast against the tower's structure. The tower stands tall in the background, with the tree's branches extending towards it. The scene capt
 The image features a tall building with a spire, surrounded by a beautiful flowering tree filled with pink flowers. The tree is located in front of the building, creating a striking contrast between the architectural structure and the natural beauty of the flowers. The scene captures the essence of harmony between urban architecture

@chaunceyjiang
Collaborator Author

@DarkLight1337 @lk-chen PTAL.

@DarkLight1337
Member

DarkLight1337 commented Mar 10, 2025

Can you add a note in the code explaining this? Actually let me just add this directly

Signed-off-by: DarkLight1337 <[email protected]>
Member

@DarkLight1337 DarkLight1337 left a comment

Thanks for fixing!

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) March 10, 2025 14:06
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 10, 2025
@chaunceyjiang
Collaborator Author

> Actually let me just add this directly

Thanks~

@lk-chen
Collaborator

lk-chen commented Mar 10, 2025

Thanks for the fix!
LGTM

@DarkLight1337 DarkLight1337 merged commit 92b0ce2 into vllm-project:main Mar 10, 2025
38 of 47 checks passed
@robertgshaw2-redhat
Collaborator

Thanks!

@chaunceyjiang chaunceyjiang deleted the llava branch March 11, 2025 03:06
lulmer pushed a commit to lulmer/vllm that referenced this pull request Apr 7, 2025
…ject#14554)

Signed-off-by: chaunceyjiang <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>
Co-authored-by: DarkLight1337 <[email protected]>
Signed-off-by: Louis Ulmer <[email protected]>
shreyankg pushed a commit to shreyankg/vllm that referenced this pull request May 3, 2025
…ject#14554)

Signed-off-by: chaunceyjiang <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>
Co-authored-by: DarkLight1337 <[email protected]>
Successfully merging this pull request may close these issues.

[Bug]: [V1] llava-hf/llava-1.5-7b-hf is broken on V1