[Bugfix][v1] fixed llava-hf/llava-1.5-7b-hf is broken on V1 #14554

chaunceyjiang · 2025-03-10T11:33:16Z

vllm/vllm/model_executor/models/llava.py

Lines 386 to 395 in 5d80252

    
    num_crops = hf_inputs.get("num_crops", torch.empty(0)).view(-1)

    return dict(
        feat_is_patch=MultiModalFieldConfig.flat_from_sizes(
            "image", num_crops),
        embed_is_patch=MultiModalFieldConfig.flat_from_sizes(
            "image", num_crops),
        num_crops=MultiModalFieldConfig.batched("image"),
        pixel_values=MultiModalFieldConfig.batched("image"),
        image_embeds=MultiModalFieldConfig.batched("image"),
    )

The root cause is that PR #14275 introduced two additional fields, `feat_is_patch` and `embed_is_patch`, for `PixtralHFMultiModalProcessor`, but `LlavaMultiModalProcessor` does not define them.

vllm/vllm/model_executor/models/llava.py

Lines 298 to 310 in 5d80252

    
    class LlavaMultiModalProcessor(
            BaseLlavaMultiModalProcessor[LlavaProcessingInfo]):

        def _get_mm_fields_config(
            self,
            hf_inputs: BatchFeature,
            hf_processor_mm_kwargs: Mapping[str, object],
        ) -> Mapping[str, MultiModalFieldConfig]:
            return dict(
                pixel_values=MultiModalFieldConfig.batched("image"),
                image_embeds=MultiModalFieldConfig.batched("image"),
            )
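
The mismatch between the two field configs can be illustrated with a small toy sketch. This is not vLLM code: `select_declared_fields`, `missing_fields`, and the `required` set are hypothetical stand-ins, and the idea that shared model code reads the two new fields is an assumption made only for illustration. The field names themselves come from the two snippets quoted in this description.

```python
# Toy stand-ins only; none of these helpers are vLLM APIs.
from typing import Any, Dict, Mapping, Set


def select_declared_fields(
    hf_inputs: Mapping[str, Any],
    declared_fields: Set[str],
) -> Dict[str, Any]:
    """Keep only the processor outputs that the field config declares."""
    return {k: v for k, v in hf_inputs.items() if k in declared_fields}


def missing_fields(
    required_by_model: Set[str],
    declared_fields: Set[str],
) -> Set[str]:
    """Return the fields the model reads that the config never declares."""
    return required_by_model - declared_fields


# Field names taken from the two _get_mm_fields_config snippets.
PIXTRAL_HF_FIELDS = {
    "feat_is_patch",
    "embed_is_patch",
    "num_crops",
    "pixel_values",
    "image_embeds",
}
LLAVA_FIELDS = {"pixel_values", "image_embeds"}

# Assumption (illustrative): shared model code reads the two new fields.
required = {"pixel_values", "embed_is_patch", "feat_is_patch"}

print(sorted(missing_fields(required, PIXTRAL_HF_FIELDS)))  # []
print(sorted(missing_fields(required, LLAVA_FIELDS)))
# ['embed_is_patch', 'feat_is_patch']
```

Under that assumption, a processor that declares only `pixel_values` and `image_embeds` never passes the two patch-mask fields downstream, so any code path that expects them has to either tolerate their absence or be limited to the Pixtral-HF case.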

github-actions · 2025-03-10T11:33:30Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

Signed-off-by: chaunceyjiang <[email protected]>

chaunceyjiang · 2025-03-10T11:47:18Z

Test

# VLLM_USE_V1=1 python3 examples/offline_inference/vision_language.py -m llava
...
...
INFO 03-10 11:29:52 [gpu_model_runner.py:1416] Graph capturing finished in 20 secs, took 1.42 GiB
INFO 03-10 11:29:52 [core.py:120] init engine (profile, create kv cache, warmup model) took 42.62 seconds
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Processed prompts: 100%|████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00,  4.44it/s, est. speed input: 2646.97 toks/s, output: 284.24 toks/s]
 The image features a tall tower with a spire, surrounded by a beautiful flowering tree. The tree is filled with pink flowers, creating a picturesque scene. The tower stands tall in the background, with the tree's branches extending towards it. The combination of the tower and the flowering tree creates a
 The image features a tall tower with a spire, surrounded by a forest of pink flowers. The tower stands tall amidst the vibrant blossoms, creating a picturesque scene. The tower's height and the abundance of flowers create a sense of grandeur and beauty in the landscape.
 The image features a tall tower with a spire, surrounded by a beautiful cherry blossom tree. The tree is filled with pink flowers, creating a stunning contrast against the tower's structure. The tower stands tall in the background, with the tree's branches extending towards it. The scene capt
 The image features a tall building with a spire, surrounded by a beautiful flowering tree filled with pink flowers. The tree is located in front of the building, creating a striking contrast between the architectural structure and the natural beauty of the flowers. The scene captures the essence of harmony between urban architecture

chaunceyjiang · 2025-03-10T11:49:06Z

@DarkLight1337 @lk-chen PTAL.

DarkLight1337 · 2025-03-10T11:52:00Z

~~Can you add a note in the code explaining this?~~ Actually let me just add this directly

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337

Thanks for fixing!

chaunceyjiang · 2025-03-10T14:20:58Z

> Actually let me just add this directly

Thanks~

lk-chen · 2025-03-10T16:37:50Z

Thanks for the fix!
LGTM

robertgshaw2-redhat · 2025-03-10T18:32:21Z

Thanks!

…ject#14554) Signed-off-by: chaunceyjiang <[email protected]> Signed-off-by: DarkLight1337 <[email protected]> Co-authored-by: DarkLight1337 <[email protected]> Signed-off-by: Louis Ulmer <[email protected]>

…ject#14554) Signed-off-by: chaunceyjiang <[email protected]> Signed-off-by: DarkLight1337 <[email protected]> Co-authored-by: DarkLight1337 <[email protected]>

[Bugfix][v1] fixed llava-hf/llava-1.5-7b-hf is broken on V1

a0f91ce

Signed-off-by: chaunceyjiang <[email protected]>

chaunceyjiang force-pushed the llava branch from e0454a1 to a0f91ce Compare March 10, 2025 11:41

Add note

da48540

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 approved these changes Mar 10, 2025


DarkLight1337 enabled auto-merge (squash) March 10, 2025 14:06

github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Mar 10, 2025

DarkLight1337 merged commit 92b0ce2 into vllm-project:main Mar 10, 2025
38 of 47 checks passed

chaunceyjiang deleted the llava branch March 11, 2025 03:06

ckhordiasma mentioned this pull request Apr 17, 2025

[do not merge] pr test for nm changes into 2.20 red-hat-data-services/vllm#107

Closed
