
Conversation

@maxdebayser (Contributor) commented Aug 22, 2025

Continuation of PR #23302

Summary

  • drop pooling model runner and related code paths in V0 worker
  • simplify V0 engine to handle only sampling requests
  • add stubs that raise for pooling model entry points
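The stub approach in the last bullet can be sketched roughly as follows. This is an illustrative sketch, not the actual vLLM code: the class and method names (`Worker`, `execute_model`, `pool`) are hypothetical stand-ins for the V0 worker entry points.

```python
from typing import Any


class Worker:
    """Sketch of a V0 worker after pooling support is dropped."""

    def execute_model(self, requests: list[Any]) -> list[Any]:
        # Only sampling requests are handled on the V0 path now.
        return [f"sampled:{r}" for r in requests]

    def pool(self, requests: list[Any]) -> list[Any]:
        # Pooling entry points remain as stubs that fail loudly,
        # instead of silently running a removed code path.
        raise NotImplementedError(
            "Pooling models are no longer supported on the V0 engine; "
            "please use the V1 engine instead."
        )
```

Keeping the stub (rather than deleting the method) gives callers a clear error message at the point of use.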

@DarkLight1337 , @noooop , do you remember anything else to remove?

@mergify bot added the frontend and multi-modality (Related to multi-modality, #4194) labels Aug 22, 2025
@maxdebayser changed the title from "Codex/remove pooling model support in vllm v0" to "[V0 Deprecation] Remove pooling model support in V0" Aug 22, 2025
@gemini-code-assist (bot) left a comment

Code Review

This pull request effectively removes support for pooling models in the vLLM V0 worker, which aligns with the stated objective. The changes are comprehensive, touching upon the engine, worker, sequence management, and tests to eliminate the V0 pooling code paths. Key modifications include the removal of PoolingModelRunner and V0-specific pooling metadata, updating LLMEngine and AsyncLLMEngine to handle only sampling requests, and stubbing out pooling-related entry points to raise NotImplementedError. The removal of token_type_ids and related logic is also consistent with this goal. The code modifications appear to be correct and consistently applied across the repository. I have not found any issues of high or critical severity.

@DarkLight1337 (Member) left a comment

Not that I'm aware of.

@noooop (Contributor) commented Aug 22, 2025

I think it’s quite difficult to clean it up all at once. It might take some time to completely remove all the “compatibility code”.


I hope this PR lands as soon as possible, so we can focus our optimization efforts on a single engine.

Signed-off-by: Max de Bayser <[email protected]>
@maxdebayser (Contributor, Author)

I'm running into weird typing issues. Why is EngineClient.generate() not declared async when the classes that implement this interface define the method as async (see AsyncLLMEngine and AsyncLLM)? And why is it called without await?
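For context on why such a signature can type-check: in Python, calling an `async def` function that contains `yield` returns an async generator object immediately, with no `await`. So an interface can declare the method as a plain `def` returning `AsyncGenerator`, while implementations use `async def`. A minimal self-contained sketch (the class names here are illustrative, not vLLM's actual definitions):

```python
import asyncio
from abc import ABC, abstractmethod
from collections.abc import AsyncGenerator


class EngineClient(ABC):
    # Declared as a plain `def`: calling generate() already returns an
    # AsyncGenerator object, so the call site does not need `await`.
    @abstractmethod
    def generate(self, prompt: str) -> AsyncGenerator[str, None]:
        ...


class ToyAsyncEngine(EngineClient):
    # An `async def` containing `yield` is an async generator function;
    # calling it (without await) also produces an AsyncGenerator, so it
    # satisfies the interface's declared return type.
    async def generate(self, prompt: str) -> AsyncGenerator[str, None]:
        for token in prompt.split():
            yield token


async def main() -> list[str]:
    engine = ToyAsyncEngine()
    # No `await engine.generate(...)` here; awaiting happens per item
    # inside the `async for` iteration.
    return [tok async for tok in engine.generate("hello pooling-free world")]


print(asyncio.run(main()))  # → ['hello', 'pooling-free', 'world']
```

This is why the interface method can be non-async while the concrete methods are async: both forms, when called, evaluate to an `AsyncGenerator`.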

@DarkLight1337 (Member)

cc @robertgshaw2-redhat

@noooop (Contributor) commented Aug 23, 2025

@maxdebayser

Remove all run_with_both_engines in vllm/tests/models/language/pooling and vllm/tests/entrypoints/


Let's land this PR quickly as long as CI turns green, then polish it in subsequent PRs.

@noooop (Contributor) commented Aug 25, 2025

Clean up v0_only in tests/models/registry.py

Signed-off-by: Max de Bayser <[email protected]>
@maxdebayser (Contributor, Author)

@DarkLight1337 , the basic tests are passing now. Can you enable the full test suite?

@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Aug 26, 2025
@DarkLight1337 (Member) left a comment

Can merge if tests pass

@maxdebayser maxdebayser force-pushed the codex/remove-pooling-model-support-in-vllm-v0 branch from 3341ce9 to 73f0897 Compare August 27, 2025 16:56
@noooop (Contributor) commented Aug 29, 2025

@DarkLight1337

Is the failed test caused by a flaky test?

@DarkLight1337 (Member)

Yes

@vllm-bot vllm-bot merged commit 2554b27 into vllm-project:main Aug 29, 2025
44 of 47 checks passed
@maxdebayser (Contributor, Author)

Follow-up issue: #23883
