Pull requests: vllm-project/vllm

Pull requests list

[V0 Deprecation][KVConnector] Remove KVConnector v1/v0 differentiation
Labels: ci/build, documentation, kv-connector, tpu, v1
#25376 opened Sep 22, 2025 by NickLucche

[Misc] Remove unused encoder-decoder error strings
Labels: ready
#25374 opened Sep 22, 2025 by DarkLight1337
5 tasks

[Model] Support multi-vector retrieval
Labels: documentation, qwen
#25370 opened Sep 22, 2025 by noooop (Draft)
5 tasks

[Docs] Fix griffe warnings in vllm/lora/ops
#25369 opened Sep 22, 2025 by windsonsea

[Docs] wheel larger than limit
Labels: documentation
#25367 opened Sep 22, 2025 by pfk-beta

[Bugfix] Qwen3-next generate ! always
Labels: qwen
#25365 opened Sep 22, 2025 by yych0745
5 tasks

[Core] Enable KV cache connector + hybrid allocator
Labels: kv-connector, tpu, v1
#25363 opened Sep 22, 2025 by KuntaiDu
5 tasks

[wip] Fix OOM in vLLM replicas by ensuring consistent NCCL memory accounting
Labels: ready, v1
#25359 opened Sep 22, 2025 by kouroshHakha (Draft)

Flex attention hybrid allocator compatibility #25256
Labels: v1
#25358 opened Sep 21, 2025 by baonudesifeizhai
1 of 5 tasks

[Feature] OTEL Tracing for Individual Model Steps
Labels: documentation, v1
#25356 opened Sep 21, 2025 by tomasruizt

[Bugfix][Model] Fix inference for Hunyuan dense models
#25354 opened Sep 21, 2025 by Anionex
3 of 5 tasks

Use macro guard CUDA functions for back compatibility in grouped_topk_kernel.cu
Labels: ci/build, ready
#25346 opened Sep 21, 2025 by minosfuture

[V0 Deprecation][Models] Remove all V0 condition for mm embeddings merge
Labels: deepseek, llama, qwen
#25331 opened Sep 21, 2025 by Isotr0py
1 of 5 tasks

Add-new-encoder-models
Labels: new-model
#25322 opened Sep 20, 2025 by hmellor (Draft)

Handle triton kernel import exception
#25319 opened Sep 20, 2025 by minosfuture

upgrade flashinfer to v0.4.0rc1
Labels: ci/build, v1
#25315 opened Sep 20, 2025 by mmangkad
5 tasks

Make mypy behave like a proper pre-commit hook
Labels: ci/build, frontend, needs-rebase, ready
#25313 opened Sep 20, 2025 by hmellor

[Bugfix] Enable API server headless mode and scale-out using Dockerfile entrypoint
Labels: frontend, ready, v1
#25309 opened Sep 20, 2025 by DarkLight1337
5 tasks