Pull requests: vllm-project/vllm
[V0 Deprecation][KVConnector] Remove KVConnector v1/v0 differentiation
ci/build
documentation
Improvements or additions to documentation
kv-connector
tpu
Related to Google TPUs
v1
#25376
opened Sep 22, 2025 by
NickLucche
[Misc] Remove unused encoder-decoder error strings
ready
ONLY add when PR is ready to merge/full CI is needed
#25374
opened Sep 22, 2025 by
DarkLight1337
5 tasks
[Model] Support multi-vector retrieval
documentation
Improvements or additions to documentation
qwen
Related to Qwen models
[Docs] wheel larger than limit
documentation
Improvements or additions to documentation
#25367
opened Sep 22, 2025 by
pfk-beta
[Bugfix] Qwen3-next generate ! always
qwen
Related to Qwen models
#25365
opened Sep 22, 2025 by
yych0745
5 tasks
[Core] Enable KV cache connector + hybrid allocator
kv-connector
tpu
Related to Google TPUs
v1
#25363
opened Sep 22, 2025 by
KuntaiDu
5 tasks
[wip] Fix OOM in vLLM replicas by ensuring consistent NCCL memory accounting
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#25359
opened Sep 22, 2025 by
kouroshHakha
•
Draft
Flex attention hybrid allocator compatibility #25256
v1
#25358
opened Sep 21, 2025 by
baonudesifeizhai
1 of 5 tasks
[Feature] OTEL Tracing for Individual Model Steps
documentation
Improvements or additions to documentation
v1
#25356
opened Sep 21, 2025 by
tomasruizt
[Bugfix] Improve GLM4 MoE Reasoning Parser's is_reasoning_end Condition
#25355
opened Sep 21, 2025 by
frankwang28
5 tasks done
[Bugfix][Model] Fix inference for Hunyuan dense models
#25354
opened Sep 21, 2025 by
Anionex
3 of 5 tasks
Use macro guard CUDA functions for back compatibility in grouped_topk_kernel.cu
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#25346
opened Sep 21, 2025 by
minosfuture
[Feature] Allow configuring FlashInfer workspace size (#25342)
v1
#25344
opened Sep 21, 2025 by
mishra-krishna
[Feature] Improve GGUF loading from HuggingFace user experience
needs-rebase
#25340
opened Sep 21, 2025 by
simondanielsson
•
Draft
3 of 9 tasks
Make mypy behave like a proper pre-commit hook
ci/build
frontend
needs-rebase
ready
ONLY add when PR is ready to merge/full CI is needed
#25313
opened Sep 20, 2025 by
hmellor
[BugFix] Support EP/DP + EPLB with DeepSeek MTP
deepseek
Related to DeepSeek models
needs-rebase
speculative-decoding
v1
[Bugfix] Enable API server headless mode and scale-out using Dockerfile entrypoint
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#25309
opened Sep 20, 2025 by
DarkLight1337
5 tasks
[torch.compile][Minor Fix] Gate cudagraph_unsafe tag for torch>=2.9
#25304
opened Sep 20, 2025 by
BoyuanFeng