
Pull requests: vllm-project/vllm


Pull requests list

[CPU] Experimentally enable Triton and MRV2
Labels: ci/build, cpu, v1
#43225 opened May 20, 2026 by bigPYJ1151 Member Draft
4 tasks
Fix FlashInfer TRTLLM NvFP4 monolithic MoE routing
Labels: nvidia
#43223 opened May 20, 2026 by zhangxin81 Contributor
[EPLB] Make async EPLB default
Labels: ci/build, documentation
#43219 opened May 20, 2026 by ilmarkov Contributor
4 tasks
[ROCm] MoRI connector telemetry
Labels: kv-connector, rocm
#43218 opened May 20, 2026 by simondanielsson Contributor Draft
4 tasks
[Misc] Add exponential distribution to multi-turn benchmark
Labels: performance
#43217 opened May 20, 2026 by nikonyrh-siloai
4 tasks done
[Misc] Add --max-duration-sec to benchmark_serving_multi_turn.py
Labels: performance
#43215 opened May 20, 2026 by nikonyrh-siloai
3 of 4 tasks
[Model] Fix MiniCPM-V 4.6 vit_merger qkv weight loading
#43213 opened May 20, 2026 by tc-mb Contributor
[Bugfix] Fix multi-turn benchmark's sleep to match the configured request rate
Labels: bug, performance
#43212 opened May 20, 2026 by nikonyrh-siloai
3 of 4 tasks
[Bugfix][Reasoning] Properly detect reasoning end when using thinking_token_budget
Labels: bug, v1
#43210 opened May 20, 2026 by schoennenbeck Contributor
[Docs] Add drain shutdown section to Kubernetes deployment guide
Labels: documentation
#43208 opened May 20, 2026 by markmc Member
[KV Offload] Add get_request_offloading_context lifecycle hook
Labels: kv-connector, v1
#43205 opened May 20, 2026 by ronensc Contributor
4 tasks
[Cleanup]Simplify UnitaryKVCacheCoordinator hash_block_size assert
Labels: v1
#43204 opened May 20, 2026 by maang-h Contributor
[vLLM IR][Rope] Port RotaryEmbedding to IR Ops
Labels: cpu, intel-gpu, nvidia, rocm
#43199 opened May 20, 2026 by wxsIcey Contributor
4 tasks
【Feature】Modify the fps parameter when loading the multimodal model Video.
Labels: bug, multi-modality
#43198 opened May 20, 2026 by lucky-dep
[CI] De-flake test_models for bigscience/bloom-560m
Labels: ready
#43197 opened May 20, 2026 by haosdent Contributor
Update KDA chunk prefill decay to use exp2 semantics
Labels: performance, verified
#43195 opened May 20, 2026 by zexplorerhj
[Bugfix] fix device mismatch in MiniCPM-o-4_5 resampler
Labels: bug, ready
#43194 opened May 20, 2026 by yma11 Contributor
4 tasks
Extend prefix-cache soft-pin with a popular-insert signal (follow-up to #42985)
Labels: ci/build, deepseek, documentation, kv-connector, nvidia, performance, v1
#43191 opened May 20, 2026 by manueldomke
1 of 4 tasks