Pull requests: vllm-project/vllm

- #37193: [LoRA] Add LoRA support for Qwen3OmniMoeThinkerForConditionalGeneration (label: qwen), opened Mar 16, 2026 by pratapyash
- #37192: WIP: [Feature] KVCACHE NVFP4 (label: v1), opened Mar 16, 2026 by JartX
- #37188: [Performance] Enable Triton autotuning disk cache by default, opened Mar 16, 2026 by arpera
- #37181: Add ability to replace oot ops when using lora, opened Mar 16, 2026 by kyuyeunk
- #37178: Bugfix for offloading+prefetch for GLM-4.7-FP8 (label: bug), opened Mar 16, 2026 by sfbemerk
- #37172: Fix KV cache size estimation regression in v0.17+ (label: v1), opened Mar 16, 2026 by xueliangyang-oeuler
- #37169: [Bugfix] Fix "Already borrowed" tokenizer race in Hermes tool parser (label: bug), opened Mar 16, 2026 by stonelazy
- #37158: [Bugfix] Fix mock.patch resolution failure for standalone_compile.FakeTensorMode on Python <= 3.10 (label: bug), opened Mar 16, 2026 by dbari
- #37157: [openapi] remove redundant exception stack trace [4/N] (label: frontend), opened Mar 16, 2026 by andyxning
- #37156: Fix issue #37037 (label: v1), opened Mar 16, 2026 by xueliangyang-oeuler