Pull requests: vllm-project/vllm
[Kernel] Use self.kv_cache and forward_context.attn_metadata in Attention.forward
#12536
opened Jan 29, 2025 by
heheda12345
[WIP][AMD][Kernel][Quantization] Add fp8 and int8 support for Triton FAv2 kernel
documentation
[RFC][vllm-API] Support tokenizer registry for customized tokenizer in vLLM
#12518
opened Jan 28, 2025 by
youngkent
Fix: Respect sparsity_config.ignore in Cutlass Integration
#12517
opened Jan 28, 2025 by
rahul-tuli
[Misc] Raise error when flashinfer is not installed and VLLM_ATTENTION_BACKEND is set
#12513
opened Jan 28, 2025 by
NickLucche
[ROCm] [Feature] [Doc] [Dockerfile] Support Per-Token-Activation Per-Channel-Weight FP8 Quantization Inferencing
ci/build
documentation
rocm
#12501
opened Jan 28, 2025 by
tjtanaa
[Bugfix] Fix Deepseek V3 Crash When max_num_batched_tokens is Very Large
#12491
opened Jan 28, 2025 by
Concurrensee
[BUGFIX] Skip tokenization support for throughput benchmark
#12489
opened Jan 28, 2025 by
maleksan85
[Core] Handle conflicting features more gracefully
frontend
#12484
opened Jan 27, 2025 by
russellb
[CI/Build] Better default num jobs heuristic
ci/build
ready
#12477
opened Jan 27, 2025 by
LucasWilkinson
Bump helm/chart-testing-action from 2.6.1 to 2.7.0
ci/build
dependencies
github_actions
#12463
opened Jan 27, 2025 by
dependabot (bot)
Bump actions/stale from 9.0.0 to 9.1.0
ci/build
dependencies
github_actions
#12462
opened Jan 27, 2025 by
dependabot (bot)
whisper-async working poc
frontend
needs-rebase
#12458
opened Jan 26, 2025 by
robertgshaw2-redhat
Draft
[Misc] Separate hf dataset sampling function from benchmark_serving.py
#12447
opened Jan 26, 2025 by
Isotr0py