Pull requests: alibaba/rtp-llm
Pull requests list
feat(cache): migrate DSV4 KV cache infrastructure from feat/dsv4_on_d…
#1103
opened Jun 15, 2026 by
Adrenaline-S
Collaborator
Draft
[ROCm] Prefill performance optimization for embedding models
#1102
opened Jun 15, 2026 by
liaocz
Collaborator
fix: size flashinfer prefill workspace dynamically
#1099
opened Jun 15, 2026 by
Vinkle-hzt
Collaborator
fix(moe): avoid cross-warp stale read in ep_scatter_1 prefix sum
#1098
opened Jun 15, 2026 by
HoniiTro19
feat(online_optimizer) support theoretical hit rate statistics in flexlb
#1095
opened Jun 12, 2026 by
YoungRX
Collaborator
perf: remove redundent kv cache update for finished stream
#1092
opened Jun 11, 2026 by
zhangjianning-zjn
Collaborator
fix: fix 1s stall during polling output of generate stream
#1091
opened Jun 11, 2026 by
zhangjianning-zjn
Collaborator
fix: fix metric reporting on waiting time of generate stream
#1090
opened Jun 11, 2026 by
zhangjianning-zjn
Collaborator
perf(rocm): enable FlyDSL fused MoE for MI308X Qwen3.5 decode
#1087
opened Jun 11, 2026 by
chengshu-lcc
Collaborator
feat(rocm): support Qwen3/3.5 VL model on ROCm
#1086
opened Jun 11, 2026 by
liaocz
Collaborator
feat(moe): fuse shared expert into MoE kernel for ROCm TP-only mode
#1085
opened Jun 11, 2026 by
chengshu-lcc
Collaborator
feat: add prompt scoring (per-position logits for input sequences)
#1081
opened Jun 10, 2026 by
theNiemand
Collaborator
feat(xpu): Python device abstraction and server integration (3/4)
#1075
opened Jun 8, 2026 by
aslanxie
feat(omni): add Qwen2.5-Omni multi-stage pipeline support
#1074
opened Jun 7, 2026 by
stmatengss
1 of 3 tasks
feat(new_weight_loader): introduce new weight loader framework with FP8 quantization support for Qwen2
#1072
opened Jun 5, 2026 by
Oneydauh
feat(xpu): C++ device generalization for Intel XPU support (2/4)
#1071
opened Jun 5, 2026 by
aslanxie
feat(p2p): decode_entrance P2P support with lease-based race fix
#1067
opened Jun 4, 2026 by
ZhihanYan
Collaborator
7 tasks