zzh/vllm-npu-plugin
Mirror of https://github.com/handsomezhuzhu/vllm-npu-plugin.git (synced 2026-02-20 19:50:15 +00:00)
vllm-npu-plugin/vllm_npu at commit 37af1ddc1ff26978f6638bdd7934e97680dc94fd
Latest commit: 37af1ddc1f by handsomezhuzhu, 2026-02-10 20:42:47 +08:00
fix: use npu_fusion_attention loop (BSND) for prefill_no_cache to fix crash
Name            Last commit message                                                          Date
attention       fix: use npu_fusion_attention loop (BSND) for prefill_no_cache to fix crash  2026-02-10 20:42:47 +08:00
distributed     feat: initial vllm-npu-plugin for Ascend NPU adaptation                      2026-02-10 11:06:01 +08:00
ops             feat: initial vllm-npu-plugin for Ascend NPU adaptation                      2026-02-10 11:06:01 +08:00
worker          fix: add initialize_cache method to NPU worker                               2026-02-10 19:42:32 +08:00
__init__.py     feat: add CUDA-to-NPU monkey patches for GPUModelRunner compatibility        2026-02-10 19:09:14 +08:00
cuda_compat.py  feat: add CUDA-to-NPU monkey patches for GPUModelRunner compatibility        2026-02-10 19:09:14 +08:00
platform.py     feat: initial vllm-npu-plugin for Ascend NPU adaptation                      2026-02-10 11:06:01 +08:00