vllm-npu-plugin/vllm_npu at 7120cd803b4fda8c03cf96a168e4d93a24487c1b

zzh/vllm-npu-plugin

mirror of https://github.com/handsomezhuzhu/vllm-npu-plugin.git synced 2026-04-18 22:32:53 +00:00

handsomezhuzhu 7120cd803b fix: KV cache shape needs leading 2 dim for key+value pair

2026-02-10 19:27:10 +08:00

fix: KV cache shape needs leading 2 dim for key+value pair

2026-02-10 19:27:10 +08:00

feat: initial vllm-npu-plugin for Ascend NPU adaptation

2026-02-10 11:06:01 +08:00

feat: initial vllm-npu-plugin for Ascend NPU adaptation

2026-02-10 11:06:01 +08:00

fix: initialize TP/PP parallel groups after distributed environment

2026-02-10 19:14:29 +08:00

__init__.py

feat: add CUDA-to-NPU monkey patches for GPUModelRunner compatibility

2026-02-10 19:09:14 +08:00

cuda_compat.py

feat: add CUDA-to-NPU monkey patches for GPUModelRunner compatibility

2026-02-10 19:09:14 +08:00

platform.py

feat: initial vllm-npu-plugin for Ascend NPU adaptation

2026-02-10 11:06:01 +08:00