mirror of
https://github.com/handsomezhuzhu/vllm-npu-plugin.git
synced 2026-02-20 11:42:30 +00:00
- NPUPlatform: device management, HCCL process group, config adaptation - AscendAttentionBackend: npu_fusion_attention (prefill) + npu_incre_flash_attention (decode) - NPUCommunicator: HCCL-based distributed communication - NPUWorker: NPU device init, memory profiling - Custom ops: SiluAndMul, RMS norm, rotary embedding - Plugin registered via vllm.platform_plugins entry point Based on vllm-ascend official pattern, targeting Ascend 910B
23 lines
163 B
Plaintext
23 lines
163 B
Plaintext
# Python
|
|
__pycache__/
|
|
*.py[cod]
|
|
*$py.class
|
|
*.egg-info/
|
|
dist/
|
|
build/
|
|
*.egg
|
|
|
|
# Virtual env
|
|
.venv/
|
|
venv/
|
|
|
|
# IDE
|
|
.vscode/
|
|
.idea/
|
|
*.swp
|
|
*.swo
|
|
|
|
# OS
|
|
.DS_Store
|
|
Thumbs.db
|