Open
Suggestion Description
Hi, I am testing Kimi-K2-Instruct (https://huggingface.co/moonshotai/Kimi-K2-Instruct/blob/main/config.json) with TP8 on a single MI355 node. I am using the Aiter attention backend instead of the Triton backend for better performance.
It seems that Aiter's MLA path currently supports only nhead=16, which is the DeepSeek-V3/R1 (DSV3/R1) case: https://github.com/ROCm/aiter/blob/v0.1.4/aiter/mla.py#L107
Could Aiter also support nhead=8?
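For context, here is a minimal sketch of why TP8 ends up at nhead=8 for this model. It assumes tensor parallelism splits attention heads evenly across ranks, and that `num_attention_heads` is 128 for DeepSeek-V3/R1 and 64 for Kimi-K2-Instruct; both values are assumptions to check against each model's config.json. The helper names and the `ASSUMED_SUPPORTED_NHEAD` set are hypothetical and only mirror my reading of the linked line, not Aiter's actual API.

```python
# Hypothetical helpers for illustration; not part of Aiter or SGLang.

# Assumption: only nhead=16 is accepted, per my reading of aiter/mla.py at v0.1.4.
ASSUMED_SUPPORTED_NHEAD = {16}

# Assumption: values as read from each model's config.json (verify before relying on them).
ASSUMED_NUM_ATTENTION_HEADS = {
    "DeepSeek-V3/R1": 128,
    "Kimi-K2-Instruct": 64,
}


def heads_per_rank(num_attention_heads: int, tp_size: int) -> int:
    """Return the number of attention heads each tensor-parallel rank handles."""
    if num_attention_heads % tp_size != 0:
        raise ValueError("num_attention_heads must be divisible by tp_size")
    return num_attention_heads // tp_size


def is_supported(nhead: int) -> bool:
    """Check a per-rank head count against the assumed supported set."""
    return nhead in ASSUMED_SUPPORTED_NHEAD


for name, total_heads in ASSUMED_NUM_ATTENTION_HEADS.items():
    nhead = heads_per_rank(total_heads, tp_size=8)
    print(f"{name}: nhead per rank = {nhead}, supported = {is_supported(nhead)}")
```

Under these assumptions, DeepSeek-V3/R1 at TP8 lands on nhead=16 while Kimi-K2-Instruct lands on nhead=8, which is the case this request asks for.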
Steps to reproduce:
Image:
lmsysorg/sglang:v0.4.9.post2-rocm700-mi35x
Launch the server:
python3 -m sglang.launch_server --model moonshotai/Kimi-K2-Instruct --trust-remote-code --tp 8 --attention-backend aiter
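Until nhead=8 is supported, one possible workaround is to choose the backend from the per-rank head count before launching. The sketch below only builds the command line shown above; `pick_backend` and `build_launch_cmd` are hypothetical helpers, and the rule "use aiter only when nhead is 16" is an assumption based on the linked line, not a documented SGLang or Aiter behavior.

```python
import shlex


def pick_backend(nhead_per_rank: int) -> str:
    """Assumption: the Aiter MLA path is usable only when nhead per rank is 16."""
    return "aiter" if nhead_per_rank == 16 else "triton"


def build_launch_cmd(model: str, tp: int, backend: str) -> list[str]:
    """Assemble the sglang.launch_server command from the reproduce steps."""
    return [
        "python3", "-m", "sglang.launch_server",
        "--model", model,
        "--trust-remote-code",
        "--tp", str(tp),
        "--attention-backend", backend,
    ]


# Assumption: Kimi-K2-Instruct has 64 attention heads, so TP8 gives 8 per rank.
cmd = build_launch_cmd("moonshotai/Kimi-K2-Instruct", 8, pick_backend(64 // 8))
print(shlex.join(cmd))
```

This falls back to the Triton backend for Kimi-K2-Instruct at TP8, which avoids the failure but gives up the performance benefit that motivated using Aiter in the first place.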
Operating System
No response
GPU
No response
ROCm Component
No response