Conversation

k50112113 (Contributor) commented:

This PR adds two fused Triton kernels:

  1. fused_gemm_a8w8_blockscale_a16w16
  2. fused_reduce_act_mul_fp8_group_quant

These kernels implement the fused shared-experts path for DeepSeek-V3 (DSV3) on vLLM.

The same changes are also PRed against 355_wip: #1217
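Since the PR text does not show the kernel internals, here is a minimal NumPy sketch of what a kernel named fused_reduce_act_mul_fp8_group_quant plausibly computes. Everything beyond the kernel name is an assumption (SiLU as the activation, FP8 e4m3 dynamic range of ±448, group size 128, a sum over partial outputs as the "reduce" step, and no rounding to the actual FP8 grid); the real Triton kernel would fuse these steps in one pass and emit true FP8 values:

```python
import numpy as np

# Assumptions (not confirmed by the PR): SiLU activation, FP8 e4m3 range
# (max magnitude 448), group size 128, and a sum over partial outputs as
# the "reduce" step. Rounding to the actual FP8 grid is omitted.
FP8_E4M3_MAX = 448.0

def silu(x):
    return x / (1.0 + np.exp(-x))

def reduce_act_mul_fp8_group_quant(gate_partials, up_partials, group_size=128):
    """Sketch: reduce partials, apply SiLU(gate) * up, then per-group
    symmetric quantization into the FP8 e4m3 dynamic range."""
    gate = gate_partials.sum(axis=0)          # assumed reduce step
    up = up_partials.sum(axis=0)
    y = silu(gate) * up                       # fused activation * multiply
    n = y.shape[-1]
    assert n % group_size == 0
    g = y.reshape(*y.shape[:-1], n // group_size, group_size)
    scale = np.abs(g).max(axis=-1, keepdims=True) / FP8_E4M3_MAX
    scale = np.maximum(scale, 1e-12)          # guard all-zero groups
    q = np.clip(g / scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q.reshape(y.shape), scale.squeeze(-1)

rng = np.random.default_rng(0)
gate_p = rng.standard_normal((2, 4, 256)).astype(np.float32)
up_p = rng.standard_normal((2, 4, 256)).astype(np.float32)
q, s = reduce_act_mul_fp8_group_quant(gate_p, up_p)
# Dequantizing recovers the fused output up to float rounding
# (no FP8 grid snapping in this sketch).
recon = (q.reshape(4, 2, 128) * s[..., None]).reshape(4, 256)
ref = silu(gate_p.sum(0)) * up_p.sum(0)
err = np.abs(recon - ref).max()
```

Fusing these stages into one kernel avoids materializing the intermediate bf16/fp16 activation tensor in global memory, which is the usual motivation for an act-mul-quant fusion in MoE paths.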

Commit: documentation, fix some bugs, UT

k50112113 force-pushed the shaoclee/355_wip_triton_fused_shared_expert branch from ba165a5 to 99fda8d on October 22, 2025 at 22:39.
k50112113 merged commit c0e8d91 into 355_wip_triton on October 22, 2025 (3 of 5 checks passed).
k50112113 deleted the shaoclee/355_wip_triton_fused_shared_expert branch on October 22, 2025 at 22:40.
k50112113 added a commit referencing this pull request on October 23, 2025: documentation, fix some bugs, UT.
