Conversation

@cyrusd98

Summary: GPT OSS uses a different activation function than Llama4. This diff removes the hardcoded activation function to allow general fused MoE support.

Test Plan: CI Passes

Differential Revision: D86720793

Summary: No functional changes

Test Plan: CI Passes

Differential Revision: D86799190
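
For context, the change replaces the reference MoE path's hardcoded SiLU-and-mul step with a lookup keyed by an activation string. A minimal sketch of that idea follows; the names and the set of supported activations are illustrative assumptions, not the exact vLLM helpers.

    import torch
    import torch.nn.functional as F

    # Illustrative sketch only: resolve an "activation-and-mul" function from
    # a string instead of hardcoding SiLU.
    def get_act_and_mul_fn(activation: str):
        def silu_and_mul(x: torch.Tensor) -> torch.Tensor:
            # Split the fused gate/up projection and apply act(gate) * up.
            gate, up = x.chunk(2, dim=-1)
            return F.silu(gate) * up

        def gelu_and_mul(x: torch.Tensor) -> torch.Tensor:
            gate, up = x.chunk(2, dim=-1)
            return F.gelu(gate) * up

        fns = {"silu": silu_and_mul, "gelu": gelu_and_mul}
        if activation not in fns:
            raise ValueError(f"Unsupported MoE activation: {activation}")
        return fns[activation]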

@gemini-code-assist bot left a comment

Code Review

This pull request aims to remove the hardcoded 'silu' activation in the MoE expert implementation to support more general activation functions. The changes are in the right direction, but I've identified a critical issue in torch_moe where a call to the updated torch_experts function is now incorrect, which will lead to a runtime error. I've provided a fix for this. Additionally, I've suggested a small performance optimization in torch_experts to avoid redundant function calls inside a loop.

Comment on lines +951 to +953
def apply_moe_activation(act_str, act_input):
    act_fn = get_act_and_mul_fn(act_str)
    return act_fn(act_input)

Severity: high

The apply_moe_activation helper function repeatedly calls get_act_and_mul_fn inside the loop over experts. This is inefficient. It's better to fetch the activation function once before the loop.

I suggest removing this helper function and instead initializing the activation function before the loop:

    act_fn = get_act_and_mul_fn(activation)

    for i in range(num_experts):
        mask = topk_ids == i
        if mask.sum():
            if quant_dtype is None:
                # ...
                tmp2 = act_fn(tmp1)
                # ...

This change should be applied to all calls to apply_moe_activation within the loop.
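
For illustration, here is a self-contained sketch of the hoisted pattern applied to a reference expert loop; the shapes and names are simplified assumptions (routing weights are omitted) rather than the actual torch_experts code.

    import torch

    def reference_experts(hidden, w1, w2, topk_ids, num_experts,
                          activation: str = "silu"):
        # Resolve the activation-and-mul function once, before the expert
        # loop, instead of re-resolving it for every expert.
        act_fn = get_act_and_mul_fn(activation)  # e.g. the sketch above
        out = torch.zeros_like(hidden)
        for i in range(num_experts):
            # Tokens routed to expert i (routing weights omitted for brevity).
            mask = (topk_ids == i).any(dim=-1)
            if mask.sum():
                tmp1 = hidden[mask] @ w1[i].t()  # fused gate/up projection
                tmp2 = act_fn(tmp1)              # activation-and-mul
                out[mask] = tmp2 @ w2[i].t()     # down projection
        return out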

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment on lines 1045 to 1051
    topk_weight,
    topk_ids,
    global_num_experts,
    b_bias1,
    b_bias2,
    w1_bias,
    w2_bias,
    expert_map,
)

P0: Provide the activation argument when calling torch_experts

The new activation parameter in torch_experts is required, but torch_moe still calls the helper with the old positional layout. As written, the integer global_num_experts is passed into get_act_and_mul_fn as the activation name and the bias tensors are shifted into the wrong parameters, so any test that reaches this code will raise before exercising the kernel. Please either give activation a default (e.g., "silu") or add an activation argument to torch_moe and forward it correctly.
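
One hedged sketch of what the suggested fix could look like, keeping a "silu" default so existing call sites stay valid; the parameter lists here are illustrative and not the actual signatures in vLLM's test utilities.

    # Illustrative only; the real torch_experts/torch_moe signatures differ.
    def torch_experts(a, w1, w2, topk_weight, topk_ids, global_num_experts,
                      w1_bias=None, w2_bias=None, expert_map=None,
                      activation="silu"):  # default keeps old callers working
        ...

    def torch_moe(a, w1, w2, topk_weight, topk_ids, global_num_experts,
                  w1_bias=None, w2_bias=None, expert_map=None,
                  activation="silu"):
        # Forward activation by keyword so positional arguments cannot shift
        # into the wrong parameters.
        return torch_experts(a, w1, w2, topk_weight, topk_ids,
                             global_num_experts, w1_bias, w2_bias,
                             expert_map, activation=activation)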

Summary:
Pull Request resolved: vllm-project#28503

GPT OSS uses a different activation function than Llama4. This diff removes the hardcoded activation function to allow general fused MoE support.

Test Plan: CI Passes

Differential Revision: D86720793