[Fix] Adjust use_aclgraph logic #2156

yiz-liu · 2025-08-01T01:30:21Z

What this PR does / why we need it?

Updates the FusedMoE method to determine whether to use ACL Graph based on the torchair_graph_config.

This is equivalent to #2154 on v0.9.1-dev.
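As a rough illustration of the decision described above, here is a minimal sketch. It is not the code from this PR: the diff shown in the review is truncated, so the way the torchair graph setting enters the condition is an assumption, and `should_use_aclgraph`, the `torchair_graph_enabled` argument, and the stand-in `CompilationLevel` enum (including its numeric values) are hypothetical names used only for this example.

```python
from enum import IntEnum


class CompilationLevel(IntEnum):
    """Stand-in for vLLM's CompilationLevel; values here are placeholders."""
    NO_COMPILATION = 0
    PIECEWISE = 3


def should_use_aclgraph(level: CompilationLevel,
                        enforce_eager: bool,
                        torchair_graph_enabled: bool) -> bool:
    """Decide whether the FusedMoE method should take the ACL Graph path.

    Assumption: when the torchair graph mode is enabled, ACL Graph is
    switched off; otherwise the previous condition (piecewise compilation
    and not eager mode) applies unchanged.
    """
    if torchair_graph_enabled:
        return False
    return level == CompilationLevel.PIECEWISE and not enforce_eager


if __name__ == "__main__":
    # Piecewise compilation, no eager mode, torchair graph disabled.
    print(should_use_aclgraph(CompilationLevel.PIECEWISE, False, False))
    # Same settings, but torchair graph enabled.
    print(should_use_aclgraph(CompilationLevel.PIECEWISE, False, True))
```

Keeping the check in a small pure function like this makes the three inputs explicit and easy to unit test, whatever the final condition in the patch looks like.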

Does this PR introduce any user-facing change?

None.

How was this patch tested?

None needed.

vLLM version: v0.10.0
vLLM main: vllm-project/vllm@ad57f23

Signed-off-by: Yizhou Liu <[email protected]>

github-actions · 2025-08-01T02:54:50Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.

codecov · 2025-08-01T10:23:58Z

Codecov Report

❌ Patch coverage is 83.33333% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 76.67%. Comparing base (8cf97d8) to head (71246f6).
⚠️ Report is 12 commits behind head on main.

Files with missing lines	Patch %	Lines
vllm_ascend/ops/common_fused_moe.py	80.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2156      +/-   ##
==========================================
- Coverage   76.67%   76.67%   -0.01%     
==========================================
  Files         107      107              
  Lines       11968    11972       +4     
==========================================
+ Hits         9177     9180       +3     
- Misses       2791     2792       +1

Flag	Coverage Δ
unittests	`76.67% <83.33%> (-0.01%)`	⬇️

ApsarasX · 2025-08-02T03:07:50Z

vllm_ascend/ops/common_fused_moe.py

@@ -33,7 +34,15 @@ def unquantized_fused_moe_init_func(self, *args, **kwargs):
    original_unquantized_fused_moe_init_func(self, *args, **kwargs)
    vllm_config = get_current_vllm_config()
    self.max_num_batched_tokens = vllm_config.scheduler_config.max_num_batched_tokens
-    self.use_aclgraph = vllm_config.compilation_config.level == CompilationLevel.PIECEWISE and not vllm_config.model_config.enforce_eager
+
+    ascend_config = get_ascend_config()


I suggest referring to #1841 and adding a new class called AscendUnquantizedFusedMoEMethod, incorporating the ACLGraph switch into this class's __init__ method, while also placing the new forward_oot method within this class.
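One possible shape for this suggestion is sketched below. It is only an illustration: `UnquantizedFusedMoEMethod` is a stand-in base class rather than the real vLLM class, the constructor arguments (`piecewise`, `enforce_eager`, `torchair_graph_enabled`) are hypothetical, and `forward_oot` just reports which path would be taken instead of running any kernel.

```python
class UnquantizedFusedMoEMethod:
    """Stand-in base class for this sketch; not the real vLLM class."""

    def forward_oot(self, x):
        raise NotImplementedError


class AscendUnquantizedFusedMoEMethod(UnquantizedFusedMoEMethod):
    """Hypothetical subclass that owns the ACL Graph switch."""

    def __init__(self, *, piecewise: bool, enforce_eager: bool,
                 torchair_graph_enabled: bool):
        super().__init__()
        # Assumption: ACL Graph is used only with piecewise compilation,
        # outside eager mode, and when torchair graph mode is disabled.
        self.use_aclgraph = (piecewise and not enforce_eager
                             and not torchair_graph_enabled)

    def forward_oot(self, x):
        # Placeholder dispatch: a real method would call the matching
        # fused MoE kernel; here we only name the selected path.
        path = "aclgraph" if self.use_aclgraph else "eager"
        return path, x


if __name__ == "__main__":
    method = AscendUnquantizedFusedMoEMethod(
        piecewise=True, enforce_eager=False, torchair_graph_enabled=False)
    print(method.forward_oot([1.0, 2.0]))
```

Putting the switch in `__init__` keeps the decision next to the method that consumes it, which is the point of the suggestion; the actual refactor may differ.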

@ApsarasX It's on the to-do list. We'll first combine common_fused_moe and fused_moe into one, then refactor it to use custom op registration.

OK

wangxiyuan · 2025-08-04T07:23:07Z

Let's merge this first to fix the known bug. As a next step, we should refactor the fused MoE code completely.

[Fix] Adjust use_aclgraph logic

71246f6

Signed-off-by: Yizhou Liu <[email protected]>

yiz-liu force-pushed the main-fix-oom branch from f343ea0 to 71246f6 on August 1, 2025 01:34

github-actions bot added the module:ops label Aug 1, 2025

ApsarasX suggested changes Aug 2, 2025

ApsarasX approved these changes Aug 4, 2025

wangxiyuan approved these changes Aug 4, 2025

wangxiyuan merged commit a9480d5 into vllm-project:main Aug 4, 2025
25 checks passed
