[Fix] Adjust use_aclgraph logic #2156


Merged: 1 commit into vllm-project:main on Aug 4, 2025

Conversation

yiz-liu (Contributor) commented Aug 1, 2025

What this PR does / why we need it?

Updates the FusedMoE method to determine whether to use the ACL graph based on the `torchair_graph_config`.

This is equivalent to #2154 on v0.9.1-dev.

Does this PR introduce any user-facing change?

None.

How was this patch tested?

None needed.


github-actions bot commented Aug 1, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message so it fulfills the PR description, helping reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.


codecov bot commented Aug 1, 2025

Codecov Report

❌ Patch coverage is 83.33333% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 76.67%. Comparing base (8cf97d8) to head (71246f6).
⚠️ Report is 12 commits behind head on main.

Files with missing lines              Patch %   Lines
vllm_ascend/ops/common_fused_moe.py   80.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2156      +/-   ##
==========================================
- Coverage   76.67%   76.67%   -0.01%     
==========================================
  Files         107      107              
  Lines       11968    11972       +4     
==========================================
+ Hits         9177     9180       +3     
- Misses       2791     2792       +1     
Flag        Coverage Δ
unittests   76.67% <83.33%> (-0.01%) ⬇️

Flags with carried-forward coverage won't be shown.


```diff
@@ -33,7 +34,15 @@ def unquantized_fused_moe_init_func(self, *args, **kwargs):
     original_unquantized_fused_moe_init_func(self, *args, **kwargs)
     vllm_config = get_current_vllm_config()
     self.max_num_batched_tokens = vllm_config.scheduler_config.max_num_batched_tokens
     self.use_aclgraph = vllm_config.compilation_config.level == CompilationLevel.PIECEWISE and not vllm_config.model_config.enforce_eager

     ascend_config = get_ascend_config()
```
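
For readers without the full diff: the change presumably extends the ACL-graph condition so that enabling the TorchAir graph disables the ACL graph. A minimal sketch of that adjusted logic, assuming `torchair_graph_config` exposes an `enabled` flag (the field name and import path are assumptions here, not copied from the hunk above):

```python
from vllm.config import CompilationLevel, get_current_vllm_config

from vllm_ascend.ascend_config import get_ascend_config  # assumed import path


def should_use_aclgraph() -> bool:
    """Sketch of the adjusted check: the ACL graph requires piecewise
    compilation, enforce_eager disabled, and TorchAir graph mode off."""
    vllm_config = get_current_vllm_config()
    ascend_config = get_ascend_config()
    return (
        vllm_config.compilation_config.level == CompilationLevel.PIECEWISE
        and not vllm_config.model_config.enforce_eager
        # The fix: TorchAir graph mode and the ACL graph are mutually
        # exclusive, so an enabled torchair_graph_config turns it off.
        and not ascend_config.torchair_graph_config.enabled
    )
```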
ApsarasX (Collaborator) commented:
I suggest referring to #1841 and adding a new class called `AscendUnquantizedFusedMoEMethod`, incorporating the ACL graph switch into this class's `__init__` method, while also placing the new `forward_oot` method within this class.
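
A rough sketch of the shape this suggestion implies: the class and method names (`AscendUnquantizedFusedMoEMethod`, `__init__`, `forward_oot`) come from the comment, while the base class, import paths, and config fields below are assumptions for illustration only:

```python
from vllm.model_executor.layers.fused_moe.layer import UnquantizedFusedMoEMethod

from vllm_ascend.ascend_config import get_ascend_config  # assumed import path


class AscendUnquantizedFusedMoEMethod(UnquantizedFusedMoEMethod):
    """Sketch of the proposed refactor: own the ACL-graph switch in
    __init__ and keep the NPU-specific path in forward_oot."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        ascend_config = get_ascend_config()
        # Decide once at init time whether the ACL graph path is usable
        # (assumed config field, mirroring the diff context above).
        self.use_aclgraph = not ascend_config.torchair_graph_config.enabled

    def forward_oot(self, *args, **kwargs):
        # The NPU-specific forward dispatch would live here in the real
        # refactor; left unimplemented in this illustrative sketch.
        raise NotImplementedError("illustrative sketch only")
```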

Collaborator:

@ApsarasX it's on the todo list. We'll first combine `common_fused_moe` and `fused_moe` into one, then refactor to the CustomOp registration approach.


ApsarasX: OK

wangxiyuan (Collaborator) commented:

Let's merge this first to fix the known bug. As a next step, we should refactor the fused MoE entirely.

wangxiyuan merged commit a9480d5 into vllm-project:main on Aug 4, 2025
25 checks passed