apsonawane
Contributor

This PR adds a block-wise quantization kernel for QMoE on CPU.

expert_out = expert_layer(token_vec)
contrib = expert_out[0, k].item() * topk_soft[0, idx_e].item()
print(f"Expert {e} contrib at hidden {k}: {contrib}")
except Exception as _:

Check notice

Code scanning / CodeQL

Empty except Note test

'except' clause does nothing but pass and there is no explanatory comment.
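
A minimal sketch of one way to address a notice like this, not the change actually made in the PR: catch only the exception that is expected and record why the diagnostic was skipped, rather than passing silently. The helper name `expert_contribution`, the logger name, and the use of plain Python lists in place of the tensors from the test are all assumptions made for illustration.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("qmoe_debug")


def expert_contribution(expert_out, topk_weights, k, idx_e):
    """Return expert_out[k] scaled by the routing weight topk_weights[idx_e].

    Plain lists stand in for the tensors indexed in the reviewed snippet.
    Returns None when either index is out of range.
    """
    try:
        return expert_out[k] * topk_weights[idx_e]
    except IndexError as exc:
        # Debug-only diagnostic: explain the skip instead of an empty except.
        logger.debug("skipping contribution (k=%d, idx_e=%d): %s", k, idx_e, exc)
        return None


contrib = expert_contribution([1.0, 2.0], [0.25, 0.75], k=1, idx_e=0)
print(f"Expert contrib at hidden 1: {contrib}")
```

Narrowing the handler to `IndexError` keeps unrelated failures visible, and the log call doubles as the explanatory comment the check asks for.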
apsonawane merged commit 5d17734 into main on Sep 15, 2025
89 of 92 checks passed
apsonawane deleted the asonawane/block-wise branch on September 15, 2025, 15:32
@apsonawane
Contributor Author

All review comments and suggested improvements are addressed in PR #26048.

snnn pushed a commit that referenced this pull request Sep 15, 2025
adrianlizarraga pushed a commit that referenced this pull request Sep 24, 2025
adrianlizarraga pushed a commit that referenced this pull request Sep 26, 2025
snnn pushed a commit that referenced this pull request Sep 27, 2025
### Description
Adds the following commits to the `rel-1.23.1` branch for ORT 1.23.1:


- add session_id_ to LogEvaluationStart/Stop, LogSessionCreationStart
  - main merge date: July 31, 1:05am
  - pr: #25590
  - commit: e753643
- [build] fix WebAssembly build on macOS/arm64
  - main merge date: Aug 5, 8:07am
  - pr: #25653
  - commit: 53f152b
- [CPU] MoE Kernel (#25958)
  - main merge date: Sept 10, 4:54pm
  - pr: #25958
  - commit: 930e640
- [CPU] Block-wise QMoE kernel for CPU
  - main merge date: Sept 15, 8:32am
  - pr: #26009
  - commit: 5d17734
- [C#] Implement missing APIs
  - main merge date: Sept 24, 10:50am
  - pr: #26101
  - commit: 35dcab5
- Regenerate test model with ONNX IR < 12
  - main merge date: Sept 24, 2:50pm
  - pr: #26149
  - commit: 88f2652
- [CPU] Fix compilation errors because of unused variables
  - main merge date: Sept 25, 1:21pm
  - pr: #26147
  - commit: 42fcd71
- [EP ABI] Check if nodes specified in GetCapability() have already been assigned
  - main merge date: Sept 26, 1:24am
  - pr: #26156
  - commit: 67d3ba0
- [QNN EP] Add dynamic option to set HTP performance mode
  - main merge date: Sept 26, 11:55am
  - pr: #26135
  - commit: 6cc40fd

---------

Co-authored-by: xieofxie
Co-authored-by: hualxie
Co-authored-by: Yulong Wang
Co-authored-by: Akshay Sonawane
Co-authored-by: Dmitri Smirnov
Co-authored-by: Edward Chen
Co-authored-by: quic-tirupath
Co-authored-by: quic-ashwshan
@snnn
Member

snnn commented Sep 27, 2025

This PR has been cherry-picked into the rel-1.23.1 branch in PR #26182. Removing the release:1.23.1 label.

4 participants