-
Notifications
You must be signed in to change notification settings - Fork 740
feat: Enable tuning for MMA matmul #3961
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| simple: selector(FusedMatmulSelector::Simple { | ||
| multi_rows: false, | ||
| tile_matmul: AcceleratedTileKind::Cmma, | ||
| }), | ||
| simple_mma: selector(FusedMatmulSelector::Simple { | ||
| multi_rows: false, | ||
| tile_matmul: AcceleratedTileKind::Mma, | ||
| }), | ||
| simple_multi_rows: selector(FusedMatmulSelector::Simple { | ||
| multi_rows: true, | ||
| tile_matmul: AcceleratedTileKind::Cmma, | ||
| }), | ||
| simple_multi_rows_mma: selector(FusedMatmulSelector::Simple { | ||
| multi_rows: true, | ||
| tile_matmul: AcceleratedTileKind::Mma, | ||
| }), | ||
| double_buffering: selector(FusedMatmulSelector::DoubleBuffering { | ||
| specialized: false, | ||
| tile_matmul: AcceleratedTileKind::Cmma, | ||
| }), | ||
| double_buffering_mma: selector(FusedMatmulSelector::DoubleBuffering { | ||
| specialized: false, | ||
| tile_matmul: AcceleratedTileKind::Mma, | ||
| }), | ||
| specialized: selector(FusedMatmulSelector::DoubleBuffering { | ||
| specialized: true, | ||
| tile_matmul: AcceleratedTileKind::Cmma, | ||
| }), | ||
| specialized_mma: selector(FusedMatmulSelector::DoubleBuffering { | ||
| specialized: true, | ||
| tile_matmul: AcceleratedTileKind::Mma, | ||
| }), | ||
| ordered: selector(FusedMatmulSelector::OrderedDoubleBuffering { | ||
| tile_matmul: AcceleratedTileKind::Cmma, | ||
| }), | ||
| ordered_mma: selector(FusedMatmulSelector::OrderedDoubleBuffering { | ||
| tile_matmul: AcceleratedTileKind::Mma, | ||
| }), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a lot of duplications here. I think a good refactor would be to store the selectors in a vector and select them based on indexing. Unsure how to actually do it, but that would be cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just a HashMap or BTreeMap, with the selector as the key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's something to be said for keeping it statically typed so all the variants are properly initialized. So I'd keep it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can look into, I want to refactor that project with better documentation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3961 +/- ##
==========================================
- Coverage 64.71% 64.65% -0.06%
==========================================
Files 1180 1180
Lines 140328 140452 +124
==========================================
Hits 90816 90816
- Misses 49512 49636 +124 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Pull Request Template
Checklist
cargo run-checkscommand has been executed.Related Issues/PRs
Requires tracel-ai/cubecl#1003
Changes
Adds lower level MMA matmul to tuning for both fused and unfused matmul.
Testing
All tests pass and
burn-lmworks as expected (but 25-40% faster depending on the model).