[DEV][PROFILER] Case 3: Add mask usage percentage tracking to Profiler #192
Add CHECK_LOAD_MASK_PERCENTAGE flag to Profiler to track statistics on masked vs unmasked load/store operations. This helps identify optimization opportunities where masked operations could be replaced with more efficient unmasked variants.

Changes:
- Add counters for total/masked loads and stores in Profiler
- Implement mask detection in load/store callbacks
- Add finalize() output to display mask usage statistics
- Refactor CHECK_BUFFER_LOAD into a configurable flag
- Add comprehensive test to verify mask percentage counting
def __init__(
    self,
    callpath: bool = True,
    CHECK_BUFFER_LOAD: bool = False,
lower case
I think it's fine to turn on all these options by default
    assert False, "Buffer Load optimization should be used when offsets are within 32-bit range!"

def register_op_callback(self, op_type: type[Op]) -> OpCallbacks:
    def _is_mask_all_true(mask: TensorHandle) -> bool:
You misunderstood here. I wanted to check the percentage of true values in the mask, not whether it is "all true".
For example, if the mask is [true, true, false, false],
then it's 50%.
If we have two masks, we sum up all the trues and falses across both.
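
The counting described in this comment could be sketched roughly as follows. This is only an illustration in plain Python: `MaskUsageCounter`, `record`, and `true_percentage` are hypothetical names, not part of the Profiler in this PR, and masks are assumed to be flat sequences of booleans rather than `TensorHandle` objects.

```python
from typing import Iterable


class MaskUsageCounter:
    """Hypothetical helper: accumulates true/false element counts over many masks."""

    def __init__(self) -> None:
        self.true_count = 0
        self.total_count = 0

    def record(self, mask: Iterable[bool]) -> None:
        # Count element by element; a mask is never collapsed into "all true".
        for value in mask:
            self.total_count += 1
            if value:
                self.true_count += 1

    def true_percentage(self) -> float:
        # Report 0.0 when nothing was recorded, to avoid dividing by zero.
        if self.total_count == 0:
            return 0.0
        return 100.0 * self.true_count / self.total_count


counter = MaskUsageCounter()
counter.record([True, True, False, False])
print(counter.true_percentage())  # 50.0
counter.record([True, True, True, True])
print(counter.true_percentage())  # 75.0
```

Because the counts are summed before dividing, a large mask weighs more than a small one, which matches "sum up all trues and falses" rather than averaging per-mask percentages.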
@Jokeren It's ready for review now. If no more issues, I can run a pass of case 3 checking through TritonBench.