mark14wu
Collaborator

Add CHECK_LOAD_MASK_PERCENTAGE flag to Profiler to track statistics on masked vs unmasked load/store operations. This helps identify optimization opportunities where masked operations could be replaced with more efficient unmasked variants.

Changes:

  • Add counters for total/masked loads and stores in Profiler
  • Implement mask detection in load/store callbacks
  • Add finalize() output to display mask usage statistics
  • Refactor CHECK_BUFFER_LOAD into a configurable flag
  • Add comprehensive test to verify mask percentage counting

def __init__(
    self,
    callpath: bool = True,
    CHECK_BUFFER_LOAD: bool = False,
Member

Please use lower case for the parameter name.

Member

I think it's fine to turn on all these options by default.

assert False, "Buffer Load optimization should be used when offsets are within 32-bit range!"

def register_op_callback(self, op_type: type[Op]) -> OpCallbacks:
    def _is_mask_all_true(mask: TensorHandle) -> bool:
Member

You misunderstood here. I wanted to check the percentage of true values in the mask, not whether the mask is "all true".

For example, if the mask is [true, true, false, false], then it's 50%.
If we have two masks, we sum up all the trues and falses across both.
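The element-level accounting requested here could look something like the sketch below. It is only an illustration under stated assumptions: `MaskTrueRatio` and its methods are hypothetical names, and masks are modeled as flat Python sequences of booleans rather than the project's `TensorHandle` objects.

```python
from typing import Sequence


class MaskTrueRatio:
    """Hypothetical accumulator for the share of true elements across masks."""

    def __init__(self) -> None:
        self.true_count = 0   # true elements seen so far, over all masks
        self.total_count = 0  # all elements seen so far, over all masks

    def update(self, mask: Sequence[bool]) -> None:
        # Add this mask's trues and its total element count to the running sums.
        self.true_count += sum(1 for m in mask if m)
        self.total_count += len(mask)

    def percentage(self) -> float:
        # Report 0.0 when no mask elements have been recorded yet.
        if self.total_count == 0:
            return 0.0
        return 100.0 * self.true_count / self.total_count


ratio = MaskTrueRatio()
ratio.update([True, True, False, False])
print(ratio.percentage())  # 50.0

ratio.update([True, True, True, True])
print(ratio.percentage())  # 75.0 (6 trues out of 8 elements)
```

Summing raw counts rather than averaging per-mask percentages means larger masks weigh proportionally more, which matches the "sum up all trues and falses" description.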

mark14wu marked this pull request as ready for review on October 18, 2025, 03:32
mark14wu changed the title from "[DEV][PROFILER] Add mask usage percentage tracking to Profiler" to "[DEV][PROFILER] Case 3: Add mask usage percentage tracking to Profiler" on Oct 18, 2025

mark14wu
Collaborator Author

@Jokeren It's ready for review now. If there are no more issues, I can run a pass of the case 3 check over TritonBench.

mark14wu merged commit ca09f14 into main on Oct 19, 2025
1 check passed
mark14wu deleted the profiler/mask_and_load_percentage branch on October 19, 2025, 18:40