TedThemistokleous
Collaborator

Description

Remove inline default transposeHelper and ensure we use the proper check via CanUse_hipBlasTransposeHelper_MLFloat16

Motivation and Context

Required because some gfx targets need the transpose grid size to stay under the 65535 limit; otherwise the launch errors out.

libamdhip64.so errors out in newer ROCm to warn about this, but previously we would get undefined behavior if the grid size was larger than anticipated.
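
For reviewers, here is a minimal sketch of the kind of grid-size check described above. It is not the code in this PR: `kMaxGridDim`, `kTileDim`, `TilesFor`, and `GridFitsLimit` are hypothetical names, the 32-element tile is an assumption, and real code would likely query the limit from the device properties rather than hard-code it.

```cpp
#include <cstdint>

// Assumed limit, taken from the description above. Hypothetical constant:
// production code would read the limit from the device properties.
constexpr std::int64_t kMaxGridDim = 65535;

// Assumed tile edge for a tiled transpose kernel (hypothetical value).
constexpr std::int64_t kTileDim = 32;

// Number of tiles needed to cover `extent` elements (ceiling division).
constexpr std::int64_t TilesFor(std::int64_t extent) {
  return (extent + kTileDim - 1) / kTileDim;
}

// Returns true when an m x n transpose can be launched with one tile per
// block, i.e. both grid dimensions stay within the assumed limit.
constexpr bool GridFitsLimit(std::int64_t m, std::int64_t n) {
  return m > 0 && n > 0 &&
         TilesFor(n) <= kMaxGridDim &&
         TilesFor(m) <= kMaxGridDim;
}
```

Under these assumptions, a caller would use the helper only when the check returns true and take another transpose path otherwise, which is why an unconditional "true" default is unsafe on targets with this limit.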

…eck via CanUse_hipBlasTransposeHelper_MLFloat16
@TedThemistokleous TedThemistokleous self-assigned this Jan 29, 2025
@TedThemistokleous TedThemistokleous merged commit bb933d4 into rocm6.4_internal_testing Jan 29, 2025
5 of 15 checks passed
@TedThemistokleous TedThemistokleous deleted the fix_transpose_helper branch January 29, 2025 02:36
tianleiwu pushed a commit to microsoft/onnxruntime that referenced this pull request Jan 29, 2025
Remove inline default transposeHelper and ensure we use the proper check
via CanUse_hipBlasTransposeHelper_MLFloat16

Related to change in ROCm Onnxruntime repo:
ROCm#82

### Description

Required to correctly limit grid size of transpose helper kernel

### Motivation and Context
Compilation was defaulting to the inline constructor (now removed)
instead of using the overloaded case with proper checks.
The inline default "true" case was removed because it is incorrect for
newer AMD cards/targets.

Co-authored-by: Ted Themistokleous <[email protected]>
ashrit-ms pushed a commit to microsoft/onnxruntime that referenced this pull request Feb 11, 2025
guschmue pushed a commit to microsoft/onnxruntime that referenced this pull request Mar 6, 2025
ashrit-ms pushed a commit to microsoft/onnxruntime that referenced this pull request Mar 17, 2025