Status: Open
Labels: bug (Something isn't working), ecosystem: PyTorch (Issue pertains to PyTorch and related libraries), status: triage (Indicates an issue has been assigned for investigation)
Description
Problem Description
On recent TheRock + torch nightly releases, HIPBLAS_STATUS_ALLOC_FAILED is raised whenever torch performs a matrix operation on the GPU, even on small matrices.
Torch version: 2.10.0a0+rocm7.9.0rc20251007
Operating System
Ubuntu 24.04.3 LTS (Noble Numbat)
CPU
Intel(R) Core(TM) i9-14900K
GPU
AMD Radeon RX 7900 XTX
ROCm Version
7.9.0
ROCm Component
No response
Steps to Reproduce
- Install ROCm + torch following the directions here: https://github.com/ROCm/TheRock/blob/main/RELEASES.md#index-page-listing
- Run the sample reproducer:
import torch
device = torch.device("cuda")
# Create two random matrices on the GPU
A = torch.randn(128, 128, device=device)
B = torch.randn(128, 128, device=device)
# Perform matrix multiplication
C = torch.matmul(A, B)
$ python test2.py
Traceback (most recent call last):
File "test.py", line 10, in <module>
C = torch.matmul(A, B)
^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: HIPBLAS_STATUS_ALLOC_FAILED when calling `hipblasCreate(handle)`
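One quick triage step (a suggested sketch, not part of the original report): confirm that the imported torch is actually a ROCm/HIP build and that the runtime can see the GPU, since `hipblasCreate` allocation failures often trace back to the device not being visible to the process. The function name below is hypothetical.

```python
# Hypothetical diagnostic helper (not from the report): reports whether a
# ROCm-enabled torch is importable and whether a GPU is visible to it.
import importlib.util


def rocm_torch_status() -> str:
    """Return a short status string describing the torch/ROCm install."""
    # Check for torch without importing it, so this runs on any machine.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    # torch.version.hip is set on ROCm builds and is None on CUDA/CPU builds.
    hip = getattr(torch.version, "hip", None)
    if hip is None:
        return f"torch {torch.__version__} without ROCm/HIP support"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__} (HIP {hip}), but no GPU visible"
    return f"torch {torch.__version__} (HIP {hip}), {torch.cuda.device_count()} GPU(s) visible"


if __name__ == "__main__":
    print(rocm_torch_status())
```

If this reports the HIP version and a visible GPU yet `hipblasCreate` still fails, the problem is more likely in the hipBLAS handle allocation itself than in device discovery.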
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
### Additional Information
_No response_