Releases: ROCm/hipBLAS

hipBLAS 0.45.0 for ROCm 4.3.0

30 Jul 22:52
63afcb3

Added

  • Added hipblasStatusToString
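A status-to-string helper of this kind is typically used to turn a numeric status into a readable error message. The sketch below is self-contained and does not include the hipBLAS header: `DemoStatus`, `demo_status_to_string`, and `describe_failure` are hypothetical stand-ins chosen for illustration, not hipBLAS names, and the real enum has different values.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for a library status enum (NOT the real hipblasStatus_t).
enum DemoStatus : int
{
    DEMO_STATUS_SUCCESS       = 0,
    DEMO_STATUS_INVALID_VALUE = 1,
    DEMO_STATUS_NOT_SUPPORTED = 2
};

// Map each status to its name; unrecognized values get a fallback string
// so the caller never receives a null pointer.
const char* demo_status_to_string(DemoStatus status)
{
    switch(status)
    {
    case DEMO_STATUS_SUCCESS:
        return "DEMO_STATUS_SUCCESS";
    case DEMO_STATUS_INVALID_VALUE:
        return "DEMO_STATUS_INVALID_VALUE";
    case DEMO_STATUS_NOT_SUPPORTED:
        return "DEMO_STATUS_NOT_SUPPORTED";
    }
    return "DEMO_STATUS_UNKNOWN";
}

// Build a readable message for a failed call (hypothetical helper).
std::string describe_failure(const char* call, DemoStatus status)
{
    return std::string(call) + " failed: " + demo_status_to_string(status);
}
```

With the real library, the same pattern would presumably wrap each call's returned status and pass it to hipblasStatusToString when it is not the success value.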

Fixed

  • Added catch() blocks around API calls to prevent C++ exceptions from leaking out of the C API
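The general technique behind this fix can be sketched as follows: each C entry point wraps its body in a try block and converts any exception into a status code, so nothing propagates across the C boundary. This is a minimal illustration, not the hipBLAS source; `ExampleStatus`, `exception_to_status`, `internal_scale`, and `example_scal` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

// Hypothetical status codes standing in for a C API's status enum.
enum ExampleStatus
{
    EXAMPLE_SUCCESS        = 0,
    EXAMPLE_ALLOC_FAILED   = 1,
    EXAMPLE_INTERNAL_ERROR = 2
};

// Translate the exception currently being handled into a status code.
// Must only be called from inside a catch block.
inline ExampleStatus exception_to_status() noexcept
{
    try
    {
        throw; // rethrow the active exception so it can be classified
    }
    catch(const std::bad_alloc&)
    {
        return EXAMPLE_ALLOC_FAILED;
    }
    catch(...)
    {
        return EXAMPLE_INTERNAL_ERROR;
    }
}

// Hypothetical internal C++ routine that may throw.
static void internal_scale(int n, float alpha, float* x)
{
    if(n < 0)
        throw std::invalid_argument("n must be non-negative");
    for(int i = 0; i < n; ++i)
        x[i] *= alpha;
}

// C entry point: the function-try-block guarantees no exception escapes.
extern "C" ExampleStatus example_scal(int n, float alpha, float* x)
try
{
    internal_scale(n, alpha, x);
    return EXAMPLE_SUCCESS;
}
catch(...)
{
    return exception_to_status();
}
```

Centralizing the translation in one helper keeps every entry point's catch clause to a single line, which is one common way to apply the guard uniformly across a large API surface.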

hipBLAS-0.44.0 for ROCm 4.2.0

10 May 23:17
00b0358

Added

  • Updated hipBLAS to work with the rocBLAS gemm_ex changes. When using the rocBLAS backend, hipBLAS queries the preferred
    layout of int8 data to be passed to gemm_ex and passes in the resulting flag. Users must use the preferred
    data format when calling gemm_ex with a rocBLAS backend.
  • Added hipblas-bench with support for:
    • copy, swap, scal

hipBLAS-0.42.0 for ROCm 4.1.0

23 Mar 01:18
745b744

Added

  • Added the following functions. All added functions include batched and strided-batched support with rocBLAS backend:
    • axpy_ex
    • dot_ex
    • nrm2_ex
    • rot_ex
    • scal_ex

Fixed

  • Fixed complex unit test bug caused by incorrect caxpy and zaxpy function signatures

Known Issues

  • None

hipBLAS-0.38.0 for ROCm 4.0.0

18 Dec 15:22
400d551

New Features

  • No new features

Known Issues

  • None

hipBLAS-0.38.0 for ROCm 3.10.0

30 Nov 17:02
400d551

New Features

  • Added hipblasSetAtomicsMode and hipblasGetAtomicsMode
  • No longer look for CUDA backend unless --cuda build flag is passed

Known Issues

  • None

hipBLAS-0.36.0 for ROCm 3.9.0

27 Oct 20:13
e4d9e7b

New Features

Known Issues

  • None

hipBLAS-0.34.0 for ROCm 3.8.0

18 Sep 21:32
50b865f

New Features

  • No new features

Known Issues

  • None

hipBLAS-0.32.0 for ROCm 3.7.0

15 Aug 04:26
abd7261

New Features

  • Improvements to rocblas_Xgemm_batched performance for small m, n, k.
  • Improvements to rocblas_Xgemv_batched and rocblas_Xgemv_strided_batched performance for small m (QMCPACK use).
  • Improvements to rocblas_Xdot (batched and non-batched) performance when both incx and incy are 1.
  • FP32 ONNX BERT MI50 performance improved by 28%.
  • FP32 BDAS MI50/MI60 performance improved significantly.
  • Added substitution method for small trsm sizes with m <= 64 && n <= 64. Increases performance drastically for small batched trsm.
  • Add Fortran interface for BLAS 1, BLAS 2, BLAS 3
  • Add tbsv, tbsv_batched, and tbsv_strided_batched
  • Add hemm, hemm_batched, and hemm_strided_batched
  • Add symm, symm_batched, and symm_strided_batched
  • Add complex versions of geam, along with geam_batched and geam_strided_batched
  • Add gemm_batched_ex and gemm_strided_batched_ex

Known Issues

  • None

hipBLAS-0.30.0 for ROCm 3.6.0

11 Jul 00:38
cb328a4

New Features

  • Add Fortran interface for BLAS 1, BLAS 2, BLAS 3
  • Add tbsv, tbsv_batched, and tbsv_strided_batched
  • Add hemm, hemm_batched, and hemm_strided_batched
  • Add symm, symm_batched, and symm_strided_batched
  • Add complex versions of geam, along with geam_batched and geam_strided_batched
  • Add gemm_batched_ex and gemm_strided_batched_ex

Known Issues

  • None

hipBLAS-0.28.0 for ROCm 3.5.0

01 Jun 19:52
7d2cb89

New Features

  • Switched to hip-clang as default compiler and deprecated hcc build

Known Issues

  • None