Merged
62 commits
66efa4c
Add stream-based prototype
j-stephan Aug 21, 2025
76c7808
Add graph-capturing prototype
j-stephan Aug 27, 2025
e3549f8
Refactoring and geometry fixes
j-stephan Sep 1, 2025
1586d52
Disable graph capture tutorial on Windows
j-stephan Sep 2, 2025
b0be7aa
Graph creation and smaller fixes
j-stephan Sep 4, 2025
0297069
Address review comments
j-stephan Sep 4, 2025
80210d4
CMake fixes
j-stephan Sep 4, 2025
5c38269
Refactor and align all three versions
j-stephan Sep 4, 2025
cabd8eb
Use a phantom by default
j-stephan Sep 15, 2025
7e93ca4
Correct graph layout and bugfixes
j-stephan Sep 16, 2025
c385be0
Fix missing `<cstring>` include
j-stephan Sep 17, 2025
84db17f
Fix datatype for std::max
j-stephan Sep 17, 2025
851eafd
Fix identifier naming
j-stephan Sep 17, 2025
943f8c8
Small compatibility fixes
j-stephan Sep 17, 2025
6fe56f8
Explicit casts
j-stephan Sep 17, 2025
6a2cc82
Fix UB
j-stephan Sep 18, 2025
5d49004
Allow concurrent backprojection kernels
j-stephan Sep 22, 2025
4f91f93
Bug fix
j-stephan Sep 22, 2025
deaab0d
Use different capturing approach
j-stephan Sep 22, 2025
877a024
Second try
j-stephan Sep 22, 2025
5bc6f06
Fix double graph bug
j-stephan Sep 25, 2025
b7ceefe
Sphinx annotations
j-stephan Sep 26, 2025
fa0ad25
Add timing information
j-stephan Sep 26, 2025
e54f588
Remove dataset option
j-stephan Sep 26, 2025
8bd18a7
Remove subvolume feature
j-stephan Sep 26, 2025
5582bc4
Disable synchronization before graph update
j-stephan Sep 27, 2025
b0dee3d
Remove synchronization points and improve timing
j-stephan Sep 29, 2025
fafc219
Improve status messages
j-stephan Sep 30, 2025
1ad138a
Synchronize before exiting
j-stephan Sep 30, 2025
916ef13
Improve parallelism
j-stephan Sep 30, 2025
92673ae
One graph per branch
j-stephan Sep 30, 2025
06175bd
Evil node-level hackery
j-stephan Sep 30, 2025
d5dc854
Serialize everything
j-stephan Sep 30, 2025
b229964
Speed up dataset generation
j-stephan Oct 1, 2025
a9784b7
Disable node sorting
j-stephan Oct 1, 2025
e4f2842
Revert previous commit
j-stephan Oct 1, 2025
8960363
More sphinx markers
j-stephan Oct 1, 2025
ee03c08
Update libTIFF version
j-stephan Oct 1, 2025
5c73d5f
Add sphinx markers
j-stephan Oct 2, 2025
099ef9d
Align CMake
j-stephan Oct 8, 2025
a6c0675
Update READMEs and gitignore
j-stephan Oct 8, 2025
b974bc2
Code hygiene
j-stephan Oct 8, 2025
f0d5c2e
Support GPUs without texture instructions
j-stephan Oct 10, 2025
3d5a7e6
Explicitly install git
j-stephan Oct 10, 2025
4024033
Explicitly install hipFFT
j-stephan Oct 10, 2025
827497c
Install libTIFF
j-stephan Oct 10, 2025
b75d5fa
Print duration.count() instead of duration
j-stephan Oct 10, 2025
d0bcb10
Remove <omp.h>
j-stephan Oct 10, 2025
7005820
Enable separable compilation and remove link to hip::host
j-stephan Oct 10, 2025
44b313e
Mirror hipFFT examples lookup of libraries and flags
j-stephan Oct 13, 2025
fea6fdb
Always use CONFIG mode for find_package
j-stephan Oct 13, 2025
976d2d0
Ensure CMake understands we want to compile for NVIDIA
j-stephan Oct 13, 2025
72842c2
Next try
j-stephan Oct 13, 2025
ff0fa26
Compile with warnings and add CUDA helpers
j-stephan Oct 13, 2025
3814510
Correct hipStreamWaitEvent signature
j-stephan Oct 13, 2025
99d42a5
Enforce NVIDIA
j-stephan Oct 13, 2025
1cd6520
More CUDA fixes
j-stephan Oct 13, 2025
2a06455
Link to CUDA driver
j-stephan Oct 13, 2025
6fe57d3
Remove obsolete TIFF handling
j-stephan Oct 13, 2025
a0d6db0
Fix remaining warnings
j-stephan Oct 13, 2025
0767028
LibTIFF cleanup
j-stephan Oct 13, 2025
f1cfe9f
Add Windows note
j-stephan Oct 15, 2025
6 changes: 3 additions & 3 deletions .github/workflows/build_hip_documentation.yml
@@ -9,7 +9,7 @@ on:
pull_request:
branches: [ amd-staging, amd-mainline, release/** ]
paths:
- - 'HIP-Programmin-Guide/**'
+ - 'HIP-Doc/**'
- '.github/workflows/**'

env:
@@ -29,7 +29,7 @@ jobs:
run: |
apt-get update -qq &&
apt-get install -y build-essential g++ glslang-tools \
- python3 python3-pip libglfw3-dev libvulkan-dev locales wget
+ python3 python3-pip libglfw3-dev libvulkan-dev locales wget git libtiff-dev
python3 -m pip install --upgrade pip
python3 -m pip install cmake
- name: Install ROCm Dev
@@ -38,7 +38,7 @@ jobs:
wget https://repo.radeon.com/amdgpu-install/${{ env.ROCM_VERSION }}/ubuntu/jammy/amdgpu-install_${{ env.AMDGPU_INSTALLER_VERSION }}_all.deb
apt-get -y install ./amdgpu-install_${{ env.AMDGPU_INSTALLER_VERSION }}_all.deb &&
apt-get update -qq &&
- apt-get -y install rocm-dev rocm-llvm-dev
+ apt-get -y install rocm-dev rocm-llvm-dev hipfft-dev
echo "/opt/rocm/bin" >> $GITHUB_PATH
echo "ROCM_PATH=/opt/rocm" >> $GITHUB_ENV
echo "LD_LIBRARY_PATH=/opt/rocm/lib:${LD_LIBRARY_PATH}" >> $GITHUB_ENV
6 changes: 3 additions & 3 deletions .github/workflows/build_hip_documentation_cuda.yml
@@ -30,7 +30,7 @@ jobs:
run: |
apt-get update -qq &&
apt-get install -y build-essential g++ glslang-tools \
- python3 python3-pip locales wget
+ python3 python3-pip locales wget git libtiff-dev
python3 -m pip install --upgrade pip
python3 -m pip install cmake

@@ -40,13 +40,13 @@ jobs:
wget https://repo.radeon.com/amdgpu-install/${{ env.ROCM_VERSION }}/ubuntu/jammy/amdgpu-install_${{ env.AMDGPU_INSTALLER_VERSION }}_all.deb
apt-get -y install ./amdgpu-install_${{ env.AMDGPU_INSTALLER_VERSION }}_all.deb &&
apt-get update -qq &&
- apt-get install -y hip-dev hipify-clang
+ apt-get install -y hip-dev hipify-clang hipfft-dev

- name: Configure and Build
shell: bash
# The CMAKE_POLICY_VERSION_MINIMUM environment variable can be removed once the CMake updates from ROCm 7.0 are available
run: |
cd HIP-Doc && mkdir build && cd build
export CMAKE_POLICY_VERSION_MINIMUM="3.5"
- cmake -DGPU_RUNTIME=CUDA ..
+ cmake -DROCM_EXAMPLES_GPU_LANGUAGE=CUDA -DHIP_PLATFORM=nvidia ..
cmake --build . -j
1 change: 1 addition & 0 deletions HIP-Doc/CMakeLists.txt
@@ -27,6 +27,7 @@ include(CTest)

add_subdirectory(Programming-Guide)
add_subdirectory(Reference)
+ add_subdirectory(Tutorials)

include("${CMAKE_CURRENT_LIST_DIR}/../Common/HipPlatform.cmake")
select_gpu_language()
3 changes: 2 additions & 1 deletion HIP-Doc/Makefile
@@ -22,7 +22,8 @@

SUBDIRECTORIES := \
Programming-Guide \
- Reference
+ Reference \
+ Tutorials

all: $(SUBDIRECTORIES)

28 changes: 28 additions & 0 deletions HIP-Doc/Tutorials/CMakeLists.txt
@@ -0,0 +1,28 @@
# MIT License
#
# Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

cmake_minimum_required(VERSION 3.21 FATAL_ERROR)
project(HIP-Doc-Tutorials LANGUAGES CXX)

include(CTest)

add_subdirectory(graph_api)
48 changes: 48 additions & 0 deletions HIP-Doc/Tutorials/graph_api/CMakeLists.txt
@@ -0,0 +1,48 @@
# MIT License
#
# Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

cmake_minimum_required(VERSION 3.21)

project(hip_graph_api_tutorial LANGUAGES CXX)

include("${CMAKE_CURRENT_LIST_DIR}/../../../Common/HipPlatform.cmake")
select_gpu_language()
enable_language(${ROCM_EXAMPLES_GPU_LANGUAGE})
select_hip_platform()

if(CMAKE_SYSTEM_NAME MATCHES "Windows")
set(ROCM_ROOT
"$ENV{HIP_PATH}"
CACHE PATH
"Root directory of the ROCm installation"
)
else()
set(ROCM_ROOT
"/opt/rocm"
CACHE PATH
"Root directory of the ROCm installation"
)
endif()

list(APPEND CMAKE_PREFIX_PATH "${ROCM_ROOT}")

add_subdirectory(src)
151 changes: 151 additions & 0 deletions HIP-Doc/Tutorials/graph_api/README.md
@@ -0,0 +1,151 @@
# HIP-Doc HIP Graph API tutorial

## Description

This example demonstrates how to port an existing stream-based application to a graph-based application. For more
information on this topic, please refer to the
[HIP Documentation](https://rocm.docs.amd.com/projects/HIP/en/latest/tutorial/graph_api.html).

### Prerequisites

On Windows, this example can only be built with CMake using the Ninja generator. Visual Studio generators are not supported.

### Application flow

#### Stream variant

1. The number of asynchronous engines on the device is queried.
2. A phantom volume is created and forward projections are created from it.
3. The required data buffers, hipFFT plans, etc. are set up.
4. The forward projections are processed in batches. The batch size is equal to the number of asynchronous engines. Each
   batch item is processed in its own stream. Each projection is processed in the following way:
   1. All pixels in the projection are normalized, i.e. transformed from a 12-bit unsigned integer value to a 32-bit
      floating-point value in the range [0, 1].
   2. The projection is log transformed.
   3. The projection is weighted.
   4. The projection is transformed into Fourier space.
   5. The projection is filtered in Fourier space.
   6. The filtered projection is transformed into real space.
   7. The filtered projection is back-projected.
5. The resulting volume is saved to disk.
6. The data is cleaned up.
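
The batching in step 4 can be sketched on the host as follows. This is a minimal model under stated assumptions, not the tutorial's code: `BatchItem` and `make_batches` are hypothetical names, and the number of asynchronous engines is passed in as a plain number instead of being queried from the device.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record: which projection is handled by which stream slot.
struct BatchItem
{
    std::size_t projection; // index of the forward projection
    std::size_t stream;     // index of the stream within the batch
};

// Splits num_projections into consecutive batches of at most num_engines
// items. Each item of a batch gets its own stream index (0 .. batch size - 1).
std::vector<std::vector<BatchItem>> make_batches(std::size_t num_projections,
                                                 std::size_t num_engines)
{
    assert(num_engines > 0); // a batch size of zero would never make progress

    std::vector<std::vector<BatchItem>> batches;
    for (std::size_t first = 0; first < num_projections; first += num_engines)
    {
        // The last batch may be smaller than num_engines.
        std::size_t const last = std::min(first + num_engines, num_projections);

        std::vector<BatchItem> batch;
        for (std::size_t p = first; p < last; ++p)
        {
            batch.push_back({p, p - first});
        }
        batches.push_back(batch);
    }
    return batches;
}
```

For example, `make_batches(5, 2)` yields three batches holding two, two, and one projection; the stream index restarts at 0 in every batch, so streams can be reused across batches.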

#### Capturing variant

1. The number of asynchronous engines on the device is queried.
2. A phantom volume is created and forward projections are created from it.
3. The required data buffers, hipFFT plans, etc. are set up.
4. The forward projections are processed in batches. The batch size is equal to the number of asynchronous engines. Each
   batch item is processed in its own stream. All processing streams are captured into a HIP graph template.
   Each projection is processed in the following way:
   1. All pixels in the projection are normalized, i.e. transformed from a 12-bit unsigned integer value to a 32-bit
      floating-point value in the range [0, 1].
   2. The projection is log transformed.
   3. The projection is weighted.
   4. The projection is transformed into Fourier space.
   5. The projection is filtered in Fourier space.
   6. The filtered projection is transformed into real space.
   7. The filtered projection is back-projected.
5. The resulting graph template is instantiated and launched on a stream.
6. The resulting volume is saved to disk.
7. The data is cleaned up.
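
The key difference to the stream variant is that work issued during capture is only recorded and does not run until the graph is launched. The following host-only sketch models that record-then-replay behavior for the first two per-projection steps. It does not use the HIP API: `RecordedGraph` and `capture_preprocessing` are hypothetical names, the instantiation step is folded into `launch`, and the log transform is assumed to be a negative logarithm of the normalized value, which the text above does not specify.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Host-only stand-in for a graph template: an ordered list of recorded steps.
// This is NOT the HIP API, only a model of its record-then-replay behavior.
class RecordedGraph
{
public:
    // "Capture": remember the step instead of running it.
    void record(std::function<void()> step) { steps_.push_back(std::move(step)); }

    // "Launch": run every recorded step in recording order.
    void launch() const
    {
        for (auto const& step : steps_)
        {
            step();
        }
    }

    std::size_t size() const { return steps_.size(); }

private:
    std::vector<std::function<void()>> steps_;
};

// Records normalization and log transform for one projection buffer.
// The buffer is captured by reference and must outlive the returned graph.
RecordedGraph capture_preprocessing(std::vector<float>& pixels)
{
    RecordedGraph graph;

    // Step 1: map 12-bit values (0..4095) to the range [0, 1].
    graph.record([&pixels] {
        for (auto& p : pixels) { p /= 4095.0f; }
    });

    // Step 2 (assumed form): negative logarithm, clamped away from zero.
    graph.record([&pixels] {
        for (auto& p : pixels) { p = -std::log(std::fmax(p, 1.0e-6f)); }
    });

    return graph;
}
```

After `capture_preprocessing` returns, the buffer still holds its raw values; they change only when `launch` is called, and launching again replays the same recorded steps.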

#### Manual creation variant

1. The number of asynchronous engines on the device is queried.
2. A phantom volume is created and forward projections are created from it.
3. The required data buffers, hipFFT plans, etc. are set up.
4. The forward projections are processed in batches. The batch size is equal to the number of asynchronous engines. Each
   batch item represents a branch in the graph template, and each branch consists of the following steps where each step
   represents a node in the graph:
   1. All pixels in the projection are normalized, i.e. transformed from a 12-bit unsigned integer value to a 32-bit
      floating-point value in the range [0, 1].
   2. The projection is log transformed.
   3. The projection is weighted.
   4. The projection is transformed into Fourier space.
   5. The projection is filtered in Fourier space.
   6. The filtered projection is transformed into real space.
   7. The filtered projection is back-projected.
5. The resulting graph template is instantiated and launched on a stream.
6. The resulting volume is saved to disk.
7. The data is cleaned up.
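
The branch-and-node layout described in step 4 can be modeled on the host like this. The sketch only builds the topology and makes no HIP calls: `GraphModel`, `build_graph_model`, and the node labels are hypothetical, the branches are assumed to be independent of each other, and any additional memory-copy or memset nodes the real graph may contain are left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical host-side model of a graph template.
struct GraphModel
{
    std::vector<std::string> nodes;                         // node labels
    std::vector<std::pair<std::size_t, std::size_t>> edges; // (from, to) node indices
};

// Builds one linear chain of seven nodes per branch. An edge (a, b) means
// that node b may only run after node a has finished.
GraphModel build_graph_model(std::size_t num_branches)
{
    static constexpr std::array<char const*, 7> steps{
        "normalize", "log", "weight", "fft_forward",
        "filter", "fft_inverse", "backproject"};

    GraphModel graph;
    for (std::size_t b = 0; b < num_branches; ++b)
    {
        for (std::size_t s = 0; s < steps.size(); ++s)
        {
            graph.nodes.push_back(std::string{steps[s]} + "#" + std::to_string(b));
            std::size_t const current = graph.nodes.size() - 1;

            // Every step except the first depends on its predecessor
            // within the same branch; there are no edges across branches.
            if (s > 0)
            {
                graph.edges.emplace_back(current - 1, current);
            }
        }
    }
    return graph;
}
```

With three branches this gives 21 nodes and 18 edges. Because no edge connects two branches in this model, a runtime would be free to execute them concurrently; whether the actual tutorial adds cross-branch dependencies is not stated here.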

## Demonstrated API calls

### HIP Runtime

#### Device symbols

* `atomicAdd`
* `blockIdx`
* `blockDim`
* `cosf`
* `expf`
* `fabsf`
* `floorf`
* `fmaxf`
* `fminf`
* `logf`
* `max`
* `min`
* `roundf`
* `sinf`
* `sqrtf`
* `tex2D`
* `threadIdx`

#### Host symbols

* `hipCreateChannelDesc`
* `hipCreateTextureObject`
* `hipDestroyTextureObject`
* `hipDeviceSynchronize`
* `hipEventCreate`
* `hipEventDestroy`
* `hipEventRecord`
* `hipFree`
* `hipFreeHost`
* `hipGetDeviceProperties`
* `hipGetErrorString`
* `hipGraphAddKernelNode`
* `hipGraphAddMemcpyNode`
* `hipGraphAddMemsetNode`
* `hipGraphCreate`
* `hipGraphDebugDotPrint`
* `hipGraphDestroy`
* `hipGraphExecDestroy`
* `hipGraphExecUpdate`
* `hipGraphGetNodes`
* `hipGraphInstantiate`
* `hipGraphLaunch`
* `hipGraphNodeGetDependentNodes`
* `hipHostMalloc`
* `hipLaunchHostFunc`
* `hipMalloc`
* `hipMalloc3D`
* `hipMallocPitch`
* `hipMemcpy2DAsync`
* `hipMemcpy3DAsync`
* `hipMemGetInfo`
* `hipMemset2DAsync`
* `hipMemset3D`
* `hipStreamBeginCapture`
* `hipStreamBeginCaptureToGraph`
* `hipStreamCreate`
* `hipStreamEndCapture`
* `hipStreamDestroy`
* `hipStreamSynchronize`
* `hipStreamWaitEvent`
* `make_hipExtent`
* `make_hipPitchedPtr`
* `make_hipPos`

### hipFFT library symbols

* `hipfftCreate`
* `hipfftExecC2R`
* `hipfftExecR2C`
* `hipfftDestroy`
* `hipfftMakePlanMany`
* `hipfftPlan1d`
* `hipfftSetStream`
5 changes: 5 additions & 0 deletions HIP-Doc/Tutorials/graph_api/src/.gitignore
@@ -0,0 +1,5 @@
hip_graph_api_tutorial_streams
hip_graph_api_tutorial_graph_capture
hip_graph_api_tutorial_graph_creation
tiff.dll
libtiff.so*