[ci] Rework test_runs_on plumbing for release workflows. #1100

Merged
merged 11 commits into from
Aug 6, 2025
Changes from 5 commits
14 changes: 11 additions & 3 deletions .github/workflows/build_portable_linux_pytorch_wheels.yml
@@ -37,10 +37,18 @@ on:
workflow_dispatch:
inputs:
amdgpu_family:
required: true
type: string
type: choice
options:
- gfx110X-dgpu
- gfx1151
- gfx120X-all
- gfx94X-dcgpu
- gfx950-dcgpu
default: gfx94X-dcgpu
Comment on lines +39 to +45
Member Author

We could also

  1. keep this as a freeform string field
  2. add xfail families from https://github.com/ROCm/TheRock/blob/main/build_tools/github_actions/amdgpu_family_matrix.py

I think limiting to only families that we have here makes sense though:

PREFIXES = [
    "v2/gfx110X-dgpu",
    "v2/gfx1151",
    "v2/gfx120X-all",
    "v2/gfx94X-dcgpu",
    "v2/gfx950-dcgpu",
]

(but we will want to expand that list, and now we have multiple locations to update)

Member

Yeah, needing to update multiple locations is unfortunate, but I agree that it makes sense to limit it here instead of having a freeform string.
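
One way to limit the cost of keeping the family list in several places is a small consistency check that could run in CI. The sketch below is hypothetical and not part of this PR: it assumes the `PREFIXES` list quoted in the comment above, pulls the `options:` entries of an `amdgpu_family` input out of workflow text with plain string handling (a real check would more likely use a YAML parser and read the actual workflow files), and reports any mismatch. The function names and the inline workflow text are illustrative only.

```python
# Hypothetical drift check; function names and WORKFLOW_SNIPPET are illustrative,
# not taken from the repository.
PREFIXES = [
    "v2/gfx110X-dgpu",
    "v2/gfx1151",
    "v2/gfx120X-all",
    "v2/gfx94X-dcgpu",
    "v2/gfx950-dcgpu",
]

# Stand-in for the contents of one workflow file.
WORKFLOW_SNIPPET = """\
on:
  workflow_dispatch:
    inputs:
      amdgpu_family:
        type: choice
        options:
          - gfx110X-dgpu
          - gfx1151
          - gfx120X-all
          - gfx94X-dcgpu
          - gfx950-dcgpu
        default: gfx94X-dcgpu
"""


def choice_options(workflow_text, input_name):
    """Collect the `- item` lines under the first `options:` after `input_name:`.

    Simplification: this does not track indentation, so it assumes the named
    input really is a choice input with its own `options:` list.
    """
    options = []
    in_input = False
    in_options = False
    for line in workflow_text.splitlines():
        stripped = line.strip()
        if not in_input:
            in_input = stripped == f"{input_name}:"
            continue
        if not in_options:
            in_options = stripped == "options:"
            continue
        if stripped.startswith("- "):
            options.append(stripped[2:].strip())
        else:
            break  # the first non-item line ends the options list
    return options


def family_drift(prefixes, options):
    """Return (missing_from_workflow, missing_from_prefixes) as sorted lists."""
    families = {prefix.split("/", 1)[1] for prefix in prefixes}
    return sorted(families - set(options)), sorted(set(options) - families)


if __name__ == "__main__":
    opts = choice_options(WORKFLOW_SNIPPET, "amdgpu_family")
    missing_in_workflow, missing_in_prefixes = family_drift(PREFIXES, opts)
    if missing_in_workflow or missing_in_prefixes:
        raise SystemExit(
            f"family lists differ: workflow lacks {missing_in_workflow}, "
            f"PREFIXES lacks {missing_in_prefixes}"
        )
```

A check like this does not remove the duplication, but it would turn a forgotten update into a failing job rather than a silently missing choice.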

test_runs_on:
description: The runner label to use for tests. Leave this empty to skip testing.
type: string
default: linux-mi325-1gpu-ossci-rocm
python_version:
required: true
type: string
@@ -152,7 +160,7 @@ jobs:
--index-url "${{ inputs.cloudfront_url }}/${{ inputs.amdgpu_family }}/" \
--clean \
--output-dir ${{ env.PACKAGE_DIST_DIR }} ${{ env.optional_build_prod_arguments }}
python ./build_tools/github_actions/write_torch_version.py
python ./build_tools/github_actions/write_torch_versions.py --dist-dir ${{ env.PACKAGE_DIST_DIR }}

- name: Configure AWS Credentials
if: always()
16 changes: 12 additions & 4 deletions .github/workflows/build_windows_pytorch_wheels.yml
@@ -29,14 +29,22 @@ on:
workflow_dispatch:
inputs:
amdgpu_family:
required: true
type: string
type: choice
options:
- gfx110X-dgpu
- gfx1151
- gfx120X-all
- gfx94X-dcgpu
- gfx950-dcgpu
default: gfx1151
test_runs_on:
description: The runner label to use for tests. Leave this empty to skip testing.
type: string
default: windows-strix-halo-gpu-rocm
python_version:
required: true
type: string
default:
default: "3.12"
release_type:
description: The type of release to build ("nightly", or "dev")
type: string
@@ -137,7 +145,7 @@ jobs:
--clean ^
--output-dir ${{ env.PACKAGE_DIST_DIR }} ^
${{ env.optional_build_prod_arguments }}
python ./build_tools/github_actions/write_torch_version.py
python ./build_tools/github_actions/write_torch_versions.py --dist-dir ${{ env.PACKAGE_DIST_DIR }}

- name: Configure AWS Credentials
if: always()
12 changes: 10 additions & 2 deletions .github/workflows/release_portable_linux_pytorch_wheels.yml
@@ -26,10 +26,18 @@ on:
workflow_dispatch:
inputs:
amdgpu_family:
required: true
type: string
type: choice
options:
- gfx110X-dgpu
- gfx1151
- gfx120X-all
- gfx94X-dcgpu
- gfx950-dcgpu
default: gfx94X-dcgpu
test_runs_on:
description: The runner label to use for tests. Leave this empty to skip testing.
type: string
default: linux-mi325-1gpu-ossci-rocm
Member

I am a bit concerned that having a default will lead to someone building for one family from the choices but forgetting to set the matching label here, and then testing on the wrong family. Not having a default would mean having to know or look up one more label. I am not blocking on this, though I still think it would be nice not to need to know the labels at all.

If testing can be skipped here, how does this collide with #1072?

Member Author

Okay, I can try restoring the script we had and fixing it. We're still not running pytorch tests on Windows until this lands though, since gfx1151 is not triggering test jobs :/

Member Author

Pushed some changes. I'll test them then re-request review once they are ready.

Member

Thanks!
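
The concern in this thread is that the chosen family and the runner label can disagree. A minimal sketch of deriving the label from the family, so that whoever triggers the workflow does not need to know the labels, might look like the following. The two label values are taken from the workflow defaults in this diff. Everything else is an assumption and is not the script that was restored in the PR: the function name, the `platform` keys, and the choice to return an empty string (which the input description treats as "skip testing") for combinations with no known runner.

```python
# Hypothetical lookup; only the two label strings come from the defaults in this diff.
TEST_RUNNERS = {
    ("linux", "gfx94X-dcgpu"): "linux-mi325-1gpu-ossci-rocm",
    ("windows", "gfx1151"): "windows-strix-halo-gpu-rocm",
}


def select_test_runner(platform, amdgpu_family):
    """Return the runner label for a platform/family pair, or "" to skip testing."""
    return TEST_RUNNERS.get((platform, amdgpu_family), "")


if __name__ == "__main__":
    for platform, family in [("linux", "gfx94X-dcgpu"), ("windows", "gfx950-dcgpu")]:
        label = select_test_runner(platform, family)
        print(f"{platform}/{family}: {label or '(skip testing)'}")
```

Keeping the mapping in one table means a family without a test runner skips testing instead of running on hardware from a different family.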

release_type:
description: The type of release to build ("nightly", or "dev")
type: string
12 changes: 10 additions & 2 deletions .github/workflows/release_windows_pytorch_wheels.yml
@@ -26,10 +26,18 @@ on:
workflow_dispatch:
inputs:
amdgpu_family:
required: true
type: string
type: choice
options:
- gfx110X-dgpu
- gfx1151
- gfx120X-all
- gfx94X-dcgpu
- gfx950-dcgpu
default: gfx1151
test_runs_on:
description: The runner label to use for tests. Leave this empty to skip testing.
type: string
default: windows-strix-halo-gpu-rocm
release_type:
description: The type of release to build ("nightly", or "dev")
type: string
19 changes: 0 additions & 19 deletions build_tools/github_actions/write_torch_version.py

This file was deleted.
