
Conversation

@jameslamb
Member

@jameslamb jameslamb commented May 27, 2025

Description

Contributes to rapidsai/build-planning#181

  • removes all uploads of conda packages and wheels to downloads.rapids.ai

Contributes to rapidsai/build-planning#135

  • adds shellcheck checks to pre-commit configuration
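
For reference, the kind of check this adds looks roughly like the following (a sketch only; the hook id and the set of scripts checked are assumptions, not taken from this PR's configuration):

# run shellcheck directly over the repository's shell scripts
shellcheck ci/*.sh

# or run it via pre-commit once the hook is configured
# (assumes the hook is registered under the id "shellcheck")
pre-commit run shellcheck --all-files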

Also proposes changes to get around the cold-start problem we have at the beginning of every release, where test-conda-nightly-env cannot run because no rapids packages have been published yet.

  • makes test-conda-nightly-env dependent on the just-built-in-CI rapids packages, instead of installing them from rapidsai-nightly

Notes for Reviewers

How I identified changes

Looked for uses of the relevant gha-tools utilities, as well as documentation about downloads.rapids.ai, being on the NVIDIA VPN, using S3, etc., like this:

git grep -i -E 's3|upload|downloads\.rapids|vpn'

How I tested this

See "How I tested this" on rapidsai/shared-workflows#364

@copy-pr-bot

This comment was marked as resolved.

@jameslamb
Member Author

/ok to test

@jameslamb
Member Author

CUDA 12 builds are all failing with some variation of this:

ExplainedDependencyNeedsBuildingError: Unsatisfiable dependencies for platform linux-64: {MatchSpec("python_abi=3.10[build=*_cp310]"), MatchSpec("rapids-xgboost==25.08.00a=cuda12_py310_250527_gd1ff3c5_13")}
...
├─ rapids-xgboost =25.8 * is not installable because it requires
│  ├─ libxgboost ==2.1.4 rapidsai_h* but there are no viable options
│  │  ├─ libxgboost 2.1.4 would require

(build link)

CUDA 11 builds have been running for 2+ hours (!!!), stuck on what looks like a conda solve setting up the host environment? Final lines of logs:

Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Reloading output folder: ...working... done

(build link)

The latest nightlies failed in the same way... CUDA 11 jobs timed out after 6 hours, and CUDA 12 jobs failed complaining that rapids-xgboost could not be found.

@jameslamb
Member Author

/ok to test

@jameslamb
Member Author

The test-conda-nightly-env job skips the rapids package for check purposes. However, it requires at least one copy of that package to exist on the rapidsai-nightly channel.

We're at the beginning of the 25.08 release, so there are no 25.08 packages yet, and that check fails:

"error": "PackagesNotFoundError: The following packages are not available from current channels:\n\n - rapids=25.8\n\nCurrent channels:\n\n - https://conda-cache.local.gha-runners.nvidia.com/rapidsai-nightly\n - https://conda-cache.local.gha-runners.nvidia.com/conda-forge\n - https://conda-cache.local.gha-runners.nvidia.com/nvidia\n - https://conda-cache.local.gha-runners.nvidia.com/rapidsai\n - https://conda-cache.local.gha-runners.nvidia.com/dask/label/dev\n\nTo search for alternate channels that may provide the conda package you're\nlooking for, navigate to\n\n https://anaconda.org\n\nand use the search bar at the top of the page.\n",

(build link)
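
To make the failure concrete: the check boils down to a conda solve roughly like this (a sketch inferred from the error above, not the actual job script; the real job resolves through the proxied conda-cache channels):

# dry-run solve of the nightly environment; at the start of a new release
# cycle there is no rapids=25.08 on rapidsai-nightly yet, so this fails
# with the PackagesNotFoundError shown above
conda create --name nightly-env-check --dry-run \
  --channel rapidsai-nightly \
  --channel conda-forge \
  --channel nvidia \
  rapids=25.08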

That could be resolved by admin-merging this PR, but we could also avoid that by the following set of changes:

  • making test-conda-nightly-env run AFTER the builds
  • installing the just-built-in-CI packages

I'm going to try that.
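
Roughly what I have in mind (a sketch only; the artifact-download step, paths, and package spec here are placeholders, not the actual workflow code):

# 1. let the build jobs finish, then download the conda packages they
#    produced in this same CI run into a local directory (mechanism omitted)
CHANNEL_DIR="${PWD}/ci-built-packages"

# 2. turn that directory into a conda channel (requires conda-build /
#    conda-index) and solve against it before the public channels
conda index "${CHANNEL_DIR}"
conda create --name nightly-env-check --dry-run \
  --channel "file://${CHANNEL_DIR}" \
  --channel conda-forge \
  rapids=25.08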

@jameslamb
Member Author

/ok to test

@jameslamb
Member Author

Looks like dropping CUDA 11 in rapidsai/shared-workflows#371 was the last piece needed for this!

Putting it up for review.

@jameslamb jameslamb changed the title from "WIP: stop uploading to downloads.rapids.ai, add shellcheck" to "stop uploading to downloads.rapids.ai, add shellcheck" May 30, 2025
@jameslamb jameslamb requested a review from bdice May 30, 2025 14:18
@jameslamb jameslamb marked this pull request as ready for review May 30, 2025 14:18
@jameslamb jameslamb requested a review from a team as a code owner May 30, 2025 14:18
Contributor

@bdice bdice left a comment

Part of me wants to force the use of rapidsai-nightly because otherwise the commands in our installation docs can fail without any signal in CI.

The rapids metapackage is special because it doesn’t get uploaded with successful nightly builds. It is only built manually. In other words, the cold start problem where tests fail for a new release is a feature (forces us to be aware of the missing package), not a bug.

@jameslamb
Member Author

Over in the rapidsai/docker repo, we do have installation of the rapids metapackage:

https://github.com/rapidsai/docker/blob/1326f1c60e4bbcb30da8857a5eba69fb05a67366/Dockerfile#L76-L77

So nightly runs there do give us some signal about dependency issues with the metapackage.

But that's a bit indirect, to be fair. Thanks for the explanation, I'll revert those changes here and we can ask for this to be admin-merged.

@jameslamb jameslamb requested a review from bdice May 30, 2025 14:41
@jameslamb
Member Author

I've reverted the testing changes here. Once this is passing all of CI besides test-conda-nightly-env and is approved, I'll go ask for it to be admin-merged.

@jameslamb
Member Author

/merge

@rapids-bot rapids-bot bot merged commit e12ef0e into rapidsai:branch-25.08 May 30, 2025
23 checks passed
@jameslamb jameslamb deleted the ci/stop-s3-uploads branch May 30, 2025 18:53