Conversation

Tombana
Collaborator

@Tombana Tombana commented Apr 21, 2022

What do these changes do?

This adds a bazel cache for release jobs:

  • if they time out, we can simply restart them and they should finish within the time limit
  • if they fail for any other reason, we can try fixes without having to wait six hours
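The caching itself can be wired up along these lines (a minimal sketch only; the bucket name and credentials path are placeholders, not the values used in this PR):

```
# .bazelrc fragment (sketch): use a GCS bucket as a Bazel remote cache.
# The bucket name and key path below are placeholders.
build --remote_cache=https://storage.googleapis.com/my-bazel-cache-bucket
build --google_credentials=/path/to/service-account-key.json
# Upload local build results so later (restarted) jobs can reuse them.
build --remote_upload_local_results=true
```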

How Has This Been Tested?

It is running here: https://github.com/larq/compute-engine/actions/runs/2200323727

@Tombana Tombana requested a review from a team April 21, 2022 07:22
@Tombana
Collaborator Author

Tombana commented Apr 21, 2022

New run with different folders for each Python version, and with the credentials passed into the manylinux container: https://github.com/larq/compute-engine/actions/runs/2200491384
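Passing the credentials into the manylinux container can be done roughly like this (a sketch only; the step name, image, script, and mount paths are assumptions, not the exact workflow in this PR):

```yaml
# Sketch of a workflow step that mounts the credentials file into the
# manylinux build container; all names and paths here are placeholders.
- name: Build manylinux wheels
  run: |
    docker run \
      -v "$GOOGLE_APPLICATION_CREDENTIALS:/tmp/gcloud.json:ro" \
      -e GOOGLE_APPLICATION_CREDENTIALS=/tmp/gcloud.json \
      quay.io/pypa/manylinux2014_x86_64 \
      bash -c "./build-wheels.sh"
```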

Member

@lgeiger lgeiger left a comment


Awesome! I just have one minor comment, other than that this looks great!

- name: Build macOS wheels
  run: |
    python --version
    python -m pip install delocate wheel setuptools numpy six --no-cache-dir

    ./configure.py

    if [[ -n $GOOGLE_APPLICATION_CREDENTIALS ]]; then
Member


For the release workflow this should always exist, but I agree it's probably good to keep this check just to be safe.
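The guard being discussed can be sketched like this (illustrative only; the variable handling and the way the flag is collected are assumptions, not the exact script in this PR):

```shell
# Sketch of the existence check: only pass the credentials flag to
# Bazel when the variable is actually set, so local/forked builds
# without credentials still work.
extra_args=""
if [[ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]]; then
  extra_args="--google_credentials=${GOOGLE_APPLICATION_CREDENTIALS}"
fi
echo "extra bazel args: ${extra_args}"
```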

@Tombana
Collaborator Author

Tombana commented Apr 21, 2022

On the previous run, only the Python 3.7 job failed, with an internal gcc error.
I've applied Lukas' suggestion of mounting the gcloud credentials directly, and I've kicked off another run: https://github.com/larq/compute-engine/actions/runs/2202486415
If the cache is working as expected these should now all finish quickly.

@Tombana
Collaborator Author

Tombana commented Apr 21, 2022

The cache is working, but not completely: the builds take about 25 minutes, so something is being reused, but some parts are still being rebuilt.
(Some of the failing workflows are unrelated; they were caused by a failed download.)

@CNugteren
Contributor

25 minutes should be good enough. It saves a lot of time, and it makes the Windows builds pass. However, I still see some builds ongoing at the moment, so some might take longer?

@Tombana
Collaborator Author

Tombana commented Apr 21, 2022

> 25 minutes should be good enough. It saves a lot of time, and it makes the Windows builds pass. However, I still see some builds ongoing at the moment, so some might take longer?

Indeed. I don't really know why the cache is only partially working, but I think this PR can be merged nonetheless, as it already helps.

@Tombana Tombana merged commit db5701f into main Apr 21, 2022
@Tombana Tombana deleted the release_bazel_cache branch April 21, 2022 15:19
@CNugteren CNugteren added the internal-improvement Internal Improvements and Maintenance label Apr 25, 2022