Commit 206e824

internal/ci: drop "evict caches" nightly cron job
Instead, run the trybot workflow every night via a cron schedule,
and have the workflow skip the use of caching in that scenario.

There are two noteworthy changes compared to the old mechanism:

1) We no longer fully evict (delete) all caches on a nightly basis.
   This is no longer necessary on Namespace, as the performance of the
   cache doesn't get noticeably affected by the size anymore,
   and Namespace already deal with trimming and evicting caches.
   As we stop evicting caches, we don't need to repopulate them either.

2) Because cron schedules on GitHub Actions only trigger on the default
   branch by design, we no longer run the nightly trybot jobs on other
   protected branches such as release branches.
   This seems fine; we mainly care about catching test flakes on the
   development branch. In the past two years, we have only used the
   release branches to backport fixes, and we have never needed to fix
   test bugs in them.

The trybot workflow's workflow_dispatch trigger is now unnecessary,
as it was only being used directly by the evict_caches nightly workflow.
However, leave it around because it is a generally useful way to
trigger a run without needing to do a git push. Document it as such.

While here, update the default version used by the "early checks" step.

Signed-off-by: Daniel Martí <[email protected]>
Change-Id: I38dbbcc9d928ea7267edf81bbdbe69929885ab92
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/1217930
Reviewed-by: Paul Jolly <[email protected]>
TryBot-Result: CUEcueckoo <[email protected]>
Unity-Result: CUE porcuepine <[email protected]>
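As a rough illustration of the manual trigger described in the commit message, a workflow_dispatch run can be requested through the GitHub REST API's workflow dispatches endpoint, the same endpoint the deleted evict_caches job called. The sketch below only prints the request rather than sending it. The branch name `master`, the `Authorization: Bearer` header with a `GITHUB_TOKEN` variable, and the helper name `dispatch_request` are assumptions made for illustration; the repository path and workflow file name are taken from this commit page.

```shell
#!/usr/bin/env bash
set -euo pipefail

# dispatch_request is a hypothetical helper: it prints the curl command for a
# workflow_dispatch request instead of running it, so the sketch can be
# executed without credentials or network access.
dispatch_request() {
	local repo="$1" workflow="$2" ref="$3"
	local url="https://api.github.com/repos/${repo}/actions/workflows/${workflow}/dispatches"
	# $GITHUB_TOKEN is printed literally; it is an assumed variable name that
	# would be expanded only when the printed command is actually run.
	printf 'curl --fail-with-body -X POST -H "Authorization: Bearer $GITHUB_TOKEN" %s -d %s\n' \
		"$url" "'{\"ref\":\"${ref}\"}'"
}

# Assumed example: dispatch the trybot workflow on a branch named "master".
dispatch_request cue-lang/cue trybot.yaml master
```

Printing the command keeps the sketch free of side effects; running the printed command would require a token that is allowed to trigger workflows on the repository.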
1 parent a6df86a commit 206e824

5 files changed, with 11 additions and 253 deletions

.github/workflows/evict_caches.yaml

Lines changed: 0 additions & 120 deletions
This file was deleted.

.github/workflows/trybot.yaml

Lines changed: 3 additions & 1 deletion
@@ -2,6 +2,8 @@
 
 name: TryBot
 "on":
+  schedule:
+    - cron: 0 2 * * *
   push:
     branches:
       - ci/test
@@ -82,7 +84,7 @@ jobs:
 
           # Dump env for good measure
           go env
-      - if: matrix.runner != 'namespace-profile-windows-2022-amd64-8x16'
+      - if: github.event_name != 'schedule' && matrix.runner != 'namespace-profile-windows-2022-amd64-8x16'
         uses: namespacelabs/nscloud-cache-action@v1
         with:
           cache: go
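
The updated `if:` condition can be read as a simple two-part check: the cache step runs only when the run was not started by the cron schedule and the runner is not the Windows one. The following bash sketch mirrors that expression to show which combinations skip the cache. It is not part of the workflow; `use_cache` is a hypothetical helper, and `linux-runner-example` is a made-up runner name used only for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# The Windows runner name is copied from the diff.
windows_runner='namespace-profile-windows-2022-amd64-8x16'

# use_cache is a hypothetical helper mirroring the workflow expression:
#   github.event_name != 'schedule' && matrix.runner != '<windows runner>'
# It succeeds (exit status 0) when the cache step would run.
use_cache() {
	local event_name="$1" runner="$2"
	[[ "$event_name" != "schedule" && "$runner" != "$windows_runner" ]]
}

# Print the decision for a few event/runner combinations.
for event in push workflow_dispatch schedule; do
	for runner in linux-runner-example "$windows_runner"; do
		if use_cache "$event" "$runner"; then
			echo "$event on $runner: cache step runs"
		else
			echo "$event on $runner: cache step skipped"
		fi
	done
done
```

Under this reading, every nightly `schedule` run skips the cache on all runners, while `push` and `workflow_dispatch` runs keep using it everywhere except on Windows.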

internal/ci/base/gerrithub.cue

Lines changed: 3 additions & 128 deletions
@@ -5,17 +5,19 @@ package base
 import (
 	"encoding/json"
 	"strings"
+
 	"cue.dev/x/githubactions"
 )
 
 // trybotWorkflows is a template for trybot-based repos
 trybotWorkflows: {
 	(trybot.key): githubactions.#Workflow & {
+		// Triggering a trybot job via a workflow_dispatch can be a useful way
+		// to manually or automatically start a job without needing to git push.
 		on: workflow_dispatch: {}
 	}
 	"\(trybot.key)_dispatch": trybotDispatchWorkflow
 	"push_tip_to_\(trybot.key)": pushTipToTrybotWorkflow
-	"evict_caches": evictCaches
 }
 
 #dispatch: {
@@ -204,133 +206,6 @@ pushTipToTrybotWorkflow: bashWorkflow & {
 
 }
 
-// evictCaches removes "old" GitHub actions caches from the main repo and the
-// accompanying trybot The job is only run in the main repo, because
-// that is the only place where the credentials exist.
-//
-// The GitHub actions caches in the main and trybot repos can get large. So
-// large in fact we got the following warning from GitHub:
-//
-// "Approaching total cache storage limit (34.5 GB of 10 GB Used)"
-//
-// Yes, you did read that right.
-//
-// Not only does this have the effect of causing us to breach "limits" it also
-// means that we can't be sure that individual caches are not bloated.
-//
-// Fix that by purging the actions caches on a daily basis at 0200, followed 15
-// mins later by a re-run of the tip trybots to repopulate the caches so they
-// are warm and minimal.
-//
-// In testing with @mvdan, this resulted in cache sizes for Linux dropping from
-// ~1GB to ~125MB. This is a considerable saving.
-//
-// Note this currently removes all cache entries, regardless of whether they
-// are go-related or not. We should revisit this later.
-evictCaches: bashWorkflow & {
-	name: "Evict caches"
-
-	on: {
-		schedule: [
-			{cron: "0 2 * * *"},
-		]
-	}
-
-	jobs: {
-		test: {
-			// We only want to run this in the main repo
-			if: "${{github.repository == '\(githubRepositoryPath)'}}"
-			"runs-on": linuxMachine
-			steps: [
-				for v in checkoutCode {v},
-
-				// TODO(mvdan): remove once we've fully moved to Namespace runners.
-				githubactions.#Step & {
-					name: "Delete caches"
-					run: """
-						echo ${{ secrets.\(botGitHubUserTokenSecretsKey) }} | gh auth login --with-token
-						for i in \(githubRepositoryURL) \(trybotRepositoryURL)
-						do
-							echo "Evicting caches for $i"
-							gh cache delete --repo $i --all --succeed-on-no-caches
-						done
-						"""
-				},
-
-				githubactions.#Step & {
-					name: "Trigger workflow runs to repopulate caches"
-					let branchPatterns = strings.Join(protectedBranchPatterns, " ")
-
-					run: """
-						# Prepare git for pushes to trybot repo. Note
-						# because we have already checked out code we don't
-						# need origin. Fetch origin default branch for later use
-						git config user.name \(botGitHubUser)
-						git config user.email \(botGitHubUserEmail)
-						git config http.https://github.com/.extraheader "AUTHORIZATION: basic $(echo -n \(botGitHubUser):${{ secrets.\(botGitHubUserTokenSecretsKey) }} | base64)"
-						git remote add trybot \(trybotRepositoryURL)
-
-						# Now trigger the most recent workflow run on each of the default branches.
-						# We do this by listing all the branches on the main repo and finding those
-						# which match the protected branch patterns (globs).
-						for j in $(\(curlGitHubAPI) -f https://api.github.com/repos/\(githubRepositoryPath)/branches | jq -r '.[] | .name')
-						do
-							for i in \(branchPatterns)
-							do
-								if [[ "$j" != $i ]]; then
-									continue
-								fi
-
-								echo Branch: $j
-								sha=$(\(curlGitHubAPI) "https://api.github.com/repos/\(githubRepositoryPath)/commits/$j" | jq -r '.sha')
-								echo Latest commit: $sha
-
-								echo "Trigger workflow on \(githubRepositoryPath)"
-								\(curlGitHubAPI) --fail-with-body -X POST https://api.github.com/repos/\(githubRepositoryPath)/actions/workflows/\(trybot.key+workflowFileExtension)/dispatches -d "{\\"ref\\":\\"$j\\"}"
-
-								# Ensure that the trybot repo has the latest commit for
-								# this branch. If the force-push results in a commit
-								# being pushed, that will trigger the trybot workflows
-								# so we don't need to do anything, otherwise we need to
-								# trigger the most recent commit on that branch
-								git remote -v
-								git fetch origin refs/heads/$j
-								git log -1 FETCH_HEAD
-
-								success=false
-								for try in {1..20}; do
-									echo "Push to trybot try $try"
-									exitCode=0; push="$(git push -f trybot FETCH_HEAD:$j 2>&1)" || exitCode=$?
-									echo "$push"
-									if [[ $exitCode -eq 0 ]]; then
-										success=true
-										break
-									fi
-									sleep 1
-								done
-								if ! $success; then
-									echo "Giving up"
-									exit 1
-								fi
-
-								if echo "$push" | grep up-to-date
-								then
-									# We are up-to-date, i.e. the push did nothing, hence we need to trigger a workflow_dispatch
-									# in the trybot repo.
-									echo "Trigger workflow on \(trybotRepositoryPath)"
-									\(curlGitHubAPI) --fail-with-body -X POST https://api.github.com/repos/\(trybotRepositoryPath)/actions/workflows/\(trybot.key+workflowFileExtension)/dispatches -d "{\\"ref\\":\\"$j\\"}"
-								else
-									echo "Force-push to \(trybotRepositoryPath) did work; nothing to do"
-								fi
-							done
-						done
-						"""
-				},
-			]
-		}
-	}
-}
-
 writeNetrcFile: githubactions.#Step & {
 	name: "Write netrc file for \(botGerritHubUser) Gerrithub"
 	run: """

internal/ci/base/github.cue

Lines changed: 1 addition & 1 deletion
@@ -145,7 +145,7 @@ checkoutCode: {
 
 earlyChecks: githubactions.#Step & {
 	name: "Early git and code sanity checks"
-	run: *"go run cuelang.org/go/internal/ci/checks@v0.11.0-0.dev.0.20240903133435-46fb300df650" | string
+	run: *"go run cuelang.org/go/internal/ci/checks@v0.13.2" | string
 }
 
 curlGitHubAPI: {

internal/ci/github/trybot.cue

Lines changed: 4 additions & 3 deletions
@@ -24,6 +24,7 @@ workflows: trybot: _repo.bashWorkflow & {
 	name: _repo.trybot.name
 
 	on: {
+		schedule: [{cron: "0 2 * * *"}] // Run nightly at 2am UTC without a cache to catch flakes
 		push: {
 			branches: list.Concat([[_repo.testDefaultBranch], _repo.protectedBranchPatterns]) // do not run PR branches
 			"tags-ignore": [_repo.releaseTagPattern]
@@ -55,10 +56,10 @@ workflows: trybot: _repo.bashWorkflow & {
 
 				for v in installGo {v},
 
-				// cachePre must come after installing Node and Go, because the cache locations
-				// are established by running each tool.
 				for v in _repo.setupGoActionsCaches {v & {
-					if: string | *"\(matrixRunner) != '\(_repo.windowsMachine)'" // TODO(mvdan): remove the condition once Windows supports caching
+					// We skip the cache entirely on the nightly runs, to catch flakes.
+					// TODO(mvdan): remove the windowsMachine condition once Windows supports caching
+					if: string | *"github.event_name != 'schedule' && \(matrixRunner) != '\(_repo.windowsMachine)'"
 				}},
 
 				_repo.loginCentralRegistry,
