Skip to content

[cache] Fix descriptor duplication in getAvailableBlobs #6092

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Fricounet
Copy link

The walkBlob function was called with a function having a side effect: appending a descriptor to a list. However, the function could be called multiple times for the same descriptor (once in walkBlob and once in walkBlobVariantsOnly in that case) leading to duplicate entries in the final list. Instead of the list, we can use a map to store the descriptors. With the digest as key, it will ensure we don't store duplicates.

This fixes one of the problem encountered in #6088

cc @ktock

@Fricounet Fricounet force-pushed the fricounet/upstream/fix-digest-duplication branch from 92d0f5a to 34ade39 Compare July 17, 2025 08:59
Copy link
Collaborator

@ktock ktock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.
Please squash the commits.

The walkBlob function was called with a function having a side effect: appending a descriptor to a list. However, the function could be called multiple times for the same descriptor (once in `walkBlob` and once in `walkBlobVariantsOnly` in that case) leading to duplicate entries in the final list.
Instead of the list, we can use a map to store the descriptors. With the digest as key, it will ensure we don't store duplicates.

Signed-off-by: Baptiste Girard-Carrabin <[email protected]>
@Fricounet Fricounet force-pushed the fricounet/upstream/fix-digest-duplication branch from 30b33bc to 0858b9e Compare July 17, 2025 11:37
@Fricounet
Copy link
Author

@ktock done, thanks for the review 🙇

@Fricounet
Copy link
Author

@ktock any idea why the tests would be failing? I'm don't really understand the errors + it was not failing before I sqashed 🤔

@ktock
Copy link
Collaborator

ktock commented Jul 18, 2025

TestGetRemotes failure doesn't look like caused by this change (posted a possible fix #6099). I'm rerunning the test.

https://github.com/moby/buildkit/actions/runs/16344069969/job/46175402958?pr=6092#step:8:4328

=== FAIL: cache TestGetRemotes (1.05s)
    manager_test.go:1783: 
        	Error Trace:	/src/cache/manager_test.go:1783
        	            				/src/vendor/golang.org/x/sync/errgroup/errgroup.go:130
        	            				/usr/local/go/src/runtime/asm_amd64.s:1700
        	Error:      	An error is expected but got nil.
        	Test:       	TestGetRemotes

@Fricounet
Copy link
Author

Another failure but it also looks unrelated

Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I look at the implementation of walkBlob it is calling walkBlobVariantsOnly with visited map to avoid duplications. Isn't that one working correctly, and if it is not then probably should fix it there.

@Fricounet
Copy link
Author

@tonistiigi it doesn't work as expected because:
in a case with 2 compressions (gzip and zstd) where the gzip image was built first, then the zstd one and you decide to rebuild + export cache for the gzip image:

  • f is called once in walkBlob for the gzip layer
  • on the first iteration of walkBlobVariantOnly, f is called once for the zstd layer here
  • then walkBlobVariantOnly is called again with the zstd layer (which will have the gzip layer in its content store info because it was built after the initial gzip image) which means that f will be called a third time for the gzip layer there because the visited check only occur after

I thought about fixing by moving the visited check to about here however this would mean that walkBlobVariantOnly never calls f for the inital blob which is a breaking change and It means we need to consider the consequences for all the cases where f is different when calling walkBlobVariantOnly and if it has, or not, side effects.
So I felt that just making sure that there can't be duplicates in the end result of the function used to call walkBlob (by using a map) is safer as it means we don't have to care about how many times the function gets called.
What do you think?

@Fricounet
Copy link
Author

@tonistiigi do you mind giving me your updated opinion based on my message above?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants