Expand check for libraries provided by the host #2077
In addition to not generating runtime dependencies or provides for libcuda.so.1 we also do not want to create them for libnvcuvid.so.1 and libnvidia-encode.so.1.
Ah, the reason we want to filter out these libraries is different than with `libcuda.so.1`. The reason we want to filter these additional libraries has nothing to do with the whole host passthrough business. It is because there are multiple packages that provide the same soname (one in each -cuda-X variant), and every package that depends on `libnvcuvid.so.1` should really just get 1 specific one.
So if we build a binary `cuda-app-cuda-12.6` that links against `libnvcuvid.so.1`, we want it to always get `nvidia-libnvcuvid-12.6`. Not `nvidia-libnvcuvid-11.8`, and not `libnvcuvid-12.9`. In theory, we could add an additional explicit dependency on `nvidia-libnvcuvid-12.6` and the resolver would be able to figure out that the only way to resolve both the sca-added `so:libnvcuvid.so.1` and `nvidia-libnvcuvid-12.6` is to use `nvidia-libnvcuvid-12.6` for both. But `apko`'s resolver doesn't work that way. Instead, `apko`'s resolver will find some package that satisfies `so:libnvcuvid.so.1`. Once it finds one, that's it - it's resolved. Let's say it fixated on `nvidia-libnvcuvid-11.8`. Later, it will try to resolve `nvidia-libnvcuvid-12.6`. Well, that won't work. `nvidia-libnvcuvid-12.6` conflicts with `nvidia-libnvcuvid-11.8`. `apko` will then just refuse to continue. Slumber party over, I'm calling mom to pick me up.
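To make the failure mode above concrete, here is a toy model of a greedy, first-match resolver. It is not `apko`'s code: the `pkg` type, `resolve`, `toyIndex`, and the rule that two packages conflict when they provide the same soname are all assumptions made for illustration.

```go
package main

import "fmt"

// pkg is a toy package record: a name plus the virtual names it provides.
type pkg struct {
	name     string
	provides []string
}

// satisfies reports whether p can fulfil the requested name.
func (p pkg) satisfies(want string) bool {
	if p.name == want {
		return true
	}
	for _, v := range p.provides {
		if v == want {
			return true
		}
	}
	return false
}

// conflict is a toy rule (an assumption): two different packages conflict
// when they provide the same virtual name.
func conflict(a, b pkg) bool {
	if a.name == b.name {
		return false
	}
	for _, v := range a.provides {
		if b.satisfies(v) {
			return true
		}
	}
	return false
}

// toyIndex is a hypothetical repository index; order matters for first-match.
var toyIndex = []pkg{
	{name: "nvidia-libnvcuvid-11.8", provides: []string{"so:libnvcuvid.so.1"}},
	{name: "nvidia-libnvcuvid-12.6", provides: []string{"so:libnvcuvid.so.1"}},
}

// resolve takes the first package that satisfies each requirement and never
// revisits an earlier choice, which is the behaviour described in the comment.
func resolve(index []pkg, wants []string) ([]string, error) {
	var chosen []pkg
	for _, want := range wants {
		var pick *pkg
		for i := range index {
			if index[i].satisfies(want) {
				pick = &index[i]
				break
			}
		}
		if pick == nil {
			return nil, fmt.Errorf("nothing satisfies %s", want)
		}
		already := false
		for _, c := range chosen {
			if c.name == pick.name {
				already = true
				break
			}
			if conflict(c, *pick) {
				return nil, fmt.Errorf("%s conflicts with %s", pick.name, c.name)
			}
		}
		if !already {
			chosen = append(chosen, *pick)
		}
	}
	names := make([]string, 0, len(chosen))
	for _, c := range chosen {
		names = append(names, c.name)
	}
	return names, nil
}

func main() {
	// The sca-added soname dependency comes first, the explicit one second.
	_, err := resolve(toyIndex, []string{"so:libnvcuvid.so.1", "nvidia-libnvcuvid-12.6"})
	fmt.Println("error:", err)
}
```

In this sketch the soname requirement settles on the 11.8 package first, so the later explicit request for 12.6 can only fail. Dropping the generated `so:` dependency leaves the explicit package name as the only requirement.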
We should probably keep these soname lists logically separate, with separate comments explaining their purposes. Maybe an `isHostProvidedLibrary()` and an `isLibraryWithMultipleVariants()`? fwiw - I think there are just 2 libs in the `libcuda.so.1` category - the other being `libnvidia-ml.so.1`. While there are a ton of libraries in the `libnvcuvid` bucket.
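One possible shape for that split, as a minimal sketch rather than the code in this PR: the two function names come from the suggestion above, while the map variables, the `shouldSkipSoname` helper, and the exact list contents are assumptions. Only sonames named in this conversation are listed.

```go
package main

import "fmt"

// hostProvidedLibraries holds sonames that the host injects into the
// container, so no package should be asked to provide them.
// The comment above names only these two.
var hostProvidedLibraries = map[string]struct{}{
	"libcuda.so.1":      {},
	"libnvidia-ml.so.1": {},
}

// multiVariantLibraries holds sonames shipped by several -cuda-X variant
// packages, where a dependent package should depend on one variant by name.
// Assumption: only the two sonames from the PR description are listed here;
// the real bucket is described as much larger.
var multiVariantLibraries = map[string]struct{}{
	"libnvcuvid.so.1":       {},
	"libnvidia-encode.so.1": {},
}

// isHostProvidedLibrary reports whether the soname comes from the host.
func isHostProvidedLibrary(soname string) bool {
	_, ok := hostProvidedLibraries[soname]
	return ok
}

// isLibraryWithMultipleVariants reports whether several packages ship the soname.
func isLibraryWithMultipleVariants(soname string) bool {
	_, ok := multiVariantLibraries[soname]
	return ok
}

// shouldSkipSoname is a hypothetical caller-side helper: either reason is
// enough to skip generating a dependency or provides entry.
func shouldSkipSoname(soname string) bool {
	return isHostProvidedLibrary(soname) || isLibraryWithMultipleVariants(soname)
}

func main() {
	for _, s := range []string{"libcuda.so.1", "libnvcuvid.so.1", "libc.so.6"} {
		fmt.Printf("%s skip=%v\n", s, shouldSkipSoname(s))
	}
}
```

Keeping two maps means each list can carry its own comment, and a soname can later move between categories without touching the callers.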
@dannf thank you for that really well written comment. |
I love your commit message that explains the details about where this list comes from. One minor / optional suggestion would be to also note how one can repeat the process you used to generate this list, in case it needs updating in the future.
Thanks - but, FTR, it turns out to be (no longer?) correct. I attempted to reproduce that scenario and, at least today, (I had wondered why some packages had |
nvidia-container-toolkit will provide all of these libraries to a container if the NVIDIA_DRIVER_CAPABILITIES=all environment variable is set. To avoid conflicts with the host let's not generate provides for any of them. The list of libraries was generated by installing nvidia-container-toolkit 1.17.8-1 on an Ubuntu 24.04 system with an NVIDIA GPU and then running the Chainguard bash docker container with `-e NVIDIA_DRIVER_CAPABILITIES=all --gpus all` and checking /usr/lib/ for all libraries with the same version number as the NVIDIA drivers installed on the host.
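The last step of that process, picking out the libraries whose version matches the host driver, could be repeated with something like the sketch below. The function name `librariesForDriver`, the file names, and the driver version `555.55.55` are made up for illustration; a real run would feed in the listing of /usr/lib/ from inside the container and the driver version reported by the host.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// librariesForDriver returns the file names that end in ".so.<driverVersion>",
// sorted, which is how host-injected driver libraries are told apart here.
func librariesForDriver(fileNames []string, driverVersion string) []string {
	suffix := ".so." + driverVersion
	var matches []string
	for _, name := range fileNames {
		if strings.HasSuffix(name, suffix) {
			matches = append(matches, name)
		}
	}
	sort.Strings(matches)
	return matches
}

func main() {
	// Hypothetical directory listing; the version string is a placeholder.
	listing := []string{
		"libnvcuvid.so.555.55.55",
		"libc.so.6",
		"libcuda.so.555.55.55",
		"libcuda.so.1",
	}
	for _, lib := range librariesForDriver(listing, "555.55.55") {
		fmt.Println(lib)
	}
}
```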