Create alternative requirements.txt with AMD and Metal wheels #4052
Conversation
Only thing that stands out to me is that the webui currently uses cuBLAS wheels for llama-cpp-python under the name of `llama_cpp_cuda`. All of the typical llama-cpp-python wheels I have built are here: https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels

I'm not sure what the performance differences between the AVX versions are, or whether it is worth using the AVX2 wheels.

As for the issue with the Linux torch versioning, I'm not sure what to do about it beyond installing torch with a pinned command (see the sketch below). This works on my WSL install. The backend can also be detected from torch itself rather than from the version string:

```python
import torch

is_cuda = torch.version.cuda is not None
is_rocm = torch.version.hip is not None
# Guarded Intel check: older torch builds have no torch.xpu attribute at all
is_intel = hasattr(torch, "xpu") and torch.xpu.is_available()
is_cpu = not (is_cuda or is_rocm or is_intel)
```

An alternative to the above is to set the […]. I downloaded the new Linux CUDA wheel and verified that it does have `+cu` in its version.
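The pinned command referenced above isn't preserved here; presumably it was the usual install from PyTorch's own wheel index, roughly along these lines (the version pin and CUDA tag are assumptions, not the command actually quoted):

```sh
# Hypothetical reconstruction -- the exact command from the comment was not preserved.
# Wheels from PyTorch's index report a local version such as "2.0.1+cu117",
# while the PyPI wheel reports plain "2.0.1".
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu117
```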
I tested AVX2 vs no AVX2 as you suggested, and to my surprise, AVX2 is consistently slower. So I just dumped AVX2 everywhere. I'll double-check this on another computer before merging to confirm.

For Mac I just arbitrarily picked the 11_0 wheels and will hope for the best (backward compatibility?).
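A comparison like that can be reproduced by installing each wheel in turn and timing generation on the same model and prompt. A hypothetical harness (model path, thread count, and prompt are placeholders; this is not the benchmark actually used in the thread):

```python
# Hypothetical timing harness: run once with the AVX2 wheel installed,
# once with the no-AVX2 wheel, and compare tokens/second.
import time
from llama_cpp import Llama

llm = Llama(model_path="model.gguf", n_threads=8, verbose=False)  # placeholder model

start = time.perf_counter()
output = llm("Once upon a time,", max_tokens=128)
elapsed = time.perf_counter() - start

n_tokens = output["usage"]["completion_tokens"]
print(f"{n_tokens / elapsed:.2f} tokens/s")
```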
Nice! I have made this change; that's indeed much cleaner.
For llama-cpp-python built with cuBLAS, most of the intensive math operations take place on the GPU. In that case, AVX doesn't really matter much. It's mostly only relevant for CPU-only builds, though I haven't tested speeds myself. Not sure how relevant it is, but the […]
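For reference, the GPU offload that makes AVX mostly irrelevant is controlled by `n_gpu_layers` in llama-cpp-python; a minimal sketch (the model path is a placeholder):

```python
# With a cuBLAS build, layers offloaded via n_gpu_layers run on the GPU,
# so the CPU's AVX capabilities barely affect generation speed.
from llama_cpp import Llama

llm = Llama(
    model_path="model.gguf",  # placeholder path
    n_gpu_layers=1000,        # more layers than any model has, i.e. offload everything
)
```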
This may be related to the performance degradation that I found on my laptop (i5-10300H CPU). I ended up just adding back the AVX2 check and creating the no_avx2 variants of the requirements files.
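The AVX2 check itself can be done with py-cpuinfo; a minimal sketch (the fallback behavior on a failed probe is an assumption, not necessarily what one_click.py does):

```python
# Minimal AVX2 probe using py-cpuinfo; the except-branch default is an assumption.
try:
    import cpuinfo  # py-cpuinfo package
    cpu_has_avx2 = "avx2" in cpuinfo.get_cpu_info().get("flags", [])
except Exception:
    cpu_has_avx2 = False  # be conservative and fall back to the no_avx2 wheels
```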
This is an attempt at making the one-click installer more universal.
- Created `requirements_amd.txt` with all the AMD wheels that I could find, replacing the manual install commands in `one_click.py`.
- Renamed `requirements_nowheels.txt` to `requirements_minimal.txt` and added the AVX2 version of llama-cpp-python there.
- Added `requirements_minimal_noavx2.txt`, identical to the previous one but with the llama-cpp-python wheels without AVX2 by @jllllll.
- Added `requirements_mac.txt` with the only wheel with "mac" in its name that I could find (for llama-cpp-python). I don't know if it's useful.
- `one_click.py` now installs from `requirements_amd.txt` when `+rocm` is present, `requirements_minimal.txt` when `+cpu` is present, and `requirements.txt` otherwise (see the sketch after this list).

An open question is whether it makes sense to have avx2 and no_avx2 versions of each requirements.txt. If so, there will be at least 6 to 8 requirements.txt files in the end.
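A sketch of that selection logic, assuming the `+rocm`/`+cpu` markers are read from the installed torch version (the variable names are illustrative, not the PR's literal code):

```python
# Illustrative selection based on torch's local version suffix;
# not a verbatim excerpt from one_click.py.
import torch

torch_version = torch.__version__  # e.g. "2.0.1+rocm5.4.2", "2.0.1+cpu", "2.0.1+cu117"

if "+rocm" in torch_version:
    requirements_file = "requirements_amd.txt"
elif "+cpu" in torch_version:
    requirements_file = "requirements_minimal.txt"
else:
    requirements_file = "requirements.txt"
```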
Also, one issue I found is that on Linux + CUDA, `pip show torch` currently returns a Version line with no mention of `+cu`, causing `is_cuda` to be set to False; a minimal illustration follows below.

@jllllll what do you think of these changes?
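A sketch of that failure mode, with assumed version strings (the actual `pip show torch` output was not preserved above):

```python
# Substring checks on torch's version fail for PyPI wheels, which carry no
# local version suffix; the versions shown here are illustrative.
import torch

print(torch.__version__)              # e.g. "2.0.1" (PyPI) vs "2.0.1+cu117" (PyTorch index)
is_cuda = "+cu" in torch.__version__  # False for the PyPI wheel, even on a CUDA machine
print(is_cuda)
```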