This repository was archived by the owner on May 11, 2025. It is now read-only.

IlyasMoutawwakil
Contributor

@casper-hansen
Owner

I have a few questions about the ExLlama kernels.

  1. Is it necessary to include both v1 and v2? Why not just v2, since those kernels seem to be better?
  2. Are these kernels up to date with the latest commits from ExLlamaV2?

@IlyasMoutawwakil
Contributor Author

  1. I think it's always best to have a fallback in case exllamav2 doesn't work or build. One example is exllamav2 not building on ROCm 6.0: https://github.com/AutoGPTQ/AutoGPTQ/pull/515/files
  2. Not sure. The ones in the original repo contain a lot of unused components (e.g. MoE-specific code). I took the ones in AutoGPTQ for now, which I believe to be minimal and fixed for AutoAWQ (bit overflow from subtracting 1 from and adding 1 back to the zeros).
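
To illustrate the fallback idea in point 1, here is a minimal sketch (not code from this PR) of trying a preferred kernel extension first and falling back to an older one when it fails to import. The module names in `KERNEL_CANDIDATES` are hypothetical placeholders, and `load_first_available` is a helper made up for this example.

```python
import importlib

# Hypothetical extension names, preferred backend first.
# Replace them with the real compiled-extension module names.
KERNEL_CANDIDATES = ["hypothetical_exllamav2_ext", "hypothetical_exllama_ext"]


def load_first_available(candidates):
    """Return (name, module) for the first importable module in `candidates`.

    A compiled extension that is missing or failed to build raises
    ImportError on import, so that is the error treated as "try the next one".
    """
    errors = {}
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError as exc:
            # Remember why this backend was skipped, then try the next one.
            errors[name] = exc
    raise ImportError(f"no kernel backend could be imported: {errors}")


if __name__ == "__main__":
    try:
        backend, _module = load_first_available(KERNEL_CANDIDATES)
        print(f"using kernel backend: {backend}")
    except ImportError as exc:
        print(f"falling back to a non-fused implementation: {exc}")
```

Keeping the candidates in an ordered list means a build failure in the newer backend only changes which kernels are used, not whether the package can be imported at all.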

@casper-hansen casper-hansen merged commit fc700a8 into casper-hansen:main Jan 21, 2024