Support Llama MoE model #2
Conversation
        s += f", reduce_results={self.reduce_results}"
        return s


class TriteiaLinear(LinearBase):
Let's call it SparseQuantizedLinear instead of TriteiaLinear.
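For illustration, the suggested rename might look like the sketch below. Only the two repr-building lines appear in the diff; the `LinearBase` stand-in, the constructor signature, and the `extra_repr` method name are assumptions made to keep the example self-contained, not the actual model code.

```python
class LinearBase:
    """Minimal stand-in for the real base class, which is not shown in the diff."""

    def __init__(self, input_size: int, output_size: int):
        self.input_size = input_size
        self.output_size = output_size


class SparseQuantizedLinear(LinearBase):
    """Hypothetical shape of the layer after the suggested rename."""

    def __init__(self, input_size: int, output_size: int, reduce_results: bool = True):
        super().__init__(input_size, output_size)
        self.reduce_results = reduce_results

    def extra_repr(self) -> str:
        # Mirrors the repr-building pattern visible in the diff context.
        s = f"input_size={self.input_size}, output_size={self.output_size}"
        s += f", reduce_results={self.reduce_results}"
        return s
```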
Actually, do you really need this class? It seems it is not used anywhere in the model code.
Fixed
Force-pushed 9a74856 to 7c327f4
Marking the PR as draft, as the code for the unquantised MoE may be incorrect.
No description provided.