Closed
Labels: new-model (Requests to new models), unstale (Received activity after being labelled stale)
Description
The model to consider.
Mamba Codestral: https://huggingface.co/mistralai/mamba-codestral-7B-v0.1
Highlights:
- SOTA 7B code model
- theoretically unlimited context length; tested up to 256k
- inference cost is linear in sequence length, versus quadratic for transformers
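To illustrate the last two highlights: a state-space model carries a fixed-size state forward one token at a time, so per-token work is constant (linear total time) and context length is not bounded by a growing cache. Below is a toy scalar recurrence, not vLLM code and not Mamba itself (Mamba's selective scan uses input-dependent, matrix-valued parameters), just the shape of the computation:

```python
def ssm_scan(xs, a=0.9, b=0.5, c=1.0):
    """Toy 1-D linear state-space recurrence (illustrative only):
        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t
    The carried state `h` is a single number no matter how long `xs` is,
    and each token costs O(1) work, so the whole scan is O(len(xs)).
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x  # constant-size state update per token
        ys.append(c * h)   # readout from the current state
    return ys

# An impulse decays through the state: approximately [0.5, 0.45, 0.405]
print(ssm_scan([1.0, 0.0, 0.0]))
```

Contrast with self-attention, where each new token attends over all previous ones, giving quadratic total cost and a key/value cache that grows with context length.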
The closest model vLLM already supports.
Jamba seems to be the closest model, since it is partly Mamba-based (a Transformer/Mamba hybrid): https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/jamba.py
What's your difficulty of supporting the model you want?
Mamba is a non-transformer architecture, but a Mamba-based model (Jamba) is already supported, so it's unclear how much additional work a pure-Mamba model would require.