AshAnand34 commented May 7, 2025

Integrated DeepSeek R1 model as per #56

laggui (Member) commented May 7, 2025

Haven't looked at the code thoroughly, but I see you've added some files to extend to training / fine-tuning. This would be great!

This might require a bit more effort, so if you want to split the PR, a good first step would be to implement the architecture and import the pre-trained weights for inference 🙂
