🐟
A student from Harbin Institute of Technology
- Harbin Institute of Technology
- Harbin
Highlights
- Pro
Pinned
- OpenBMB/CPM.cu (Public)
  CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge techniques in sparse architecture, speculative sampling and qua…
- AI9Stars/SpecMQuant (Public)
  Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design
Cuda 15