ModelTC
Repositories
- SageAttention-1104 (public, forked from thu-ml/SageAttention)
  [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
- LightCompress (public)
  A powerful toolkit for compressing large models including LLM, VLM, and video generation models.
- LightLLM (public)
  LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
- FlashVSR (public, forked from OpenImagingLab/FlashVSR)
  Towards Real-Time Diffusion-Based Streaming Video Super-Resolution: an efficient one-step diffusion framework for streaming VSR with locality-constrained sparse attention and a tiny conditional decoder.
- ComfyUI-LightVAE (public)
- LightKernel (public)