🎯
    Focusing
    Pinned
- mit-han-lab/hart (Public)
  HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
- mit-han-lab/llm-awq (Public)
  [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
- mit-han-lab/omniserve (Public)
  [MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
- mit-han-lab/spvnas (Public archive)
  [ECCV 2020] Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
- mit-han-lab/bevfusion (Public archive)
  [ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
- mit-han-lab/torchsparse (Public)
  [MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs




