    Repositories list

    • LENS

      Public
      [AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning
      Python
      06180 Updated Nov 21, 2025
    • EVA-X

      Public
      [npjDigitalMed (Nature Portfolio)] EVA-X: A foundation model for general chest X-ray analysis with self-supervised learning
      Python
      98550 Updated Nov 18, 2025
    • MolSight

      Public
      [AAAI 2026] MolSight: Optical Chemical Structure Recognition with SMILES Pretraining, Multi-Granularity Learning and Reinforcement Learning
      Python
      0200 Updated Nov 16, 2025
    • SuperCLIP

      Public
      Python
      01600 Updated Nov 10, 2025
    • [AAAI 2026] Turbo-VAED: Fast and Stable Transfer of Video-VAEs to Mobile Devices
      Python
      06980 Updated Nov 8, 2025
    • RAD

      Public
      [NeurIPS 2025] RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
      Python
      312540 Updated Nov 7, 2025
    • [CVPR 2025] Official repository of the paper "Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation"
      Python
      211530 Updated Oct 23, 2025
    • HTML
      2100 Updated Oct 11, 2025
    • VAD

      Public
      [ICCV 2023] VAD: Vectorized Scene Representation for Efficient Autonomous Driving
      Python
      1321.1k751 Updated Sep 28, 2025
    • GaussTR

      Public
      [CVPR 2025] GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding
      Python
      918420 Updated Sep 22, 2025
    • TOGS

      Public
      [IEEE JBHI] The official code of "TOGS: Gaussian Splatting with Temporal Opacity Offset for Real-Time 4D DSA Rendering"
      Python
      22910 Updated Sep 10, 2025
    • simpleseg

      Public
      Python
      0820 Updated Sep 9, 2025
    • Snap-Snap

      Public
      The repository of "Snap-Snap: Taking Two Images to Reconstruct 3D Human Gaussians in Milliseconds"
      Python
      13640 Updated Sep 1, 2025
    • ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving
      Python
      31700 Updated Aug 21, 2025
    • ViTMatte

      Public
      [Information Fusion (Vol.103, Mar. '24)] Boosting Image Matting with Pretrained Plain Vision Transformers
      Python
      44484193 Updated Aug 13, 2025
    • [ACM MM 2025] Dynamic 2D Gaussians: Geometrically Accurate Radiance Fields for Dynamic Objects
      Python
      615730 Updated Aug 6, 2025
    • .github

      Public
      0000 Updated Jul 4, 2025
    • [ICCV 2025] GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding
      Python
      16920 Updated Jun 26, 2025
    • [CVPR 2025 Highlight] Truncated Diffusion Model for Real-Time End-to-End Autonomous Driving
      Python
      921.1k221 Updated Jun 17, 2025
    • [CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
      Python
      421.3k161 Updated Jun 12, 2025
    • PersonViT

      Public
      PersonViT: Large-scale Self-supervised Vision Transformer for Person Re-Identification
      Python
      53720 Updated Jun 11, 2025
    • MIM4D

      Public
      [IJCV 2025] MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning
      Python
      17220 Updated May 30, 2025
    • PixelHacker: Image Inpainting with Structural and Semantic Consistency
      Python
      20456121 Updated May 20, 2025
    • mmMamba

      Public
      The first decoder-only multimodal state space model
      Python
      39740 Updated May 19, 2025
    • MaTVLM

      Public
      Python
      55330 Updated May 13, 2025
    • OmniMamba

      Public
      OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models
      Python
      514040 Updated Apr 25, 2025
    • ControlAR

      Public
      [ICLR 2025] ControlAR: Controllable Image Generation with Autoregressive Models
      Python
      10311120 Updated Apr 24, 2025
    • WeakSAM

      Public
      [ACM MM 2024] WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition
      Python
      15750 Updated Apr 8, 2025
    • Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning
      Python
      15298130 Updated Mar 26, 2025
    • EVF-SAM

      Public
      Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model"
      Python
      22489140 Updated Mar 17, 2025