GitHub - yanglicb/FastVGGT: Code for FastVGGT: Training-Free Acceleration of Visual Geometry Transformer

⚡️ FastVGGT: Training-Free Acceleration of Visual Geometry Transformer

Media Analytics & Computing Laboratory; AUTOLAB

You Shen, Zhipeng Zhang, Yansong Qu, Liujuan Cao

📰 News

[Sep 3, 2025] Paper release.
[Sep 2, 2025] Code release.

🔭 Overview

FastVGGT observes strong similarity in attention maps and leverages it to design a training-free acceleration method for long-sequence 3D reconstruction, achieving up to 4× faster inference without sacrificing accuracy.

⚙️ Environment Setup

First, create a virtual environment using Conda, clone this repository to your local machine, and install the required dependencies.

conda create -n fastvggt python=3.10
conda activate fastvggt
git clone git@github.com:mystorm16/FastVGGT.git
cd FastVGGT
pip install -r requirements.txt

Next, prepare the ScanNet dataset: http://www.scan-net.org/ScanNet/

Then, download the VGGT checkpoint (we use the checkpoint link provided in https://github.com/facebookresearch/vggt/tree/evaluation/evaluation):

wget https://huggingface.co/facebook/VGGT_tracker_fixed/resolve/main/model_tracker_fixed_e20.pt
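The example configuration in this README defaults `--ckpt_path` to `./ckpt/model_tracker_fixed_e20.pt`, while `wget` saves into the current directory. As an alternative to `wget`, the following is a minimal Python sketch that downloads the same file straight into `./ckpt/`. The helper names `checkpoint_destination` and `download_checkpoint` are hypothetical and are not part of the repository.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# URL taken from the wget command above.
CKPT_URL = (
    "https://huggingface.co/facebook/VGGT_tracker_fixed/"
    "resolve/main/model_tracker_fixed_e20.pt"
)


def checkpoint_destination(url: str, ckpt_dir: Path = Path("./ckpt")) -> Path:
    """Return the local path for the checkpoint, named after the URL's last segment."""
    filename = Path(urlparse(url).path).name
    return ckpt_dir / filename


def download_checkpoint(url: str = CKPT_URL, ckpt_dir: Path = Path("./ckpt")) -> Path:
    """Download the checkpoint unless it already exists, then return its path."""
    dest = checkpoint_destination(url, ckpt_dir)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, str(dest))  # large file; this may take a while
    return dest


if __name__ == "__main__":
    print(download_checkpoint())
```

If you use `wget` instead, either move the file into `./ckpt/` or pass its location through `--ckpt_path`.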

Finally, configure the dataset path and VGGT checkpoint path. For example:

    parser.add_argument(
        "--data_dir", type=Path, default="/data/scannetv2/process_scannet"
    )
    parser.add_argument(
        "--gt_ply_dir",
        type=Path,
        default="/data/scannetv2/OpenDataLab___ScanNet_v2/raw/scans",
    )
    parser.add_argument(
        "--ckpt_path",
        type=str,
        default="./ckpt/model_tracker_fixed_e20.pt",
    )
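Since a wrong path typically surfaces only after the model has started loading, it can help to check the three locations up front. The following is a standalone sketch, not the repository's actual parser: it reuses the argument names and defaults shown above, and `build_parser` and `missing_paths` are hypothetical helpers introduced for illustration.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Recreate the three path arguments from the example above."""
    parser = argparse.ArgumentParser(description="Path configuration sketch")
    parser.add_argument(
        "--data_dir", type=Path, default=Path("/data/scannetv2/process_scannet")
    )
    parser.add_argument(
        "--gt_ply_dir",
        type=Path,
        default=Path("/data/scannetv2/OpenDataLab___ScanNet_v2/raw/scans"),
    )
    parser.add_argument(
        "--ckpt_path", type=str, default="./ckpt/model_tracker_fixed_e20.pt"
    )
    return parser


def missing_paths(args: argparse.Namespace) -> list[str]:
    """Return the names of the flags whose paths do not exist on disk."""
    checks = {
        "--data_dir": args.data_dir,
        "--gt_ply_dir": args.gt_ply_dir,
        "--ckpt_path": args.ckpt_path,
    }
    return [flag for flag, path in checks.items() if not Path(path).exists()]


if __name__ == "__main__":
    args = build_parser().parse_args()
    for flag in missing_paths(args):
        print(f"Path for {flag} does not exist; pass it explicitly or edit the default.")
```

Each default can also be overridden on the command line, for example `--ckpt_path /path/to/model_tracker_fixed_e20.pt`, instead of editing the source.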

💎 Observation

Note: A large --input_frame value may significantly slow down saving the visualization results. Try a smaller value first.

python eval_scannet.py --input_frame 30 --vis_attn_map

We observe that many token-level attention maps are highly similar in each block, motivating our optimization of the Global Attention module.
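To make "highly similar" concrete, the toy sketch below scores a handful of attention rows by their mean pairwise cosine similarity. This is only an illustration: the README does not state how similarity is measured in FastVGGT, so the choice of cosine similarity, the function names, and the toy values are all assumptions.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(attn_rows):
    """Average cosine similarity over all unordered pairs of rows."""
    pairs = list(combinations(attn_rows, 2))
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)


# Toy data (assumed): each row is one token's attention distribution over 4 keys.
toy_attention = [
    [0.70, 0.10, 0.10, 0.10],
    [0.68, 0.12, 0.10, 0.10],
    [0.72, 0.09, 0.10, 0.09],
]

if __name__ == "__main__":
    print(round(mean_pairwise_similarity(toy_attention), 4))
```

A score close to 1 means the tokens attend to nearly the same keys, which is the kind of redundancy the observation refers to.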

🏀 Evaluation

Evaluate FastVGGT on the ScanNet dataset with 1,000 input images. The --merging parameter specifies the block index at which the merging strategy is applied:

python eval_scannet.py --input_frame 1000 --merging 0

Evaluate Baseline VGGT on the ScanNet dataset with 1,000 input images:

python eval_scannet.py --input_frame 1000
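To compare the two runs side by side, a small wrapper can launch both commands and report the wall-clock ratio. The sketch below assumes only the `--input_frame` and `--merging` flags shown above; `build_eval_command`, `timed_run`, and `speedup` are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys
import time


def build_eval_command(input_frame, merging=None, script="eval_scannet.py"):
    """Build the evaluation command; omit --merging to get the baseline VGGT run."""
    cmd = [sys.executable, script, "--input_frame", str(input_frame)]
    if merging is not None:  # 0 is a valid block index, so compare against None
        cmd += ["--merging", str(merging)]
    return cmd


def timed_run(cmd):
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def speedup(baseline_seconds, fast_seconds):
    """Ratio of baseline time to accelerated time (values above 1 mean faster)."""
    return baseline_seconds / fast_seconds


if __name__ == "__main__":
    baseline = timed_run(build_eval_command(1000))
    fast = timed_run(build_eval_command(1000, merging=0))
    print(f"Wall-clock speedup: {speedup(baseline, fast):.2f}x")
```

Wall-clock time measured this way also includes model loading and dataset I/O, so it will generally understate the pure inference speedup.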

🍺 Acknowledgements

Thanks to these great repositories: VGGT, Dust3r, Fast3R, CUT3R, MV-DUSt3R+, StreamVGGT, VGGT-Long and many other inspiring works in the community.
Special thanks to Jianyuan Wang for his valuable discussions and suggestions on this work.

✍️ Checklist

Release the evaluation code on 7 Scenes / NRGBD

⚖️ License

The FastVGGT codebase follows VGGT's license. Please refer to LICENSE for the applicable terms.

Please note that only this model checkpoint allows commercial usage.
