Official repository for the paper "Can Test-Time Scaling Improve World Foundation Models?"
- [2025.07] Paper accepted at COLM 2025!
- [2025.04] Test-time scaling code released!
- [2025.04] Project website is live!
- [2025.03] Paper released on arXiv!
WFM-TTS is the first test-time scaling framework for World Foundation Models (WFMs). Instead of retraining or enlarging models, WFM-TTS improves performance at inference time through reward-guided generation strategies. It:
- Enables small models (e.g., 4B) to match or outperform large models (e.g., 12B)
- Works under the same compute budget
- Requires no weight updates or additional training
We introduce a modular and extensible evaluation toolkit to assess WFM performance across:
- ✔ 3D consistency
- ✔ Temporal consistency
- ✔ Spatial relationship awareness
- ✔ Perceptual quality
- ✔ Text-to-video alignment
The toolkit provides rigorous benchmarking of generated videos along both physical and semantic axes of fidelity.
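As a rough illustration of what a modular, extensible metric interface can look like, here is a minimal Python sketch. The `Metric` type, the `METRICS` registry, and `evaluate_video` are hypothetical names invented for this example, not the toolkit's actual API.

```python
# Hypothetical sketch of a modular metric interface -- none of these
# names come from the WFM-TTS toolkit itself.
from typing import Callable, Dict

# A metric maps a generated video (here, a file path) to a scalar score.
Metric = Callable[[str], float]

# Registry keyed by evaluation dimension; real entries would wrap the
# 3D-consistency, temporal-consistency, spatial-relationship,
# perceptual-quality, and text-alignment scorers.
METRICS: Dict[str, Metric] = {}

def register(name: str):
    """Decorator that adds a scorer to the registry."""
    def wrap(fn: Metric) -> Metric:
        METRICS[name] = fn
        return fn
    return wrap

@register("perceptual_quality")
def perceptual_quality(video_path: str) -> float:
    # Placeholder: a real scorer would load frames and compute a score.
    return 0.0

def evaluate_video(video_path: str) -> Dict[str, float]:
    """Score one generated video on every registered dimension."""
    return {name: metric(video_path) for name, metric in METRICS.items()}
```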
WFM-TTS integrates multiple test-time strategies to boost performance (a simplified sketch follows the list):
- Rule-Based Rewards — Robust and extensible scoring mechanisms
- Efficient Tokenizer Decoder — roughly 9,000x faster than the diffusion decoder while preserving the same score trends
- Probability-Based Top-K Pruning — Balances exploration and quality
- Beam Search Integration — Enhances diversity and reliability
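To make the search procedure concrete, here is a minimal, self-contained sketch of how probability-based top-K pruning and beam search can combine with a reward signal. The `propose` and `reward` callables are illustrative stand-ins for the model's continuation proposer and the rule-based reward computed on cheaply decoded frames; they are not the repository's actual interfaces.

```python
# Minimal sketch of top-K pruning + beam search at inference time.
# `propose` and `reward` are hypothetical stand-ins, not repo APIs.
import heapq
from typing import Callable, List, Tuple

def tts_beam_search(
    init_tokens: list,
    propose: Callable[[list, int], List[Tuple[list, float]]],  # top-K continuations with log-probs
    reward: Callable[[list], float],  # rule-based reward on (fast-)decoded frames
    beam_width: int = 4,
    top_k: int = 8,
    steps: int = 16,
) -> list:
    """Keep the `beam_width` highest-reward partial generations per step."""
    beams: List[Tuple[float, list]] = [(0.0, init_tokens)]
    for _ in range(steps):
        candidates: List[Tuple[float, list]] = []
        for score, seq in beams:
            # Probability-based top-K pruning: only the K most likely
            # continuations are expanded, balancing exploration (large K)
            # against quality (small K).
            for chunk, _logp in propose(seq, top_k):
                new_seq = seq + chunk
                # Running sum of per-step rewards; scoring uses frames
                # from a fast decoder rather than the diffusion decoder,
                # so ranking candidates stays cheap.
                candidates.append((score + reward(new_seq), new_seq))
        # Beam search: retain the best `beam_width` candidates.
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return max(beams, key=lambda c: c[0])[1]
```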
Key results:
- A 4B WFM + WFM-TTS achieves performance equal to or better than a 12B model
- Human evaluations favor WFM-TTS-enhanced outputs over larger baselines
```bash
git clone https://github.com/Mia-Cong/WFM-TTS.git
cd WFM-TTS
```
The base model, Cosmos, runs only on Linux; it has been tested on Ubuntu 20.04, 22.04, and 24.04. Python 3.10.x and conda are required.
Run the following to set up the conda environment and install dependencies:
```bash
# Create the WFM-TTS conda environment
conda env create --file wfmtts.yaml
# Activate the environment
conda activate wfmtts
# Install Python dependencies
pip install -r requirements.txt
# Patch Transformer Engine linking issues in conda environments
ln -sf $CONDA_PREFIX/lib/python3.10/site-packages/nvidia/*/include/* $CONDA_PREFIX/include/
ln -sf $CONDA_PREFIX/lib/python3.10/site-packages/nvidia/*/include/* $CONDA_PREFIX/include/python3.10
# Install Transformer Engine
pip install transformer-engine[pytorch]==1.12.0
```
To test the environment setup:
```bash
CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python scripts/test_environment.py
```
To reproduce WFM-TTS's test-time scaling results:
- Follow `cosmos_inference_autoregressive_base` to set up the COSMOS 4B/12B models.
- Follow `cosmos_inference_autoregressive_video2world` to set up the COSMOS 5B/13B models.
- Download the 900 autonomous driving test sequences we prepared from the nuScenes and Waymo datasets, and put them under `assets/autoregressive/`.
- Run any of the provided test-time scaling scripts in `scripts/`, for example:

```bash
./scripts/cosmos4b_prob_beam.sh
```
If you find this work useful, please cite:
```bibtex
@inproceedings{cong2025wfm-tts,
  title     = {Can Test-Time Scaling Improve World Foundation Models?},
  author    = {Wenyan Cong and Hanqing Zhu and Peihao Wang and Bangya Liu and Dejia Xu and Kevin Wang and David Z. Pan and Yan Wang and Zhiwen Fan and Zhangyang Wang},
  booktitle = {COLM},
  year      = {2025}
}
```
For more updates and demos, visit our website: https://scalingwfm.github.io