nvidia-cosmos/cosmos-reason1: Cosmos-Reason1 models understand physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes.

Paper | Website | HuggingFace

NVIDIA Cosmos Reason is an open, customizable, 7B-parameter reasoning vision language model (VLM) for physical AI and robotics. It enables robots and vision AI agents to reason like humans, using prior knowledge, physics understanding, and common sense to understand and act in the real world. The model understands space, time, and fundamental physics, and can serve as a planning model that reasons about what steps an embodied agent might take next. Cosmos Reason excels at navigating the long tail of diverse physical-world scenarios with spatial-temporal understanding. It is post-trained on physical common sense and embodied reasoning data with supervised fine-tuning and reinforcement learning, and it uses chain-of-thought reasoning to understand world dynamics without human annotations.

News

2025-08-08: We added the cosmos-reason1-utils inference utilities package. This release adds spatial-temporal reasoning inference. See Inference for example usage.
2025-08-01: We added support for spatial-temporal reasoning for city and industrial operations. See the latest checkpoint, Cosmos-Reason1-7B.
2025-06-11: We enhanced the model's ability to judge the physical plausibility of a video. See this tutorial for details.
2025-05-17: We released the model weights and training data on Hugging Face.

Model

Cosmos-Reason1-7B

Setup

Install system dependencies:

pkgx

brew install pkgx || curl https://pkgx.sh | sh

uv

curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env

Hugging Face CLI

uv tool install -U "huggingface_hub[cli]"
hf auth login

Clone the repository:

git clone https://github.com/nvidia-cosmos/cosmos-reason1.git
cd cosmos-reason1

Inference

Minimum Requirements:

1 GPU with 24GB memory
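As a rough sanity check on this requirement, the weights of a nominal 7B-parameter model take about 13 GiB when stored at 16-bit precision. The 16-bit assumption is ours and is not stated in this README. The estimate also ignores activations, the KV cache, and video frame inputs, which all draw on the remaining headroom. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone, in GiB (2 bytes/param assumes bf16 or fp16)."""
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    # Nominal 7B parameters at 16-bit precision (an assumption, see above).
    print(f"{weight_memory_gib(7e9):.1f} GiB for weights")
```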

Cosmos-Reason1 is included in transformers>=4.51.3.
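One way to confirm that an installed transformers release satisfies this bound is to compare version strings. The sketch below is illustrative only. The helper names `version_tuple`, `meets_minimum`, and `installed_transformers_ok` are hypothetical, the code uses only the standard library, and it assumes plain dotted numeric versions, so pre-release suffixes such as `.dev0` or `rc1` are ignored rather than ordered correctly.

```python
import re
from importlib import metadata

MIN_TRANSFORMERS = "4.51.3"  # lower bound stated in this README


def version_tuple(version: str) -> tuple[int, ...]:
    """Turn '4.51.3' into (4, 51, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_transformers_ok() -> bool:
    """Check the locally installed transformers package, if there is one."""
    try:
        return meets_minimum(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return False
```

Calling `installed_transformers_ok()` inside the project environment returns `False` both when the package is missing and when it is too old, so a caller that needs to tell those cases apart should query `metadata.version` directly.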

We provide example inference scripts:

Minimal example
```
./scripts/inference_sample.py
```

Full example

Caption the video:

./scripts/inference.py --prompt prompts/caption.yaml --videos assets/sample.mp4 -v

Ask a question about the video with reasoning:

./scripts/inference.py --prompt prompts/question.yaml --question 'What are the potential safety hazards?' --reasoning --videos assets/sample.mp4 -v

Temporally caption the video and save the input frames to outputs/temporal_caption_text for debugging:

./scripts/inference.py --prompt prompts/temporal_caption_text.yaml --videos assets/sample.mp4 --timestamp -v -o outputs/temporal_caption_text
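The three commands above differ only in a handful of flags, so they can be assembled programmatically, for example to loop over several videos. The following is a minimal sketch, not part of the repository: `build_command` is a hypothetical helper, and it emits only flags that appear in the commands above (`--prompt`, `--question`, `--reasoning`, `--videos`, `--timestamp`, `-v`, `-o`). It does not validate them against the script's actual argument parser.

```python
import shlex
from typing import Optional

SCRIPT = "./scripts/inference.py"  # path as shown in this README


def build_command(
    prompt: str,
    video: str,
    question: Optional[str] = None,
    reasoning: bool = False,
    timestamp: bool = False,
    output_dir: Optional[str] = None,
) -> list[str]:
    """Assemble an argument list using only flags that appear in this README."""
    cmd = [SCRIPT, "--prompt", prompt]
    if question is not None:
        cmd += ["--question", question]
    if reasoning:
        cmd.append("--reasoning")
    cmd += ["--videos", video]
    if timestamp:
        cmd.append("--timestamp")
    cmd.append("-v")  # verbose flag, used in every example above
    if output_dir is not None:
        cmd += ["-o", output_dir]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "prompts/question.yaml",
        "assets/sample.mp4",
        question="What are the potential safety hazards?",
        reasoning=True,
    )
    # Print a copy-pasteable shell line. To execute it instead, pass `cmd`
    # to subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(cmd))
```

Building an argument list rather than a single string avoids shell-quoting problems when a question contains spaces or punctuation. `shlex.join` is used only to display the command.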

Configure inference by editing:

Tutorials

Post-Training

The nvidia-cosmos/cosmos-rl repository is an asynchronous post-training framework specialized for Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). It prioritizes performance, scalability, and fault tolerance.

To support a custom dataset format, use the minimal Hugging Face example as a template.

Additional Resources

The Cosmos-Reason1 model is based on the Qwen2.5-VL model architecture. Useful resources:

Repository

License and Contact

This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.

NVIDIA Cosmos source code is released under the Apache 2 License.

NVIDIA Cosmos models are released under the NVIDIA Open Model License. For a custom license, please contact [email protected].
