SemNav is a visual semantic navigation model that can be deployed on any robot. It performs object goal navigation relying mainly on semantic segmentation information.
In this repository we release the SemNav dataset, code, and trained models described in our [paper].
If you use any content of this repo for your work, please cite the following bib entry:

```bibtex
@article{semnav,
  author  = {Flor-Rodr{\'i}guez, Rafael and Guti{\'e}rrez-{\'A}lvarez, Carlos and Acevedo-Rodr{\'i}guez, Francisco~J. and Lafuente-Arroyo, Sergio and L{\'o}pez-Sastre, Roberto~J.},
  title   = {SEMNAV: A Semantic Segmentation-Driven Approach to Visual Semantic Navigation},
  journal = {ArXiv},
  year    = {2025},
  month   = {June},
  day     = {02},
  doi     = {10.48550/arXiv.2506.01418},
  url     = {https://doi.org/10.48550/arXiv.2506.01418}
}
```
To run our code you need a machine running Ubuntu in order to install all the dependencies. We have tested our code on Ubuntu 20.04, 22.04, and 24.04. The easiest way to set things up is to install miniconda (if you don't already have it). You can download it from here.
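If you prefer to do it from the command line, a typical miniconda installation looks like the following; the installer URL below is the standard one from the Anaconda download page, so verify it there before running:

```bash
# Download the official Miniconda installer for Linux x86_64 and install it under $HOME/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p "$HOME/miniconda3"
# Make conda available in the current shell and in future shells
source "$HOME/miniconda3/bin/activate"
conda init bash
```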
Once you have installed miniconda, you can set up the environment by running the script we prepared:

```bash
bash scripts/setup_environment.sh
```

If you want to install the dependencies manually, you can follow the instructions below.
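In either case (script or manual installation), a quick sanity check such as the following can confirm that the main dependencies import correctly; this is our suggestion rather than part of the repository scripts, and it assumes the environment is named `semnav` as in the instructions:

```bash
conda activate semnav
# These imports cover the simulator, the navigation framework and PyTorch
python -c "import habitat_sim, habitat, torch; print('CUDA available:', torch.cuda.is_available())"
```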
Manual installation
Clone the repository and set up the environment:
```bash
# Clone this repository
git clone https://github.com/gramuah/semnav.git

# Create and activate the conda environment
conda create -n semnav python=3.9 cmake=3.18.0
conda activate semnav

# Install habitat-sim v0.2.2 (headless build)
git clone --depth 1 --branch v0.2.2 https://github.com/facebookresearch/habitat-sim.git
cd habitat-sim/
pip install -r requirements.txt
python setup.py install --headless
cd ..

# Install PyTorch and the pinned Python dependencies
pip3 install torch torchvision torchaudio
pip install gym==0.22.0 urllib3==1.25.11 numpy==1.25.0 pillow==9.2.0

# Install habitat-lab
git clone https://github.com/carlosgual/habitat-lab.git
cd habitat-lab/
python setup.py develop --install
cd ..

# Extra dependencies and the semnav package itself
pip install wandb
conda install protobuf
pip install -e .  # install semnav in editable mode (run from the root of the cloned semnav repository)
```

We provide two datasets, SemNav 40 and SemNav 1630, for leveraging semantic segmentation information:
- SemNav 1630: Built using human-annotated semantic labels from HM3D Semantics.
- SemNav 40: Derived by mapping these annotations to the 40 categories of NYUv2.
| Dataset | Download Link |
|---|---|
| SemNav 40 | Download |
| SemNav 1630 | Download |
Additionally, download the ObjectNav HM3D episode dataset from this link.
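One possible way to organize the downloads is sketched below; the archive names and destination folders are assumptions (check the dataset paths expected by the configuration files), but the `data/` folder at the repository root is the one mounted in the Docker instructions that follow:

```bash
# Archive names and destination paths are placeholders -- adapt them to what you actually downloaded
mkdir -p data/datasets/objectnav
unzip semnav_40.zip -d data/datasets/objectnav/                 # SemNav 40
unzip semnav_1630.zip -d data/datasets/objectnav/               # SemNav 1630
unzip objectnav_hm3d_episodes.zip -d data/datasets/objectnav/   # ObjectNav HM3D episodes
```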
If you want to run the code in a Docker container (for example, to run it on a compute server, as we do), follow the instructions below. You will need a Docker installation with GPU support. We also use rootless containers, which means the container shares the same user as the host; that is why you first need to put your user name and user id in the Dockerfile (lines 42-43). You can get your user id by running `id -u`.
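Both values can be printed with the standard `id` utility:

```bash
id -un   # user name to copy into the Dockerfile
id -u    # numeric user id to copy into the Dockerfile
```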
```bash
docker build -t semnav:latest -f docker/Dockerfile .
```

This builds the Docker image with the entrypoint prepared to run the training script. You can modify the entrypoint to run other scripts, for example the evaluation script, but you will need to rebuild the image.
Run the container with:

```bash
# Notes on the options:
#   the first -v mounts the data folder
#   the second -v mounts the code folder, so you can modify the code locally and still deploy it via Docker
#   NVIDIA_VISIBLE_DEVICES selects specific GPUs on multi-GPU systems
#   WANDB_API_KEY is only needed if you use wandb; otherwise omit it
docker run \
  -v /home/your_username/local_path_to_your_data/:/home/your_username/code/data \
  -v /home/your_username/local_path_to_your_code/semnav:/home/your_username/code \
  --env NVIDIA_VISIBLE_DEVICES=5,6 \
  --env WANDB_API_KEY=your_api_key \
  --name semnav_container \
  --runtime=nvidia \
  semnav
```

To open a shell inside the running container:

```bash
docker exec -it semnav_container /bin/bash
```

To stop the container:

```bash
docker stop semnav_container
```

The Dockerfile sets up the complete environment, including:
- CUDA and cuDNN for GPU support
- Conda for environment management
- Habitat-Sim and Habitat-Lab for simulation tasks
- Essential Python libraries: PyTorch, torchvision, torchaudio
Ensure the entry script `entrypoint.sh` is executable.
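If it is not, the usual fix is a `chmod`; the `docker/` location below is an assumption based on where the Dockerfile lives, so adjust the path if the script is stored elsewhere:

```bash
# Path is assumed; point it at wherever entrypoint.sh actually lives in the repo
chmod +x docker/entrypoint.sh
```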
We provide checkpoints for models trained on the SemNav 40 dataset in three configurations:
- RGBS (IL): Trained using imitation learning (IL) with RGB and semantic segmentation inputs.
- RGBS (IL+RL): Trained using a combination of imitation learning (IL) and reinforcement learning (RL) with RGB and semantic segmentation inputs.
- OS: Trained using only semantic segmentation inputs.
| Configuration | Description | Download Link |
|---|---|---|
| RGBS (IL) | RGB + Semantic Segmentation (IL) | Download |
| RGBS (IL+RL) | RGB + Semantic Segmentation (IL + RL) | Download |
| OS | Only Semantic Segmentation | Download |
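The evaluation instructions below refer to a `pretrained_ckpt` folder, so one convenient (but not mandatory) way to organize the downloaded checkpoints is to collect them there; the file names below are placeholders:

```bash
# Placeholder file names -- keep whatever names the downloaded checkpoints actually have
mkdir -p pretrained_ckpt
mv ~/Downloads/semnav40_rgbs_il.pth ~/Downloads/semnav40_rgbs_il_rl.pth ~/Downloads/semnav40_os.pth pretrained_ckpt/
```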
To train a model from scratch, run (with the conda environment activated):
```bash
bash scripts/launch_training.sh
```

The training dataset is available in the PirlNav repository.
Modify the training configuration in:
`configs/experiments/il_objectnav.yaml`

There you can choose between three policies (a quick way to locate the relevant entry is sketched after this list):

- `SEMANTIC_ObjectNavILMAEPolicy`: Uses only semantic segmentation.
- `SEMANTIC_RGB_ObjectNavILMAEPolicy`: Uses both semantic segmentation and RGB.
- `RGB_ObjectNavILMAEPolicy`: Uses only RGB.
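The exact YAML key that holds the policy name depends on the configuration layout, so rather than guessing it here, a quick way to locate (and then edit) the active policy is:

```bash
# Find where the policy name is set, then change it to one of the three policies listed above
grep -n "ObjectNavILMAEPolicy" configs/experiments/il_objectnav.yaml
```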
Pretrained visual encoder weights can be downloaded from the PirlNav repository. The main training hyperparameters we use are summarized below:
| Parameter | Value |
|---|---|
| Number of GPUs | 8 |
| Number of environments per GPU | 16 |
| Rollout length | 64 |
| Number of mini-batches per epoch | 2 |
| Optimizer | Adam |
| Learning rate scheduler | Cyclic LR (exp_range) |
| Base learning rate | 1×10⁻⁵ |
| Maximum learning rate | 1×10⁻³ |
| Step size up | 2000 |
| Exponential decay factor (γ) | 0.99994 |
| DDPIL sync fraction | 0.6 |
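Put together, these defaults mean that each policy update collects roughly 8 GPUs × 16 environments × 64 steps = 8,192 environment frames, which are then split into 2 mini-batches per epoch.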
With the conda environment activated, run the evaluation with:

```bash
bash scripts/launch_eval.sh
```

To evaluate pretrained models, select a checkpoint from `pretrained_ckpt`.
For further information, refer to our paper or visit our Group Page.

