SemNav: A Semantic Segmentation-Driven Approach to Visual Semantic Navigation

Overview

SemNav is a visual semantic navigation model ready to be deployed on any robot. It performs object goal navigation successfully using primarily semantic segmentation information.

In this repository we release the SemNav dataset, code, and trained models detailed in our [paper].

If you use any content of this repo for your work, please cite the following bib entry:

@article{semnav,
  author  = {Flor-Rodr{\'i}guez, Rafael and Guti{\'e}rrez-{\'A}lvarez, Carlos and Acevedo-Rodr{\'i}guez, Francisco~J. and Lafuente-Arroyo, Sergio and L{\'o}pez-Sastre, Roberto~J.},
  title   = {SEMNAV: A Semantic Segmentation-Driven Approach to Visual Semantic Navigation},
  journal = {arXiv preprint arXiv:2506.01418},
  year    = {2025},
  month   = {June},
  day     = {02},
  doi     = {10.48550/arXiv.2506.01418},
  url     = {https://doi.org/10.48550/arXiv.2506.01418}
}

Install locally

To run our code you need a machine running Ubuntu, where all the dependencies can be installed. We have tested our code on Ubuntu 20.04, 22.04 and 24.04. The easiest way is to install Miniconda (if you don't already have it); you can download it from here.
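
If you don't already have Miniconda, a standard install on Ubuntu looks like this (commands from the official Miniconda instructions; adjust the install prefix if you like):

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p "$HOME/miniconda3"
"$HOME/miniconda3/bin/conda" init bash   # then restart your shell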

Once you have installed Miniconda, you can set up the environment by running the script we prepared:

bash scripts/setup_environment.sh

If you want to install the dependencies manually, you can follow the instructions below.

Manual installation

Clone the repository and set up the environment:

git clone https://github.com/gramuah/semnav.git
conda create -n semnav python=3.9 cmake=3.18.0
conda activate semnav

Install Habitat-Sim

git clone --depth 1 --branch v0.2.2 https://github.com/facebookresearch/habitat-sim.git
cd habitat-sim/
pip install -r requirements.txt
python setup.py install --headless
cd ..
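
As a quick sanity check (with the semnav environment active), make sure the simulator imports cleanly:

python -c "import habitat_sim; print('habitat-sim OK')"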

Install torch

pip3 install torch torchvision torchaudio
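
You can verify that PyTorch sees your GPU before moving on:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"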

Install Habitat-Lab

pip install gym==0.22.0 urllib3==1.25.11 numpy==1.25.0 pillow==9.2.0
git clone https://github.com/carlosgual/habitat-lab.git
cd habitat-lab/
python setup.py develop --all
cd ..
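
Check that habitat imports correctly:

python -c "import habitat; print('habitat-lab OK')"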

Install other dependencies

pip install wandb
conda install protobuf
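
If you plan to log training runs to Weights & Biases, you can authenticate once now (the API key is in your wandb account settings):

wandb login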

Install semnav

pip install -e .
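
To confirm the editable install worked (the distribution name semnav is assumed from the repository name):

pip show semnav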

Data setup

We provide two datasets, SemNav 40 and SemNav 1630, for leveraging semantic segmentation information:

  • SemNav 1630: Built using human-annotated semantic labels from HM3D Semantics.
  • SemNav 40: Derived by mapping these annotations to the 40 categories of NYUv2.
Dataset       Download Link
SemNav 40     Download
SemNav 1630   Download

Additionally, download the ObjectNav HM3D episode dataset from this link.
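
Where the data should live depends on the paths in your experiment configs. A common Habitat-style layout (an assumption, not something this repo prescribes) is:

mkdir -p data/datasets/objectnav data/scene_datasets
# unpack the SemNav episode datasets under data/datasets/objectnav/ and the
# HM3D scenes under data/scene_datasets/hm3d/, then adjust the configs if needed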


Docker Setup

If you want to run the code in a Docker container (for example, to run it on a compute server as we do), follow the instructions below. You will need a Docker installation with GPU support. We also use rootless containers, which means the container shares the same user as the host; because of this, you first need to put your user name and user id in the Dockerfile (lines 42-43). You can look both up as shown below.
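
On the host, these commands print the values to copy into the Dockerfile:

whoami   # your user name
id -u    # your user id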

Build the Docker Image

docker build -t semnav:latest -f docker/Dockerfile .

This builds the Docker image with the entrypoint prepared to run the training script. You can modify the entrypoint to run other scripts (for example, the evaluation script), but you will then need to rebuild the image.

Run the Docker Container

# Mount the data folder and the code folder (mounting the code lets you modify
# it locally and still deploy it via Docker). NVIDIA_VISIBLE_DEVICES selects
# specific GPUs on multi-GPU systems; WANDB_API_KEY is only needed if you use wandb.
docker run \
  -v /home/your_username/local_path_to_your_data/:/home/your_username/code/data \
  -v /home/your_username/local_path_to_your_code/semnav:/home/your_username/code \
  --env NVIDIA_VISIBLE_DEVICES=5,6 \
  --env WANDB_API_KEY=your_api_key \
  --name semnav_container \
  --runtime=nvidia \
  semnav

Access the Running Container

docker exec -it semnav_container /bin/bash

Stop the Container

docker stop semnav_container

The Dockerfile sets up the complete environment, including:

  • CUDA and cuDNN for GPU support
  • Conda for environment management
  • Habitat-Sim and Habitat-Lab for simulation tasks
  • Essential Python libraries: PyTorch, torchvision, torchaudio

Ensure the entrypoint script entrypoint.sh is executable.
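
If it is not already executable (assuming it lives next to the Dockerfile in docker/):

chmod +x docker/entrypoint.sh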


Pretrained Checkpoints

We provide checkpoints for the SemNav 40 dataset in three trained configurations:

  • RGBS (IL): Trained using imitation learning (IL) with RGB and semantic segmentation inputs.
  • RGBS (IL+RL): Trained using a combination of imitation learning (IL) and reinforcement learning (RL) with RGB and semantic segmentation inputs.
  • OS: Trained using only semantic segmentation inputs.
Configuration   Description                             Download Link
RGBS (IL)       RGB + Semantic Segmentation (IL)        Download
RGBS (IL+RL)    RGB + Semantic Segmentation (IL + RL)   Download
OS              Only Semantic Segmentation              Download

Training

To train a model from scratch, run (with the conda environment activated):

bash scripts/launch_training.sh

The training dataset is available in the PirlNav repository.

Modify the training configuration in:

configs/experiments/il_objectnav.yaml

Policy Options

  • SEMANTIC_ObjectNavILMAEPolicy: Uses only semantic segmentation.
  • SEMANTIC_RGB_ObjectNavILMAEPolicy: Uses both semantic segmentation and RGB.
  • RGB_ObjectNavILMAEPolicy: Uses only RGB.
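
To check which policy your experiment config currently selects (the exact key inside the YAML is not documented here, so inspect the file directly):

grep -n "Policy" configs/experiments/il_objectnav.yaml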

Pretrained visual encoder weights can be downloaded from the PirlNav repository.

Our Imitation Learning training parameters

Parameter                          Value
Number of GPUs                     8
Number of environments per GPU     16
Rollout length                     64
Number of mini-batches per epoch   2
Optimizer                          Adam
Learning rate scheduler            Cyclic LR (exp_range)
Base learning rate                 1×10⁻⁵
Maximum learning rate              1×10⁻³
Step size up                       2000
Exponential decay factor (γ)       0.99994
DDPIL sync fraction                0.6

Evaluation

Run the evaluation (with the conda environment activated):

bash scripts/launch_eval.sh

To evaluate pretrained models, select a checkpoint from pretrained_ckpt.
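
For example (how launch_eval.sh picks up the checkpoint is an assumption; check the script for the actual variable or config entry):

ls pretrained_ckpt/            # e.g. the RGBS (IL) checkpoint downloaded above
bash scripts/launch_eval.sh    # after pointing the script or config at that file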


For further information, refer to our paper or visit our Group Page.
