INVITATION: A Framework for Enhancing UAV Image Semantic Segmentation Accuracy through Depth Information Fusion
English | 简体中文
Official implementation of INVITATION, a novel framework for UAV image semantic segmentation through depth fusion, published in IEEE GRSL.
INVITATION takes only original UAV imagery as input, yet it obtains complementary depth information and fuses it into RGB semantic segmentation models effectively, thereby improving UAV semantic segmentation accuracy. Concretely, the framework supports two distinct depth generation approaches:
- Multi-View Stereo (MVS): High-precision depth reconstruction from UAV video sequences or multi-view UAV images
- Monocular Depth Estimation: Depth prediction from individual images using pretrained models (see the sketch below)
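For the monocular route, a minimal sketch of how a depth map could be produced for a single UAV image is shown below, using ZoeDepth through `torch.hub`; the model variant (`ZoeD_N`), the file paths, and the 16-bit PNG export are our own assumptions for illustration and are not part of this repository.

```python
import cv2
import numpy as np
import torch
from PIL import Image

# Hedged sketch: predict a depth map for one UAV frame with a pretrained monocular model.
# Assumes the ZoeDepth torch.hub entry point is reachable and its dependencies (e.g. timm) are installed.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.hub.load("isl-org/ZoeDepth", "ZoeD_N", pretrained=True).to(device).eval()

rgb = Image.open("data/uavid/RGB/example.png").convert("RGB")   # illustrative path
depth = model.infer_pil(rgb)                                    # HxW numpy array of depth values

# Normalize and store as a 16-bit PNG so it can sit next to the RGB image under Depth/
depth16 = ((depth - depth.min()) / (depth.max() - depth.min() + 1e-8) * 65535).astype(np.uint16)
cv2.imwrite("data/uavid/Depth/example.png", depth16)
```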
Key Results on UAVid Dataset:
Method | mIoU (%) | Improvement |
---|---|---|
Baseline (RGB) | 66.02 | - |
+ MVS Depth | 70.57 | ↑ 4.55 |
+ Monocular Depth | 69.69 | ↑ 3.67 |
🧩 *Figure 1: Architecture of INVITATION Framework.*
🧩 *Figure 2: Comparison of semantic segmentation results on UAVid dataset.*
- Python 3.7+
- pytorch
- gdal
- numpy
- opencv ...
- Clone the repository:
```bash
git clone https://github.com/CVEO/INVITATION.git
cd INVITATION
```
- Dataset Preparation:
Download the UAVid Dataset and organize it as follows:
```
/data/
└── uavid/
    ├── Depth/   # Depth maps (MVS or monocular depth estimation)
    ├── RGB/     # Original UAV images
    └── Label/   # Segmentation labels
```
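As an illustration of how this layout is typically consumed, the hedged sketch below pairs RGB, depth, and label files by file name; the real loader is `dataloader/UAVDataset.py`, and its behavior (naming, label encoding, normalization, augmentation) may well differ.

```python
import os
import cv2
import numpy as np
import torch
from torch.utils.data import Dataset

class RGBDSegDatasetSketch(Dataset):
    """Hypothetical loader: assumes RGB/, Depth/ and Label/ share identical file names."""

    def __init__(self, root="/data/uavid"):
        self.root = root
        self.names = sorted(os.listdir(os.path.join(root, "RGB")))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        rgb = cv2.cvtColor(cv2.imread(os.path.join(self.root, "RGB", name)), cv2.COLOR_BGR2RGB)
        depth = cv2.imread(os.path.join(self.root, "Depth", name), cv2.IMREAD_UNCHANGED)
        # Assumes labels are stored as single-channel class-index maps;
        # UAVid's color-coded labels would first need converting to indices.
        label = cv2.imread(os.path.join(self.root, "Label", name), cv2.IMREAD_GRAYSCALE)

        rgb = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0      # 3xHxW in [0, 1]
        depth = torch.from_numpy(depth.astype(np.float32)).unsqueeze(0)   # 1xHxW raw depth
        depth = depth / (depth.max() + 1e-8)                              # simple per-image scaling
        label = torch.from_numpy(label.astype(np.int64))                  # HxW class indices
        return rgb, depth, label
```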
- Training:
First, configure the training settings in `configs.py`.
Then, start training with `python train.py`.
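For orientation only, the kind of fields such a config typically exposes might look like the hypothetical excerpt below; every name and value here is an assumption, and the authoritative options are the ones actually defined in `configs.py`.

```python
# Hypothetical configs.py-style settings; names and values are illustrative only.
class Config:
    data_root = "/data/uavid"     # folder containing RGB/, Depth/ and Label/
    depth_source = "mvs"          # "mvs" or "monocular", depending on how Depth/ was produced
    num_classes = 8               # UAVid defines 8 semantic classes
    crop_size = (1024, 1024)      # training crop (height, width)
    batch_size = 4
    lr = 6e-5
    epochs = 200
    output_dir = "outputs/"       # logs and checkpoints

config = Config()
```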
```
INVITATION
├── configs.py              # Training configurations
├── dataloader/             # UAVid data loader and data augmentation
│   ├── dataloader.py
│   └── UAVDataset.py
├── models/                 # Model architectures
│   ├── attention.py
│   ├── encoder_decoder.py
│   └── builder.py
├── utils/                  # Utility scripts
│   ├── loss.py
│   ├── visualize.py
│   └── ...
├── outputs/                # Training logs & checkpoints
└── README.md               # Documentation
```
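To give a flavor of the cross-modal feature enhancement that a module such as `models/attention.py` is responsible for, the sketch below shows one generic way to re-weight RGB features with depth-derived channel attention; it is our own simplified illustration under stated assumptions, not the INVITATION module itself.

```python
import torch
import torch.nn as nn

class DepthGuidedChannelFusion(nn.Module):
    """Generic sketch: depth features gate the RGB channels, then the two streams are merged.
    Illustrative only; the paper's actual fusion module may differ substantially."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        attn = self.gate(depth_feat)              # (B, C, 1, 1) channel weights from depth
        enhanced = rgb_feat * attn + rgb_feat     # depth-guided re-weighting with a residual path
        return self.merge(torch.cat([enhanced, depth_feat], dim=1))

# Quick shape check with random feature maps
if __name__ == "__main__":
    fuse = DepthGuidedChannelFusion(channels=64)
    out = fuse(torch.randn(2, 64, 128, 128), torch.randn(2, 64, 128, 128))
    print(out.shape)  # torch.Size([2, 64, 128, 128])
```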
If you find our work useful, please consider citing:
```bibtex
@ARTICLE{10858079,
  author={Zhang, Xiaodong and Zhou, Wenlin and Chen, Guanzhou and Wang, Jiaqi and Yang, Qingyuan and Tan, Xiaoliang and Wang, Tong and Chen, Yifei},
  journal={IEEE Geoscience and Remote Sensing Letters},
  title={INVITATION: A Framework for Enhancing UAV Image Semantic Segmentation Accuracy through Depth Information Fusion},
  year={2025},
  volume={},
  number={},
  pages={1-1},
  keywords={Autonomous aerial vehicles;Semantic segmentation;Feature extraction;Training;Decoding;Accuracy;Depth measurement;Semantics;Data models;Vectors;Depth Information Fusion;Unmanned Aerial Vehicles (UAVs);Semantic Segmentation;Cross-modal Feature Enhancement;Vision Transformers (ViTs)},
  doi={10.1109/LGRS.2025.3534994}}
```
This project is released under the Non-Commercial Academic License. For commercial use, please contact the authors.
- UAVid Dataset: https://uavid.nl/
- MVS Implementation: COLMAP (https://colmap.github.io/)
- Monocular Depth Estimation:
  - Monodepth2 (https://github.com/nianticlabs/monodepth2)
  - ZoeDepth (https://github.com/isl-org/ZoeDepth)
  - Depth Anything (https://github.com/LiheYoung/Depth-Anything)
- Base Segmentation Code: https://github.com/huaaaliu/RGBX_Semantic_Segmentation