INVITATION: A Framework for Enhancing UAV Image Semantic Segmentation Accuracy through Depth Information Fusion

English | 简体中文

Official implementation of INVITATION, a novel framework for UAV image semantic segmentation through depth fusion, published in IEEE GRSL.

📖 Introduction

INVITATION takes only original UAV imagery as input, derives complementary depth information from it, and fuses that depth into RGB semantic segmentation models, improving UAV semantic segmentation accuracy. The framework supports two depth generation approaches:

  • Multi-View Stereo (MVS): High-precision depth reconstruction from UAV video sequences or multi-view UAV images
  • Monocular Depth Estimation: Depth prediction from individual images using pretrained models
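Whichever approach produces the depth maps, they are often rescaled to a common range before being paired with RGB images. The following is a minimal sketch of per-image min-max normalization, not the preprocessing used in this repository. The function name `normalize_depth` and the convention that `0.0` marks missing depth (for example, holes in an MVS reconstruction) are assumptions made for illustration.

```python
import numpy as np


def normalize_depth(depth, invalid_value=0.0):
    """Scale valid depth values to [0, 1]; invalid pixels are set to 0.

    Hypothetical helper: `invalid_value` is an assumed marker for pixels
    without a depth estimate. Returns the normalized map and a boolean
    mask of the pixels that were considered valid.
    """
    depth = np.asarray(depth, dtype=np.float32)
    # A pixel is valid if it is finite and not the "missing" marker.
    valid = np.isfinite(depth) & (depth != invalid_value)
    out = np.zeros_like(depth)
    if not valid.any():
        return out, valid
    lo = depth[valid].min()
    hi = depth[valid].max()
    if hi > lo:
        out[valid] = (depth[valid] - lo) / (hi - lo)
    return out, valid
```

The mask is returned alongside the map because the nearest valid pixel also maps to 0; downstream code can use the mask to tell the two cases apart.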

Key Results on UAVid Dataset:

| Method | mIoU (%) | Improvement (percentage points) |
|---|---|---|
| Baseline (RGB) | 66.02 | - |
| + MVS Depth | 70.57 | ↑ 4.55 |
| + Monocular Depth | 69.69 | ↑ 3.67 |
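For readers unfamiliar with the metric, mIoU (mean Intersection over Union) averages the per-class IoU computed from a confusion matrix. The sketch below shows the standard definition; it is not the evaluation code of this repository. The function names are illustrative, and both prediction and label arrays are assumed to contain integer class ids in `[0, num_classes)`, with out-of-range label values treated as ignored pixels.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes):
    """Return a (num_classes x num_classes) matrix indexed as [target, pred].

    Assumes every prediction lies in [0, num_classes). Label values
    outside that range are skipped as "ignore" pixels.
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)
    keep = (target >= 0) & (target < num_classes)
    idx = num_classes * target[keep] + pred[keep]
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Mean of per-class IoU = TP / (TP + FP + FN), over classes that occur."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0  # skip classes absent from both prediction and label
    return float((tp[present] / union[present]).mean())
```

For example, with `pred = [[0, 1], [1, 1]]` and `target = [[0, 1], [0, 1]]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mIoU is about 0.583.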

🧩 *Figure 1: Architecture of the INVITATION framework.*

🧩 *Figure 2: Comparison of semantic segmentation results on the UAVid dataset.*

🚀 Quick Start

Prerequisites

  • Python 3.7+
  • PyTorch
  • GDAL
  • NumPy
  • OpenCV ...

Installation

  1. Clone repository:

    git clone https://github.com/CVEO/INVITATION.git
    cd INVITATION
  2. Dataset Preparation:
    Download the UAVid dataset and organize it as:

    /data/
    └── uavid/
        ├── Depth/      # Depth maps (MVS or monocular depth estimation)
        ├── RGB/        # original UAV images
        └── Label/      # Segmentation labels
    
  3. Training:
    First, configure the settings in configs.py.
    Then, start training the network by running python train.py.
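Before training, it can be useful to check that the dataset layout from step 2 is complete. The sketch below pairs files across the `RGB/`, `Depth/`, and `Label/` folders and reports images that lack a depth map or a label. It is an illustration, not part of this repository: the helper name `pair_samples` is hypothetical, and it assumes that corresponding files in the three folders share the same file name stem.

```python
from pathlib import Path


def pair_samples(root):
    """Pair RGB, depth, and label files that share a file name stem.

    Hypothetical helper. Returns (samples, missing), where `samples` is a
    list of (rgb_path, depth_path, label_path) tuples and `missing` maps
    "Depth" and "Label" to the RGB stems that have no counterpart there.
    """
    root = Path(root)
    files = {}
    for sub in ("RGB", "Depth", "Label"):
        files[sub] = {p.stem: p for p in (root / sub).iterdir() if p.is_file()}

    rgb_stems = set(files["RGB"])
    common = sorted(rgb_stems & set(files["Depth"]) & set(files["Label"]))
    samples = [(files["RGB"][s], files["Depth"][s], files["Label"][s]) for s in common]
    missing = {
        sub: sorted(rgb_stems - set(files[sub])) for sub in ("Depth", "Label")
    }
    return samples, missing


if __name__ == "__main__":
    # Path taken from the layout above; adjust it to your own setup.
    samples, missing = pair_samples("/data/uavid")
    print(f"{len(samples)} complete samples")
    for sub, stems in missing.items():
        if stems:
            print(f"{len(stems)} RGB images without a file in {sub}/")
```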

📂 Repository Structure

INVITATION 
├── configs.py              # Training configurations
├── /dataloader/            # UAVid data loader and data augmentation
│   ├── dataloader.py   
│   └── UAVDataset.py     
├── /models/                # Model architectures
│   ├── attention.py        
│   ├── encoder_decoder.py         
│   └── builder.py        
├── /utils/                 # Utility scripts
│   ├── loss.py    
│   ├── visualize.py 
│   └── ... 
├── /outputs/               # Training logs & checkpoints
└── README.md               # Documentation

📜 Citation

If you use our work, please consider citing:

@ARTICLE{10858079,
  author={Zhang, Xiaodong and Zhou, Wenlin and Chen, Guanzhou and Wang, Jiaqi and Yang, Qingyuan and Tan, Xiaoliang and Wang, Tong and Chen, Yifei},
  journal={IEEE Geoscience and Remote Sensing Letters}, 
  title={INVITATION: A Framework for Enhancing UAV Image Semantic Segmentation Accuracy through Depth Information Fusion}, 
  year={2025},
  volume={},
  number={},
  pages={1-1},
  keywords={Autonomous aerial vehicles;Semantic segmentation;Feature extraction;Training;Decoding;Accuracy;Depth measurement;Semantics;Data models;Vectors;Depth Information Fusion;Unmanned Aerial Vehicles (UAVs);Semantic Segmentation;Cross-modal Feature Enhancement;Vision Transformers (ViTs)},
  doi={10.1109/LGRS.2025.3534994}}

📄 License

This project is released under the Non-Commercial Academic License. For commercial use, please contact the authors.

🤝 Acknowledgements and Reference
