🔥 MOSEv2: A More Challenging Dataset for Video Object Segmentation in Complex Scenes
If you want to test your VOS model's performance in real-world complex scenarios, MOSEv2 is the right choice.
- ⬇️ Download Dataset
- 🏠 Homepage
- 📄 MOSEv2 Paper (arXiv)
- 🏆 Evaluation Server
- 🤖 Baseline Model: SAM2RCMS
- ⬇️ Download Baseline Model
MOSEv1: A New Dataset for Video Object Segmentation in Complex Scenes
- [2025/08/07] MOSEv2 dataset has been released! 🔥🎉🚀✨🎊🌟💫🎈
- [2023/02/09] MOSEv1 dataset has been released!
- 🤗 Hugging Face
- ☁️ Baidu Pan (pwd: p2m6)
- ☁️ Google Drive
- ☁️ OneDrive
- 🤗 Hugging Face
- ☁️ OneDrive
- ☁️ Google Drive
- ☁️ Baidu Pan (pwd: MOSE)
The dataset follows a structure similar to DAVIS and YouTube-VOS. It consists of two parts: JPEGImages, which holds the frame images, and Annotations, which contains the corresponding segmentation masks. Frame images are named with zero-padded five-digit numbers (00000.jpg, 00001.jpg, ...). Annotations are saved as color-palette mode PNGs, as in DAVIS.
Please note that annotations are provided for all frames in the training set, whereas the validation set includes annotations for the first frame only.
<train/valid.tar>
│
├── Annotations
│   │
│   ├── <video_name_1>
│   │   ├── 00000.png
│   │   ├── 00001.png
│   │   └── ...
│   │
│   ├── <video_name_2>
│   │   ├── 00000.png
│   │   ├── 00001.png
│   │   └── ...
│   │
│   └── <video_name_...>
│
└── JPEGImages
    │
    ├── <video_name_1>
    │   ├── 00000.jpg
    │   ├── 00001.jpg
    │   └── ...
    │
    ├── <video_name_2>
    │   ├── 00000.jpg
    │   ├── 00001.jpg
    │   └── ...
    │
    └── <video_name_...>
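As an illustration of how this layout can be traversed, here is a minimal Python sketch that pairs each frame with its mask. It is not part of the official MOSE tooling: the function name `pair_frames` and the example path are hypothetical, and it assumes an extracted split that contains the `JPEGImages` and `Annotations` folders with matching video names and frame numbers.

```python
from pathlib import Path


def pair_frames(split_root):
    """Map each video name to a list of (frame_path, mask_path_or_None) pairs.

    `split_root` is assumed to be an extracted train or valid folder that
    contains the JPEGImages and Annotations directories.
    """
    root = Path(split_root)
    pairs = {}
    for video_dir in sorted((root / "JPEGImages").iterdir()):
        if not video_dir.is_dir():
            continue
        ann_dir = root / "Annotations" / video_dir.name
        frames = []
        for frame in sorted(video_dir.glob("*.jpg")):
            # A mask shares the five-digit stem of its frame, e.g. 00000.png.
            mask = ann_dir / (frame.stem + ".png")
            # Validation videos only provide a first-frame mask, so later
            # masks may be absent; those are reported as None.
            frames.append((frame, mask if mask.is_file() else None))
        pairs[video_dir.name] = frames
    return pairs
```

For example, `pair_frames("MOSE/train")` (a hypothetical path) would return one entry per video. On the validation split, every frame after the first would be expected to carry `None` as its mask.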
Please consider citing MOSE if it helps your research.
@article{MOSEv2,
title={{MOSEv2}: A More Challenging Dataset for Video Object Segmentation in Complex Scenes},
author={Ding, Henghui and Ying, Kaining and Liu, Chang and He, Shuting and Jiang, Xudong and Jiang, Yu-Gang and Torr, Philip HS and Bai, Song},
journal={arXiv preprint arXiv:2508.05630},
year={2025}
}
@inproceedings{MOSE,
title={{MOSE}: A New Dataset for Video Object Segmentation in Complex Scenes},
author={Ding, Henghui and Liu, Chang and He, Shuting and Jiang, Xudong and Torr, Philip HS and Bai, Song},
booktitle={ICCV},
year={2023}
}
MOSE is licensed under a CC BY-NC-SA 4.0 License. The data is released for non-commercial research purposes only.