- [2025-04-26] The training code is now available.
- [2024-06-20] Our paper has been accepted to TNNLS 2024.
- [2024-04-16] Initialized the repository and released the test code and the trained model.
- python == 3.8.15
- torch == 1.10.0
- torchvision == 0.11.0
- cuda == 11.4
- opencv == 4.6.0
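As a quick sanity check before training or testing, the installed package versions can be compared against the list above. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and the PyPI distribution name `opencv-python` is an assumption about how OpenCV was installed. Python and CUDA versions are not checked here.

```python
from importlib import metadata

# Versions taken from the requirements list above. The distribution names
# (in particular "opencv-python") are assumptions about the install method.
EXPECTED = {
    "torch": "1.10.0",
    "torchvision": "0.11.0",
    "opencv-python": "4.6.0",
}


def matches(installed, expected):
    """True if `installed` is the expected release, ignoring local tags like '+cu113'
    and extra trailing components like the '.66' in '4.6.0.66'."""
    base = installed.split("+", 1)[0]
    return base == expected or base.startswith(expected + ".")


def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, ok)} for each expected package."""
    report = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None
        report[name] = (have, have is not None and matches(have, want))
    return report


if __name__ == "__main__":
    for name, (have, ok) in check_environment().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: installed={have} expected={EXPECTED[name]} [{status}]")
```

A mismatch does not necessarily mean the code will fail; the versions above are simply the ones the authors list.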
Please download the following datasets:
UVOS datasets:
- YouTube-VOS: YouTube-VOS
- DAVIS: DAVIS
- YouTube-Objects: YouTube-Objects
- FBMS: FBMS
- LongVideos: LongVideos
VSOD datasets:
- DAVIS: same as UVOS.
- DAVSOD: DAVSOD
- SegTrack-V2: SegTrack-V2
- ViSal: ViSal
To help reproduce our results quickly, we have uploaded the processed data to Google Drive and Baidu Disk (code: qcbh).
| stage | model link |
|---|---|
| pre-train | Google Drive, Baidu Disk (code: qcbh) |
| fine-tuning | Google Drive, Baidu Disk (code: qcbh) |
To reproduce the results reported in the paper, download the corresponding models and run the test script.
Distributed Training.
sh train_m.sh
Single-GPU Training.
sh train_s.sh
Download the trained MTNet model and place it in ./saves.
python test.py --test_model [test_model] --task_name [task_name] --test_dataset [test_dataset] --output_dir [output_dir]
Testing for UVOS task:
python test.py --test_model ./saves/mtnet.pth --task_name UVOS --test_dataset DAVIS16 --output_dir output
Testing for VSOD task:
python test.py --test_model ./saves/mtnet.pth --task_name VSOD --test_dataset DAVIS16 --output_dir output
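To run both tasks in one go, the two commands above can be wrapped in a small driver. This is a sketch only: `build_test_command` and `run_all` are hypothetical helpers, not part of the repository, and they use only the flags and values shown in the commands above. By default it prints the commands without executing them.

```python
import subprocess
import sys


def build_test_command(task_name, test_dataset,
                       test_model="./saves/mtnet.pth", output_dir="output"):
    """Assemble the test.py invocation using the flags shown above."""
    return [
        sys.executable, "test.py",
        "--test_model", test_model,
        "--task_name", task_name,
        "--test_dataset", test_dataset,
        "--output_dir", output_dir,
    ]


def run_all(jobs, dry_run=True):
    """Print one test.py command per (task, dataset) pair; execute it unless dry_run."""
    for task_name, test_dataset in jobs:
        cmd = build_test_command(task_name, test_dataset)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # DAVIS16 is the only dataset identifier confirmed by the commands above;
    # identifiers for the other datasets would have to be taken from test.py.
    run_all([("UVOS", "DAVIS16"), ("VSOD", "DAVIS16")])
```

Pass `dry_run=False` to actually launch the runs from the repository root.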
Baidu Disk (code: qcbh)
Evaluation for UVOS results:
python test_scripts/test_for_davis.py --gt_path ../data/DAVIS16/val/mask --result_path output/MTNet/UVOS/DAVIS16/
Evaluation for VSOD results:
python test_scripts/test_vsod/main.py --method MTNet --dataset DAVIS16 --gt_dir test_scripts/test_vsod/gt/ --pred_dir test_scripts/test_vsod/results/
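For orientation, the sketch below illustrates two metrics commonly reported for these tasks: region similarity J (the Jaccard index, or IoU) for UVOS and mean absolute error (MAE) for VSOD. It assumes the standard definitions and is not taken from the evaluation scripts above, which may compute additional measures and handle edge cases differently. Masks are plain nested lists here to keep the example dependency-free; real code would typically use NumPy arrays.

```python
def region_similarity(pred, gt):
    """Jaccard index (IoU) between two binary masks given as 2D lists of 0/1."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_absolute_error(pred, gt):
    """MAE between a saliency map with values in [0, 1] and a binary mask."""
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    return total / count


if __name__ == "__main__":
    # One overlapping pixel out of two in the union gives J = 0.5.
    print(region_similarity([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
    # A single pixel is off by 0.5 across four pixels, so MAE = 0.125.
    print(mean_absolute_error([[0.5, 0.0], [1.0, 0.0]], [[1, 0], [1, 0]]))
```

Per-sequence scores are usually obtained by averaging the per-frame values, and dataset scores by averaging over sequences.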
Specify the dataset in visualize.py, then run:
python visualize.py
This repository owes its existence to the exceptional contributions of other projects:
- STCN: https://github.com/hkchengrex/STCN
- AOT: https://github.com/yoxu515/aot-benchmark
- HFAN: https://github.com/NUST-Machine-Intelligence-Laboratory/HFAN
- FSNet: https://github.com/GewelsJI/FSNet
- AMCNet: https://github.com/isyangshu/AMC-Net
- DAVSOD: https://github.com/DengPingFan/DAVSOD
Many thanks for their invaluable contributions.
@article{zhuge2024learning,
title={Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation},
author={Zhuge, Yunzhi and Gu, Hongyu and Zhang, Lu and Qi, Jinqing and Lu, Huchuan},
journal={IEEE Transactions on Neural Networks and Learning Systems},
year={2024},
publisher={IEEE}
}