This repository contains the code for our ICLR 2025 paper *Self-Correcting Decoding with Generative Feedback for Mitigating Hallucinations in Large Vision-Language Models*.
We tested our codebase with PyTorch 2.0.1. Please install the PyTorch and CUDA versions that match your computational resources.

```shell
conda create -n DeGF python=3.10
conda activate DeGF
git clone https://github.com/zhangce01/DeGF.git
cd DeGF
pip install -r requirements.txt
```
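Before launching any experiments, it can help to sanity-check the environment. The snippet below is a minimal sketch (not part of the repository; `check_env` is a hypothetical helper) that verifies PyTorch is importable and reports its version and CUDA status:

```python
# Hedged sketch: sanity-check the environment before running experiments.
# `check_env` is a hypothetical helper, not part of the DeGF codebase.
import importlib.util


def check_env(module="torch"):
    """Return True if `module` can be imported in the current environment."""
    return importlib.util.find_spec(module) is not None


if check_env("torch"):
    import torch

    # Report the installed version (we tested with 2.0.1) and CUDA status.
    print("PyTorch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
else:
    print("PyTorch not found -- run `pip install -r requirements.txt` first.")
```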
Please also download the model checkpoints:
- LLaVA-1.5: Download LLaVA-1.5 merged 7B
- InstructBLIP: Download InstructBLIP
For the datasets and benchmarks: we provide code for evaluating DeGF on the POPE, CHAIR, and MME-Hallucination benchmarks. You can run the experiments with the following scripts:
- POPE: `bash eval_bench/scripts/pope_eval.sh`
- CHAIR: `bash eval_bench/scripts/chair_eval.sh`
- MME: `bash experiments/cd_scripts/mme_eval.sh`
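As background on what the POPE script measures: POPE poses yes/no object-existence questions, so its metrics reduce to standard binary classification with "yes" as the positive class. The sketch below is illustrative only (it is not the repository's evaluation code, and `pope_metrics` is a hypothetical name):

```python
# Illustrative sketch of POPE-style scoring (not the repo's evaluation code).
# POPE answers are binary yes/no, so accuracy/precision/recall/F1 apply
# directly, with "yes" treated as the positive class.


def pope_metrics(preds, labels):
    """Compute accuracy, precision, recall, and F1 for yes/no answers.

    preds, labels: equal-length lists of "yes"/"no" strings.
    """
    tp = sum(p == "yes" and t == "yes" for p, t in zip(preds, labels))
    fp = sum(p == "yes" and t == "no" for p, t in zip(preds, labels))
    fn = sum(p == "no" and t == "yes" for p, t in zip(preds, labels))
    tn = sum(p == "no" and t == "no" for p, t in zip(preds, labels))
    acc = (tp + tn) / len(labels)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {"accuracy": acc, "precision": prec, "recall": rec, "f1": f1}


# Toy example: one false positive out of four answers.
print(pope_metrics(["yes", "no", "yes", "no"], ["yes", "no", "no", "no"]))
# -> {'accuracy': 0.75, 'precision': 0.5, 'recall': 1.0, 'f1': 0.666...}
```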
Our codebase is adapted from RITUAL, VCD, OPERA, and LLaVA. We thank the authors for releasing their code!
If you have any questions, please contact us at [email protected].
If you find this code useful, please consider citing our work:
```bibtex
@inproceedings{zhang2025selfcorrecting,
  title={Self-Correcting Decoding with Generative Feedback for Mitigating Hallucinations in Large Vision-Language Models},
  author={Ce Zhang and Zifu Wan and Zhehan Kan and Martin Q. Ma and Simon Stepputtis and Deva Ramanan and Russ Salakhutdinov and Louis-Philippe Morency and Katia P. Sycara and Yaqi Xie},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=tTBXePRKSx}
}
```