BLIP3o-NEXT

AR + Diffusion Architecture: Similar to BLIP3o, BLIP3o-NEXT generates intermediate features with the autoregressive model and then conditions the diffusion model on these features to generate images.

Discrete Image Token Supervision: We add discrete SigLIP-2 image token prediction as extra training supervision, jointly optimizing a cross-entropy loss and the diffusion objective. By having the AR model lay down a discrete "blueprint" and feeding its hidden representations into the diffusion model, we combine structural accuracy with high visual fidelity in the generated images.
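The joint objective described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the repository's training code: the function names, the `diffusion_weight` parameter, and the use of a mean-squared error as a stand-in for the diffusion objective are all assumptions made for the example.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy for one discrete image token: -log softmax(logits)[target]."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def mse(pred, target):
    """Mean-squared error, used here as a stand-in for the diffusion objective (assumption)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def joint_loss(token_logits, token_targets, diff_pred, diff_target, diffusion_weight=1.0):
    """Average token cross-entropy plus a weighted diffusion term.

    diffusion_weight is a hypothetical knob; the README does not state how
    the two terms are balanced.
    """
    ce = sum(cross_entropy(l, t) for l, t in zip(token_logits, token_targets))
    ce /= len(token_targets)
    return ce + diffusion_weight * mse(diff_pred, diff_target)

# Toy usage: two image tokens over a 3-entry codebook, plus a 2-dim diffusion target.
loss = joint_loss(
    token_logits=[[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
    token_targets=[1, 0],
    diff_pred=[0.5, 0.5],
    diff_target=[0.0, 1.0],
)
```

In this sketch both terms are computed from the same forward pass, so a single backward pass would update the AR model through both the token head and the features handed to the diffusion model.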

RL with verifiable rewards: Introducing discrete image tokens makes the model compatible with existing language-model RL frameworks. Using Group Relative Policy Optimization (GRPO), we train BLIP3o-NEXT to improve prompt alignment and text rendering in image generation.
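The core idea of GRPO is to score each sample relative to the other samples drawn for the same prompt, instead of training a separate value model. The sketch below shows only that normalization step; the function name, the `eps` constant, and the choice of population standard deviation are assumptions for illustration and may differ from the training code in this repository.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of samples generated for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    Using the population std here is an assumption of this sketch.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Toy usage: four images sampled for one prompt, scored 1.0 if a verifier
# (e.g. an object-count or text-rendering check) passes and 0.0 otherwise.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Samples that beat their group average receive a positive advantage and are reinforced; samples below it are pushed down. If every sample in a group receives the same reward, all advantages are zero and that group contributes no learning signal.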

Fully Open-Source:

Pretraining Data: 27 Million Detailed Captions, 5 Million Short Captions
Instruction Tuning Data: BLIP3o-60k, ShareGPT-4o-Image
Model Weights (3B): Pretrain, Instruction Tuning, GRPO-Geneval, GRPO-Text
Training Code: Pretrain, Instruction Tuning, GRPO

🔥 Feel free to reach out if you have any questions. Discord: https://discord.gg/SsVYdV84bw or Wechat

Install the package for pretraining and instruction tuning:

conda create -n blip3o-next python=3.11 -y
conda activate blip3o-next
pip install --upgrade pip setuptools
pip install -r requirements.txt
pip install -e .

Set your Slurm configuration and environment, then submit the job:

sbatch scripts/run.sh

For inference, change the model path in inference.py and run:

python inference.py

For GRPO, we recommend creating a new environment, because reusing the blip3o-next environment leads to torch version conflicts. You also need to install the dependencies from setup.py. Follow the steps below:

cd trl
conda create -n grpo python=3.11 -y
conda activate grpo
pip install -r requirements.txt
cd ..
pip install -e .

