We propose PixNerd, a powerful and efficient pixel-space diffusion transformer for image generation that operates without a VAE. Unlike conventional pixel diffusion models, we employ a neural field to improve high-frequency modeling.
- We achieve 1.93 FID on ImageNet256x256 Benchmark with PixNerd-XL/16 (1600k training steps).
- We achieve 2.84 FID on ImageNet512x512 Benchmark with PixNerd-XL/16.
- We achieve 0.73 overall score on GenEval Benchmark with PixNerd-XXL/16.
- We achieve 80.9 average score on DPG Benchmark with PixNerd-XXL/16.
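To illustrate the neural-field idea behind the decoder, here is a hypothetical NumPy sketch (not the paper's implementation): a small MLP maps per-pixel coordinates within a patch to RGB values. All function and variable names below are illustrative assumptions.

```python
import numpy as np

def neural_field_decode(coords, w_hidden, b_hidden, w_out, b_out):
    """Toy coordinate-based decoder: map (x, y) coordinates to RGB values.
    A hypothetical sketch of a neural field; in PixNerd the per-patch MLP
    is conditioned on the diffusion transformer's features."""
    h = np.tanh(coords @ w_hidden + b_hidden)  # hidden features per pixel
    return h @ w_out + b_out                   # per-pixel RGB prediction

# a 16x16 patch of normalized pixel coordinates in [-1, 1]
ys, xs = np.meshgrid(np.linspace(-1, 1, 16), np.linspace(-1, 1, 16), indexing="ij")
coords = np.stack([xs, ys], axis=-1).reshape(-1, 2)  # (256, 2)

rng = np.random.default_rng(0)
rgb = neural_field_decode(coords,
                          rng.normal(size=(2, 64)), np.zeros(64),
                          rng.normal(size=(64, 3)), np.zeros(3))
print(rgb.shape)  # (256, 3)
```

Because the decoder is queried per coordinate, the same MLP can render a patch at any resolution, which is one motivation for coordinate-based pixel modeling.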
| Dataset | Model | Params | FID | HuggingFace |
|---|---|---|---|---|
| ImageNet256 | PixNerd-XL/16 | 700M | 1.93 | 🤗 |
| ImageNet512 | PixNerd-XL/16 | 700M | 2.84 | 🤗 |
| Dataset | Model | Params | GenEval | DPG | HuggingFace |
|---|---|---|---|---|---|
| Text-to-Image | PixNerd-XXL/16 | 1.2B | 0.73 | 80.9 | 🤗 |
We provide online demos for PixNerd-XXL/16 (text-to-image) on HuggingFace Spaces.
We strongly recommend deploying the demo locally to try it out: inference with the hosted model is slower, and for that reason arbitrary-resolution generation and animation are disabled in the online demo.
HF spaces: https://huggingface.co/spaces/MCG-NJU/PixNerd
To host the local gradio demo, run the following command:

```bash
# for text-to-image applications
python app.py --config configs_t2i/inference_heavydecoder.yaml --ckpt_path=XXX.ckpt
```

For C2I (ImageNet), we use the ADM evaluation suite to report FID.

```bash
# for installation
pip install -r requirements.txt
# for inference
python main.py predict -c configs_c2i/pix256std1_repa_pixnerd_xl.yaml --ckpt_path=XXX.ckpt
# or specify the GPU(s) to use:
CUDA_VISIBLE_DEVICES=0,1 python main.py predict -c configs_c2i/pix256std1_repa_pixnerd_xl.yaml --ckpt_path=XXX.ckpt
# for training
python main.py fit -c configs_c2i/pix256std1_repa_pixnerd_xl.yaml
```

For T2I, we use GenEval and DPG to collect metrics.
```bibtex
@article{2507.23268,
  Author = {Shuai Wang and Ziteng Gao and Chenhui Zhu and Weilin Huang and Limin Wang},
  Title = {PixNerd: Pixel Neural Field Diffusion},
  Year = {2025},
  Eprint = {arXiv:2507.23268},
}
```
