Official implementation of our paper: "Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing" (ICML 2025)

Dawn-LX/CausalCache-VDM

[!New] Our paper, "Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing", has been accepted by ICML 2025 (arXiv link).

Our code is built upon Open-Sora (https://github.com/hpcaitech/Open-Sora), with the following features:

  • autoregressive video generation, i.e., generating each subsequent clip conditioned on the last frames of the previous clip

  • causal generation (via causal temporal attention)

  • cache sharing: the kv-cache is shared across all denoising steps. This differs from the kv-cache implementation in live2diff

  • kv-cache queue, i.e., autoregressive generation without redundant computation for the overlapping conditional frames. The oldest kv-cache entries are dequeued

  • cyclic temporal positional embeddings (TPEs), i.e., we use a cyclic shift to support the kv-cache queue

  • the key differences between our implementation and live2diff:

    • our kv-cache is shared across all denoising steps, whereas live2diff stores a kv-cache for every denoising step
    • we use a cache queue structure to support autoregressive generation, facilitated by the cyclic TPEs
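
The queue and cyclic-position ideas above can be illustrated with a small toy sketch. This is not the repository's code: `KVCacheQueue`, `causal_mask`, `max_frames`, and `num_positions` are hypothetical names, and modulo indexing is only an assumed, simplified stand-in for the cyclic shift of the TPEs. Keys and values are placeholder strings rather than tensors.

```python
from collections import deque


class KVCacheQueue:
    """Toy fixed-length FIFO of per-frame (position, key, value) entries.

    Hypothetical illustration only; the real implementation stores tensors.
    """

    def __init__(self, max_frames, num_positions):
        # deque(maxlen=...) drops the oldest entry automatically when full
        self.entries = deque(maxlen=max_frames)
        self.num_positions = num_positions
        self.next_frame = 0  # absolute index of the next frame to be cached

    def append(self, key, value):
        # Assumed simplification: positional index wraps around cyclically
        pos = self.next_frame % self.num_positions
        self.entries.append((pos, key, value))
        self.next_frame += 1
        return pos

    def positions(self):
        return [pos for pos, _, _ in self.entries]


def causal_mask(num_frames):
    """mask[i][j] is True when frame i may attend to frame j (j <= i)."""
    return [[j <= i for j in range(num_frames)] for i in range(num_frames)]


cache = KVCacheQueue(max_frames=3, num_positions=4)
for t in range(6):
    cache.append(key=f"k{t}", value=f"v{t}")

# The cache holds only the 3 most recent frames. In a shared-cache design,
# every denoising step would read these same entries instead of recomputing
# or storing a separate copy per step.
print(cache.positions())
print(causal_mask(3))
```

In this sketch the cache is written once per generated frame and only read during denoising, which is the sense in which a single cache can serve all denoising steps.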

training script

an overfitting demo

bash scripts/train.sh \
    configs/causal_stdit/train_overfit_beach_demo.py \
    overfit_demo \
    9686 0

SkyTimelapse demo

bash scripts/train.sh \
    configs/causal_stdit/train_SkyTimelapse_demo.py \
    skytimelapse_demo \
    9686 0

Refer to scripts/train.sh to configure ROOT_DATA_DIR.

The code is still being prepared.

Citation

If you find this code helpful, please cite our paper:

@inproceedings{gao2025ca2,
  title={Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing},
  author={Gao, Kaifeng and Shi, Jiaxin and Zhang, Hanwang and Wang, Chunping and Xiao, Jun and Chen, Long},
  booktitle={ICML},
  year={2025},
  organization={PMLR}
}
