SRDiffusion: Accelerate Video Diffusion Inference via Sketching-Rendering Cooperation

Shenggan Cheng¹, Yuanxin Wei², Lansong Diao³, Yong Liu¹, Bujiao Chen³, Lianghua Huang³,
Yu Liu³, Wenyuan Yu³, Jiangsu Du², Wei Lin³†, Yang You¹†

¹National University of Singapore, ²Sun Yat-sen University, ³Alibaba Group

(† Corresponding authors.)

Introduction

SRDiffusion is a video diffusion inference framework that reduces computational cost through Sketching-Rendering Cooperation between a large model and a small model:

The large model handles high-noise steps, preserving semantic and motion fidelity (Sketching).
The small model processes low-noise steps to refine visual details (Rendering).

SRDiffusion achieves up to 3× speedup on Wan and 2× speedup on CogVideoX with minimal quality degradation, and it is complementary to other optimization techniques.
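The division of labor between the large and small models can be illustrated with a minimal sketch. This is not the repository's implementation: the function names, the fixed `switch_below` hand-off point, the toy denoisers, and the `small_cost_ratio` value are all assumptions made for illustration. How SRDiffusion actually picks the switch point is not covered here.

```python
from typing import Callable, List, Sequence, Tuple

Latent = List[float]
Denoiser = Callable[[Latent, float], Latent]


def cooperative_sample(
    latent: Latent,
    noise_levels: Sequence[float],
    large_model: Denoiser,
    small_model: Denoiser,
    switch_below: float,
) -> Tuple[Latent, List[str]]:
    """Denoise with the large model while noise is high, then the small one.

    `switch_below` is a hypothetical fixed hand-off point: steps whose noise
    level is >= switch_below go to the large model (Sketching), the rest go
    to the small model (Rendering).
    """
    schedule: List[str] = []
    for level in noise_levels:
        if level >= switch_below:
            latent = large_model(latent, level)
            schedule.append("large")
        else:
            latent = small_model(latent, level)
            schedule.append("small")
    return latent, schedule


def estimated_speedup(schedule: Sequence[str], small_cost_ratio: float) -> float:
    """Back-of-envelope speedup over running every step on the large model.

    `small_cost_ratio` is the assumed per-step cost of the small model
    relative to the large one (an illustrative input, not a measured value).
    """
    n_large = sum(1 for name in schedule if name == "large")
    n_small = len(schedule) - n_large
    return len(schedule) / (n_large + n_small * small_cost_ratio)


# Toy stand-ins for real denoisers: each simply shrinks the latent.
def toy_large(latent: Latent, level: float) -> Latent:
    return [x * 0.5 for x in latent]


def toy_small(latent: Latent, level: float) -> Latent:
    return [x * 0.9 for x in latent]


if __name__ == "__main__":
    levels = [1.0, 0.8, 0.6, 0.4, 0.2]  # high noise first, low noise last
    out, schedule = cooperative_sample(
        [1.0, -2.0], levels, toy_large, toy_small, switch_below=0.5
    )
    print(schedule)
    print(estimated_speedup(schedule, small_cost_ratio=0.1))
```

In this sketch the estimated speedup grows as more steps are handed to the small model and as the small model gets cheaper relative to the large one. The earlier the hand-off, the more the large model's semantic and motion guidance is traded for speed.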

SRDiffusion for Wan 2.1

Follow the Wan2.1 setup guide to set up the environment and download the 14B and 1.3B models.

Run the following command to generate a video:

python ./wan/srd_generate.py --task t2v-14B --size 832*480 \
    --ckpt_dir ./Wan2.1-T2V-14B --rendering_ckpt_dir ./Wan2.1-T2V-1.3B \
    --offload_model True --t5_cpu \
    --self_diff_threshold 0.01 \
    --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."

SRDiffusion for CogVideoX

Follow the CogVideo setup guide to prepare the environment.

Run the following command to generate a video:

python cli_demo_srd.py --seed 42 --num_frames 49 --fps 8 \
    --self_diff_threshold 0.01 \
    --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."

Acknowledgement

This repository is built upon Wan2.1 and CogVideoX. We sincerely thank the contributors of these projects for their excellent work.

Contributing

For contribution guidelines, please refer to CONTRIBUTING.md.

Contributors

SRDiffusion is developed by Alibaba Group and NUS HPC-AI Lab. This work is supported by Alibaba Innovative Research (AIR).

License

SRDiffusion is licensed under the Apache License (Version 2.0). See the LICENSE file for more details. This project also includes third-party test cases released under other open-source licenses. Please refer to the NOTICE file for more information.

Citation

If you find SRDiffusion useful, please consider starring the project ⭐ and citing it using the following BibTeX entry:

@misc{cheng2025srdiffusion,
      title={SRDiffusion: Accelerate Video Diffusion Inference via Sketching-Rendering Cooperation}, 
      author={Shenggan Cheng and Yuanxin Wei and Lansong Diao and Yong Liu and Bujiao Chen and Lianghua Huang and Yu Liu and Wenyuan Yu and Jiangsu Du and Wei Lin and Yang You},
      year={2025},
      eprint={2505.19151},
      archivePrefix={arXiv},
      primaryClass={cs.GR},
      url={https://arxiv.org/abs/2505.19151}, 
}
