We introduce Large Language Diffusion with Ordered Unmasking (LLaDOU), a model trained by reinforcing a new reasoning paradigm for diffusion language models, the Diffusion Chain of Lateral Thought (DCoLT).
Compared to standard CoT, DCoLT is distinguished by several notable features:
- Bidirectional Reasoning: Allowing global refinement throughout the generation process with bidirectional self-attention masks.
- Format-Free Reasoning: No strict rules on grammatical correctness in its intermediate steps of thought.
- Nonlinear Generation: Generating tokens at arbitrary positions across different steps (illustrated in the sketch below).
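To make these properties concrete, here is a minimal, self-contained sketch of one possible ordered-unmasking loop: the response starts fully masked and, at each step, tokens are revealed at whichever positions a toy scorer is most confident about. The constants, function names, and the confidence-based selection rule are illustrative assumptions for this sketch only; LLaDOU learns and reinforces its own unmasking policy rather than following a fixed heuristic like this.

import torch

MASK_ID = 0          # toy mask token id (assumption; real models define their own)
VOCAB_SIZE = 16      # toy vocabulary size
SEQ_LEN = 12         # length of the response being denoised
NUM_STEPS = 4        # number of unmasking steps

def toy_logits(tokens: torch.Tensor) -> torch.Tensor:
    """Stand-in for a diffusion LM forward pass with bidirectional attention.
    Returns per-position logits over the vocabulary."""
    torch.manual_seed(int(tokens.sum()))  # deterministic toy behavior
    return torch.randn(tokens.shape[0], VOCAB_SIZE)

def ordered_unmasking_sample() -> torch.Tensor:
    # Start from a fully masked response.
    tokens = torch.full((SEQ_LEN,), MASK_ID)
    masked = torch.ones(SEQ_LEN, dtype=torch.bool)
    per_step = SEQ_LEN // NUM_STEPS
    for _ in range(NUM_STEPS):
        logits = toy_logits(tokens)
        probs = logits.softmax(dim=-1)
        conf, pred = probs.max(dim=-1)   # most likely token and its confidence per position
        conf[~masked] = -1.0             # never re-select already revealed positions
        # Nonlinear generation: reveal the most confident positions, wherever they are.
        chosen = conf.topk(per_step).indices
        tokens[chosen] = pred[chosen]
        masked[chosen] = False
    return tokens

print(ordered_unmasking_sample())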
- [Sep 2025] LLaDOU has been accepted by NeurIPS 2025. Congrats!
- [July 2025] Training code is provided!
- [May 2025] Released the LLaDOU v0 Math and LLaDOU v0 Code models, their evaluation code, and the technical report.
import torch
from transformers import AutoTokenizer
from networks.lladou_v0 import LLaDOUModelLM, sample
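# Load the tokenizer and the released LLaDOU-v0-Math checkpoint (bfloat16, placed on GPU).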
tokenizer = AutoTokenizer.from_pretrained("models/LLaDOU-v0-Math")
model = LLaDOUModelLM.from_pretrained(
pretrained_model_name_or_path="models/LLaDOU-v0-Math",
trust_remote_code=True,
torch_dtype=torch.bfloat16,
device_map="cuda",
)
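# Pose a question; sample() runs the step-by-step unmasking and returns the decoded responses.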
problem = "What is the answer of 1+1?"
outputs = sample(
model,
problem,
tokenizer,
device="cuda",
)
response = outputs["responses"][0]
print(response)

We provide an example of training LLaDOU on the GSM8K dataset; feel free to change the configuration file!
accelerate launch --num_processes 8 --config_file configs/accelerate/fsdp.yaml train.py --config configs/gsm8k_64step_example.yaml

Prepare the datasets as follows:
├── datasets
│ ├── gsm8k
│ │ └── ...
│ ├── MATH
│ │ └── ...
│ ├── mbpp.jsonl
│ ├── mbpp_test.jsonl
│ └── HumanEval.jsonl.gz
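For example, the GSM8K data can be fetched from the Hugging Face Hub and written under datasets/gsm8k. This is only a sketch: the dataset identifier and the JSONL export below are assumptions, and the on-disk layout actually expected by the training and evaluation configs may differ.

from datasets import load_dataset  # pip install datasets

# Download the official GSM8K splits and export them as JSONL files.
# Adjust the file names/format to match what your configuration file points at.
gsm8k = load_dataset("openai/gsm8k", "main")
gsm8k["train"].to_json("datasets/gsm8k/train.jsonl")
gsm8k["test"].to_json("datasets/gsm8k/test.jsonl")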
- For GSM8K and MATH evaluation, please run scripts/eval_math.sh.
- For MBPP and HumanEval evaluation, please run scripts/eval_code.sh.
If this repository helps with your work, please consider giving it a star and citing our paper:
@inproceedings{huang2025reinforcing,
title={Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models},
author={Zemin Huang and Zhiyang Chen and Zijun Wang and Tiancheng Li and Guo-Jun Qi},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025}
}

