
DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness

arXiv Project Page

🌟 Overview

  • A state-of-the-art image-to-3D model like TRELLIS often fails to reconstruct 3D objects that can stand under gravity, even when prompted with images of stable objects.
  • Our method, DSO, improves the image-to-3D model via Direct Simulation Optimization, significantly increasing the likelihood that generated 3D objects can stand, both in physical simulation and in real life when 3D printed.
  • The method incurs no additional cost at test time and thus generates such objects in seconds.

📦 Installation

  1. Install the TRELLIS dependencies:
conda create -n dso python=3.10
conda activate dso
pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu124
. ./setup.sh --basic --xformers --flash-attn --diffoctreerast --spconv --mipgaussian --kaolin --nvdiffrast
pip install kaolin==0.17.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.4.0_cu124.html

If you run into issues, please consult the TRELLIS installation guide.

  2. Install the remaining dependencies:
pip install -r requirements.txt
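
After both steps, you can sanity-check the environment with a short Python snippet (a minimal sketch; the expected values follow the pinned versions in the install commands above):

# Quick sanity check of the installed environment; expected values follow
# the pinned versions above (adjust if you used a different CUDA build).
import torch

print(torch.__version__)          # expected: 2.4.0+cu124
print(torch.cuda.is_available())  # should print True on a CUDA machine
print(torch.version.cuda)         # expected: 12.4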

🤖 Pretrained Models

We provide two checkpoints: one trained with direct preference optimization (DPO) and the other with our proposed direct reward optimization (DRO).

You can download them here:

git lfs install
git clone https://huggingface.co/rayli/DSO-finetuned-TRELLIS
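
Alternatively, you can fetch the checkpoints with the huggingface_hub Python API instead of git-lfs (a minimal sketch; the repository id matches the clone command above):

# Alternative download via huggingface_hub instead of git-lfs; places the
# dpo-*.safetensors and dro-*.safetensors checkpoints under
# ./DSO-finetuned-TRELLIS, matching the paths used in the Quick Start below.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="rayli/DSO-finetuned-TRELLIS",
    local_dir="./DSO-finetuned-TRELLIS",
)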

Quick Start

You can generate physically-sound 3D assets conditioned on a single image using DSO-finetuned models with example.py:

from peft import LoraConfig, get_peft_model
from PIL import Image
from safetensors.torch import load_file

from trellis.pipelines import TrellisImageTo3DPipeline
from trellis.utils import postprocessing_utils

ckpt_path = "./DSO-finetuned-TRELLIS/dro-4000iters.safetensors"  # "./DSO-finetuned-TRELLIS/dpo-8000iters.safetensors"
image_path = "./image-to-3D-eval-stability-under-gravity/clock-eval/01.jpg"  # Or your own image

pipeline = TrellisImageTo3DPipeline.from_pretrained("JeffreyXiang/TRELLIS-image-large")
pipeline.cuda()

peft_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.0,
    target_modules=["to_q", "to_kv", "to_out", "to_qkv"]
)
pipeline.models["sparse_structure_flow_model"] = get_peft_model(pipeline.models["sparse_structure_flow_model"], peft_config)
pipeline.models["sparse_structure_flow_model"].load_state_dict(load_file(ckpt_path))

image = Image.open(image_path)
image = pipeline.preprocess_image(image)

outputs = pipeline.run(
    image,
    seed=0,
    preprocess_image=False,
    formats=["gaussian", "mesh"],
    sparse_structure_sampler_params={
        "steps": 12,
        "cfg_strength": 7.5,
    },
    slat_sampler_params={
        "steps": 12,
        "cfg_strength": 3,
    },
)[0]

glb = postprocessing_utils.to_glb(
    outputs['gaussian'][0],
    outputs['mesh'][0],
    simplify=0.95,          # Ratio of triangles to remove in the simplification process
    texture_size=1024,      # Size of the texture used for the GLB
    with_texture=True,      # Set to False to skip texturing (faster, e.g. for stability-only evaluation)
)
glb.export("./output.glb")
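
To visually inspect a result, you can also render a turntable video of the generated Gaussians (a short sketch assuming the render_utils helpers used in the upstream TRELLIS example):

# Optional: render a turntable video for a quick visual check. This assumes
# the render_utils API from the upstream TRELLIS example is unchanged.
import imageio
from trellis.utils import render_utils

video = render_utils.render_video(outputs['gaussian'][0])['color']
imageio.mimsave("./output.mp4", video, fps=30)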

📊 Evaluation

  1. Download the evaluation dataset:
git lfs install
git clone https://huggingface.co/datasets/rayli/image-to-3D-eval-stability-under-gravity
  2. Generate 3D models using the trained checkpoints:
python finetune.py --eval --config configs/eval-objaverse.yaml
python finetune.py --eval --config configs/eval-real.yaml
  3. Compute stability and geometry metrics (a simplified geometric intuition for the stability check is sketched after this list):
python evaluation/evaluate_stability.py --mesh_paths "./samples/objaverse_samples/*.glb"
python evaluation/evaluate_stability.py --mesh_paths "./samples/motorcycle_samples/*.glb"
python evaluation/evaluate_geometry.py --mesh_paths "./samples/objaverse_samples/*.glb" --ground_truth_paths "/path/to/objaverse/hf-objaverse-v1/glbs"
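
For intuition, stability under gravity can be approximated quasi-statically: rest the mesh on its lowest plane and check whether the center of mass projects inside the convex hull of the contact points. The sketch below uses trimesh and shapely, assumes a z-up mesh, and picks an arbitrary contact tolerance; it is only an illustration, not the simulation-based check in evaluate_stability.py:

# Illustrative quasi-static stability proxy, NOT evaluate_stability.py:
# a resting object is stable if its center of mass projects inside the
# convex hull of its contact points. Assumes z is the up axis.
import numpy as np
import trimesh
from shapely.geometry import MultiPoint, Point

def is_quasi_statically_stable(mesh_path, contact_tol=1e-3):
    mesh = trimesh.load(mesh_path, force="mesh")
    z_min = mesh.vertices[:, 2].min()
    # Treat vertices within contact_tol of the lowest point as contacts.
    contacts = mesh.vertices[np.abs(mesh.vertices[:, 2] - z_min) < contact_tol]
    if len(contacts) < 3:
        return False  # fewer than 3 contacts cannot span a support polygon
    support = MultiPoint(contacts[:, :2].tolist()).convex_hull
    return support.contains(Point(mesh.center_mass[:2]))

print(is_quasi_statically_stable("./samples/objaverse_samples/example.glb"))  # hypothetical file name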

⚙️ Training

  1. Generate synthetic training data.

First, generate or obtain a set of images saved in this structure:

📁 category/
|--- 📁 prompt1/
|    |--- image1.png
|    |--- image2.png
|    |--- ...
|--- 📁 prompt2/
|    |--- image1.png
|    |--- image2.png
|    |--- ...
|--- ...

Then, generate the 3D models conditioned on these images:

python -m data_preprocessing.generate_synthetic_data --num_samples 4 --save_extra --output_dir /path/to/data/root --image_paths "/path/to/the/above/dir/*/*/*.png"

Finally, augment the 3D models with simulation feedback and save their physical soundness scores (see the illustrative sketch at the end of this section):

python -m data_preprocessing.augment_with_simulation_feedback --root_dir /path/to/data/root --num_models_per_image 4
  2. Launch the training job:
# This runs on 4 NVIDIA A100 80GB GPUs.

CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch --multi_gpu --num_processes 4 --mixed_precision bf16 finetune.py --config configs/dpo.yaml 

Note that you need to specify the data directory in the configuration file.
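
For intuition on how the simulation scores feed the DPO objective, here is a toy, hypothetical sketch of turning per-image physical soundness scores into (preferred, rejected) pairs; the function and score format are illustrative, not the repo's actual data loader:

# Hypothetical illustration only: with several models generated per image
# (--num_models_per_image 4 above), each carrying a physical soundness
# score from simulation, DPO-style training consumes (winner, loser) pairs.
# The actual data pipeline in this repo may differ.
from itertools import combinations

def build_preference_pairs(scores):
    """scores: dict mapping a generated model's path to its soundness score.
    Returns (winner, loser) pairs where the winner is strictly sounder."""
    pairs = []
    for a, b in combinations(scores, 2):
        if scores[a] != scores[b]:
            winner, loser = (a, b) if scores[a] > scores[b] else (b, a)
            pairs.append((winner, loser))
        # ties carry no preference signal and are skipped
    return pairs

# Example with 4 models generated for one image:
print(build_preference_pairs({"m0.glb": 1.0, "m1.glb": 0.0, "m2.glb": 1.0, "m3.glb": 0.2}))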

Citation

@article{li2025dso,
    title   = {DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness},
    author  = {Li, Ruining and Zheng, Chuanxia and Rupprecht, Christian and Vedaldi, Andrea},
    journal = {arXiv preprint arXiv:2503.22677},
    year    = {2025}
}

About

[ICCV 2025] DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
