Ranran Huang · Krystian Mikolajczyk
SPFSplatV2 efficiently leverages masked attention to predict target poses while simultaneously predicting 3D Gaussians from unposed sparse images, without requiring ground-truth poses during either training or inference.
## Installation

- Clone SPFSplatV2:

```bash
git clone git@github.com:ranrhuang/SPFSplatV2.git
cd SPFSplatV2
```

- Create the environment; here we show an example using conda:

```bash
conda create -n spfsplatv2 python=3.11
conda activate spfsplatv2
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt
```

- Optionally, compile the CUDA kernels for RoPE (as in CroCo v2):

```bash
cd src/model/encoder/backbone/croco/curope/
python setup.py build_ext --inplace
cd ../../../../../..
```

## Pretrained Models

Our models are hosted on Hugging Face 🤗
| Model name | Training resolutions | Training data | Training settings |
|---|---|---|---|
| re10k_spfsplatv2.ckpt | 256x256 | re10k | RE10K, 2 views, MASt3R-based |
| acid_spfsplatv2.ckpt | 256x256 | acid | ACID, 2 views, MASt3R-based |
| re10k_spfsplatv2l.ckpt | 256x256 | re10k | RE10K, 2 views, VGGT-based |
| acid_spfsplatv2l.ckpt | 256x256 | acid | ACID, 2 views, VGGT-based |
We assume the downloaded weights are located in the `pretrained_weights` directory.
Please refer to `DATASETS.md` for dataset preparation.
## Training

- If using the MASt3R-based architecture, download the MASt3R pretrained model and put it in the `./pretrained_weights` directory.
- Train with:
```bash
# 2 views, MASt3R-based architecture
python -m src.main +experiment=spfsplatv2/re10k wandb.mode=online wandb.name=re10k_spfsplatv2

# 2 views, VGGT-based architecture
python -m src.main +experiment=spfsplatv2-l/re10k wandb.mode=online wandb.name=re10k_spfsplatv2l
```
## Evaluation

```bash
# RealEstate10K, MASt3R-based architecture (set test.align_pose=true to use evaluation-time pose alignment)
python -m src.main +experiment=spfsplatv2/re10k mode=test wandb.name=re10k \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json \
  checkpointing.load=./pretrained_weights/re10k_spfsplatv2.ckpt \
  test.save_image=true test.align_pose=false

# ACID, MASt3R-based architecture (set test.align_pose=true to use evaluation-time pose alignment)
python -m src.main +experiment=spfsplatv2/acid mode=test wandb.name=acid \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_acid.json \
  checkpointing.load=./pretrained_weights/acid_spfsplatv2.ckpt \
  test.save_image=false test.align_pose=false

# RealEstate10K, VGGT-based architecture (set test.align_pose=true to use evaluation-time pose alignment)
python -m src.main +experiment=spfsplatv2-l/re10k mode=test wandb.name=re10k \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_re10k.json \
  checkpointing.load=./pretrained_weights/re10k_spfsplatv2l.ckpt \
  test.save_image=true test.align_pose=false

# ACID, VGGT-based architecture (set test.align_pose=true to use evaluation-time pose alignment)
python -m src.main +experiment=spfsplatv2-l/acid mode=test wandb.name=acid \
  dataset/view_sampler@dataset.re10k.view_sampler=evaluation \
  dataset.re10k.view_sampler.index_path=assets/evaluation_index_acid.json \
  checkpointing.load=./pretrained_weights/acid_spfsplatv2l.ckpt \
  test.save_image=false test.align_pose=false
```
## Camera Conventions

We follow the pixelSplat camera system. Camera intrinsic matrices are normalized: the first row is divided by the image width and the second row by the image height. Camera extrinsic matrices are OpenCV-style camera-to-world matrices (+X right, +Y down, +Z pointing into the screen).
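As a concrete illustration of this convention, here is a minimal sketch (not part of the repository's code) that builds a normalized intrinsic matrix and an OpenCV-style camera-to-world extrinsic, then projects a point; all numeric values are hypothetical examples.

```python
import numpy as np

h, w = 256, 256                               # image size used by the released checkpoints
fx, fy, cx, cy = 300.0, 300.0, 128.0, 128.0   # example pixel-space intrinsics

# Normalized intrinsics: first row divided by width, second row by height.
K = np.array([
    [fx / w, 0.0,    cx / w],
    [0.0,    fy / h, cy / h],
    [0.0,    0.0,    1.0],
])

# OpenCV-style camera-to-world extrinsic: rotation columns are the camera's
# +X (right), +Y (down), +Z (forward) axes expressed in world coordinates,
# and the last column is the camera position.
c2w = np.eye(4)
c2w[:3, 3] = [0.0, 0.0, -2.0]                 # camera 2 units behind the world origin

# Project a world point into normalized image coordinates in [0, 1].
p_world = np.array([0.0, 0.0, 1.0, 1.0])
p_cam = np.linalg.inv(c2w) @ p_world          # world -> camera
uv = K @ p_cam[:3]
uv = uv[:2] / uv[2]                           # a point on the optical axis lands at (0.5, 0.5)
```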
## Acknowledgements

This project is built upon these excellent repositories: NoPoSplat, pixelSplat, DUSt3R, and CroCo. We thank the original authors for their great work.
## Citation

```bibtex
@article{huang2025spfsplatv2,
  title={SPFSplatV2: Efficient Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views},
  author={Huang, Ranran and Mikolajczyk, Krystian},
  journal={arXiv preprint arXiv:2509.17246},
  year={2025}
}
```