wuzirui/pvsm

Official code release for the PVSM paper: "From Rays to Projections: Better Inputs for Feed-Forward View Synthesis"

Installation

conda create -n pvsm python=3.11
conda activate pvsm
# Install torch and torchvision suited to your environment (CUDA version, etc.)
pip install -r requirements.txt

There is a known issue with the current release of gsplat (1.5.3), so please install gsplat from source for now:

# Install gsplat from source
pip install git+https://github.com/nerfstudio-project/gsplat.git
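
To verify the environment before moving on, a minimal check such as the one below can catch a broken torch or gsplat install early (the file name verify_env.py is just an example, not part of this repository):

# verify_env.py -- quick sanity check of the installed environment
from importlib.metadata import version

import torch

print(f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")

try:
    import gsplat  # built from source above
    print(f"gsplat {version('gsplat')} imported successfully")
except ImportError as e:
    print(f"gsplat import failed: {e}")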

Quick Start

Download Checkpoints

Download the DINOv3-ViT-B checkpoint and place it under metric_checkpoints/.

Download our pre-trained model checkpoints:

  1. 12-layer model (small): OneDrive
  2. 24-layer model (full): OneDrive

After downloading, organize your checkpoints directory as follows:

metric_checkpoints/
├── pvsm_finetuned_full.pt          # Our trained full 24-layer model
├── pvsm_finetuned_small.pt         # Our trained smaller 12-layer model
├── dinov3-vitb16-pretrain-lvd1689m # DINOv3 Checkpoint
│   ├── config.json
│   ├── LICENSE.md
│   ├── model.safetensors
│   ├── preprocessor_config.json
│   └── README.md
├── imagenet-vgg-verydeep-19.mat    # (Optional) for training
└── map-anything                    # (Optional) for dataset generation
    ├── config.json
    ├── model.safetensors
    └── README.md
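
As a quick sanity check, a short script along these lines can confirm the layout before launching the demo (the file name check_checkpoints.py is illustrative; paths come from the tree above, optional entries are skipped, and a model you did not download will simply be reported as missing):

# check_checkpoints.py -- verify the checkpoint layout shown in the tree above
from pathlib import Path

root = Path("metric_checkpoints")
required = [
    "pvsm_finetuned_full.pt",
    "pvsm_finetuned_small.pt",
    "dinov3-vitb16-pretrain-lvd1689m/model.safetensors",
]

for rel in required:
    status = "ok" if (root / rel).exists() else "MISSING"
    print(f"{status:8s} {root / rel}")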

Interactive Demo

For a quick interactive demo, follow the download instructions and unzip the example data (22.3 MB) on your local machine.

To launch the interactive web-based demo:

torchrun --nproc_per_node 1 --standalone viser_demo.py --config-name runs/pvsm_finetuned_small

The demo will start a web server. Open your browser and navigate to the displayed URL to interact with the model.

System Requirements:

  • Small model: ~2.5GB VRAM
  • Full model: ~3.0GB VRAM

Note: renderings in the gsplat-based demo are compressed, so the displayed quality may be lower than the model's actual output.
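
If you are unsure whether your GPU meets the VRAM figures above, a quick check with PyTorch (a sketch; memory is reported in GiB) looks like:

# vram_check.py -- print total GPU memory to compare against the requirements above
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB total VRAM")
else:
    print("No CUDA device detected")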

Running Inference

To run inference on a dataset:

python inference.py --config-name runs/pvsm_finetuned_small

Or for the full model:

python inference.py --config-name runs/pvsm_finetuned_full
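
Inference uses the model weights referenced by the chosen config. If you just want to confirm that a downloaded checkpoint loads at all, a generic PyTorch load is enough; the internal structure of the .pt files is an assumption here, so treat this purely as a sketch:

# inspect_checkpoint.py -- load a checkpoint on CPU and list some top-level keys
import torch

ckpt = torch.load("metric_checkpoints/pvsm_finetuned_small.pt", map_location="cpu")

# The file may hold a raw state_dict or a dict wrapping one; print whatever keys exist.
if isinstance(ckpt, dict):
    for key in list(ckpt.keys())[:10]:
        print(key)
else:
    print(type(ckpt))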

Training

To train the model:

torchrun --nproc_per_node <num_gpus> train.py --config-name runs/pvsm_finetuned_small

Configuration:

  • Training configurations are located in configs/runs/
  • Model configurations are in configs/model/
  • Dataset configurations are in configs/dataset/

API Keys: Before training, create configs/api_keys.yaml with your WandB API key:

wandb: YOUR_WANDB_KEY

You can use configs/api_keys_example.yaml as a template.
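
How the training code reads this file is defined by its configs, but if you want to check that the key is picked up (for example by exporting it to the environment variable the wandb client reads), a sketch like the following works; the env-var hand-off is an assumption, not part of this repository:

# load_api_keys.py -- read configs/api_keys.yaml and expose the WandB key via the standard env var
import os

import yaml

with open("configs/api_keys.yaml") as f:
    keys = yaml.safe_load(f)

# WANDB_API_KEY is the environment variable the wandb client reads by default.
os.environ["WANDB_API_KEY"] = keys["wandb"]
print("WandB key loaded:", bool(os.environ.get("WANDB_API_KEY")))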

Citation

If you find this work useful in your research, please consider citing:

@article{wu_pvsm_2026,
  title={From Rays to Projections: Better Inputs for Feed-Forward View Synthesis},
  author={Wu, Zirui and Jiang, Zeren and Oswald, Martin R. and Song, Jie},
  journal={arXiv preprint arXiv:2601.05116},
  year={2026}
}

License

This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. See LICENSE.md for details.

Acknowledgement

This work is built upon the LVSM codebase.
