This repository contains the synthetic VQA data generation code for SpatialReasoner, introduced in the following paper:
SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning.
Wufei Ma, Yu-Cheng Chou, Qihao Liu, Xingrui Wang, Celso de Melo, Jianwen Xie, and Alan Yuille
Johns Hopkins University
[arXiv] [Project Page]
Please check INSTALL.md for installation instructions. See Troubleshooting for known issues.
TODO:
- Release visualization code.
- Visualize step-by-step generation results.
This project is released under the CC-BY-4.0 license. Please see the LICENSE file for more information.
If you find this repository helpful, please consider citing:
@article{ma2025spatialreasoner,
title={SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning},
author={Ma, Wufei and Chou, Yu-Cheng and Liu, Qihao and Wang, Xingrui and de Melo, Celso and Xie, Jianwen and Yuille, Alan},
journal={arXiv preprint arXiv:2504.20024},
year={2025}
}
@inproceedings{ma2025spatialllm,
title={SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models},
author={Ma, Wufei and Ye, Luoxin and de Melo, Celso M and Yuille, Alan and Chen, Jieneng},
booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
pages={17249--17260},
year={2025}
}