A deep learning model that predicts camera matrices from 2D keypoints using a transformer-based architecture with graph convolutions.
- Graph-based keypoint processing
- Transformer architecture for sequence modeling
- Combined Frobenius norm and reconstruction loss
- TensorBoard visualization support
- MongoDB integration for data management
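
The combined loss named in the feature list can be illustrated with a minimal sketch. It is not the repository's implementation. It assumes the model outputs a 3x4 projection matrix, that the reconstruction term is a reprojection error against known 3D points, and that the two terms are summed with weights. The function name `combined_loss` and the parameters `alpha` and `beta` are hypothetical, and NumPy stands in for whatever framework the project actually uses.

```python
import numpy as np

def combined_loss(P_pred, P_true, points_3d, keypoints_2d, alpha=1.0, beta=1.0):
    """Hypothetical weighted sum of a Frobenius term and a reprojection term.

    P_pred, P_true : (3, 4) camera projection matrices
    points_3d      : (N, 3) 3D points (assumed available; not stated in the README)
    keypoints_2d   : (N, 2) observed 2D keypoints
    """
    # Frobenius norm of the difference between predicted and target matrices.
    frob = np.linalg.norm(P_pred - P_true, ord="fro")

    # Project the 3D points with the predicted matrix (homogeneous coordinates).
    ones = np.ones((points_3d.shape[0], 1))
    points_h = np.hstack([points_3d, ones])      # (N, 4)
    projected = points_h @ P_pred.T              # (N, 3)

    # Perspective divide; assumes every point has non-zero projected depth.
    uv = projected[:, :2] / projected[:, 2:3]    # (N, 2)

    # Mean squared reprojection error against the observed keypoints.
    recon = np.mean(np.sum((uv - keypoints_2d) ** 2, axis=1))

    return alpha * frob + beta * recon

# Toy data: an identity camera [I | 0] and three points in front of it.
P_true = np.hstack([np.eye(3), np.zeros((3, 1))])
points = np.array([[0.0, 0.0, 2.0], [1.0, 1.0, 4.0], [-1.0, 0.5, 5.0]])
keypoints = points[:, :2] / points[:, 2:3]

loss_exact = combined_loss(P_true, P_true, points, keypoints)
loss_perturbed = combined_loss(P_true + 0.1, P_true, points, keypoints)
```

A perfect prediction gives a loss of zero, and any perturbation of the matrix increases both terms. The weights let you trade off matching the matrix entries directly against matching the reprojected keypoints.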
- model/: Contains the model architecture and training logic.
- data/: Handles data loading and preprocessing.
- utils/: Utility functions for data visualization and logging.
- runs/: Stores model checkpoints and TensorBoard logs.
- weights/: Saved model weights.
- Readme.md: This file.
- Clone the repository:

      git clone https://github.com/your-username/ktpformer.git
      cd ktpformer

- Install dependencies:

      pip install -r requirements.txt

- Set up your environment variables:

      cp .env.example .env

- Run the training script:

      python train.py