This repository is a from-scratch reimplementation of ResNet-style convolutional networks, inspired by the original "Deep Residual Learning for Image Recognition" (He et al., 2015), with modern training practices layered on top.
The intent is not to chase benchmarks or novelty, but to rebuild the full training stack deliberately—architecture, optimization, scheduling, and data handling—to develop first-principles intuition for why these systems work.
Think of this as re-deriving ResNet in 2025, not copy-pasting it.
This is:
- A clean, readable ResNet implementation without `torchvision.models`
- A practical study of modern CNN training dynamics
- A reproducible, inspectable training pipeline
- A foundation for experimentation and extensions
This is not:
- A benchmark leaderboard submission
- A hyperparameter sweep or AutoML setup
- A research paper claiming architectural novelty
Architecture:
- Residual blocks implemented directly from the ResNet paper
- Configurable depth via a single parameter (`n`)
- Explicit skip connections (no hidden abstractions)
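For orientation, the block pattern from the paper looks roughly like the sketch below. This is an illustration of the idea, not a copy of `model.py` or `layers.py`; the exact layer choices in those files may differ.

```python
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    """Two 3x3 convs plus an explicit skip connection (He et al., 2015)."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # Project the input with a 1x1 conv only when the shape changes; otherwise identity.
        self.shortcut = nn.Identity()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + self.shortcut(x))  # the skip connection, kept explicit
```

In the paper's CIFAR-10 variant, `n` such blocks are stacked at each of three feature-map sizes (32, 16, 8), giving a network of depth 6n+2; that is the sense in which a single parameter controls depth.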
Dataset:
- CIFAR-10 (10-class image classification dataset)
- 50,000 training images, 10,000 test images
- 32×32 RGB images
- Used deliberately for fast iteration while still exposing real training dynamics
Training setup:
- Optimizer: AdamW (decoupled weight decay)
- Learning Rate Schedule: OneCycleLR
- Loss: Cross-Entropy
- Precision: FP32 with high matmul precision
- Compilation: `torch.compile` (when supported)
- Explicit train/validation split
- Dedicated data augmentation module
- Dataset logic separated from training logic
Each choice is made to be visible and auditable, not hidden behind convenience APIs.
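To make that list concrete, here is a hedged sketch of how the pieces fit together. The hyperparameter values, the `ResNet(n=...)` constructor, and `train_loader` are placeholders for illustration, not the actual settings in `train.py`.

```python
import torch

torch.set_float32_matmul_precision("high")        # FP32 with high-precision matmuls

model = ResNet(n=3).cuda()                        # hypothetical constructor from model.py
if hasattr(torch, "compile"):                     # torch.compile, when supported
    model = torch.compile(model)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=5e-2)
criterion = torch.nn.CrossEntropyLoss()

epochs, steps_per_epoch = 50, 352                 # placeholder values
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-3, epochs=epochs, steps_per_epoch=steps_per_epoch
)

for epoch in range(epochs):
    for images, targets in train_loader:          # loader assumed to come from utils.py
        optimizer.zero_grad(set_to_none=True)
        loss = criterion(model(images.cuda()), targets.cuda())
        loss.backward()
        optimizer.step()
        scheduler.step()                          # OneCycleLR is stepped per batch
```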
Project structure:

```
.
├── train.py            # Training loop + optimization logic
├── test.py
├── layers.py
├── model.py            # ResNet implementation (from scratch)
├── utils.py            # CIFAR-10 dataset + data augmentation
├── artifacts/          # Generated after training
│   ├── loss_curve.png
│   ├── val_accuracy.png
│   └── lr_schedule.png
├── best_resnet.pth     # Best model checkpoint (saved during training)
└── requirements.txt
```
Running training automatically generates a small set of high-signal artifacts:
- Training Loss vs Epoch — convergence behavior
- Validation Accuracy vs Epoch — generalization trend
- OneCycleLR Schedule — learning rate dynamics
These plots are intentionally minimal and designed for:
- fast iteration
- sanity checking
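Generating these plots takes very little code. A minimal sketch (function and variable names here are illustrative, not necessarily those used in `train.py`):

```python
import matplotlib.pyplot as plt
from pathlib import Path

def save_curve(values, ylabel, path):
    """Plot one metric per epoch and save it as a PNG under artifacts/."""
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    plt.figure()
    plt.plot(range(1, len(values) + 1), values)
    plt.xlabel("Epoch")
    plt.ylabel(ylabel)
    plt.savefig(path, bbox_inches="tight")
    plt.close()

# e.g. save_curve(train_losses, "Training loss", "artifacts/loss_curve.png")
# e.g. save_curve(val_accuracies, "Validation accuracy", "artifacts/val_accuracy.png")
```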
Results:
- Final Test Accuracy: 92.95% on CIFAR-10
- Training Time: ~3 hours
- Hardware: NVIDIA RTX 4090
The model was trained after loading the entire CIFAR-10 dataset into memory to minimize data-loading overhead and maximize GPU utilization.
Data augmentation was carefully tuned and validated during training to improve generalization without introducing instability.
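A rough sketch of that data-handling idea, assuming `torchvision.datasets` is used for the download and that the augmentation policy resembles the usual CIFAR-10 recipe (the actual implementation lives in `utils.py` and may differ):

```python
import torch
from torchvision import datasets, transforms

# Download once, then hold the whole training set as a single tensor in RAM,
# so every epoch indexes into memory instead of reading from disk.
cifar = datasets.CIFAR10(root="./data", train=True, download=True)
images = torch.from_numpy(cifar.data).permute(0, 3, 1, 2).float() / 255.0  # (50000, 3, 32, 32)
labels = torch.tensor(cifar.targets)

# Explicit train/validation split (the 45k/5k ratio here is a placeholder).
perm = torch.randperm(len(images))
train_idx, val_idx = perm[:45_000], perm[45_000:]

# A common CIFAR-10 augmentation policy; the tuned policy belongs to utils.py.
augment = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
])
```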
To run:

```
pip install -r requirements.txt
python train.py
```

Artifacts will be saved under `artifacts/` after training completes.
Many modern deep learning workflows hide critical decisions behind high-level abstractions. This project intentionally avoids that.
The focus is on:
- understanding why residual connections stabilize deep networks
- observing the impact of learning rate schedules vs architecture depth
- building intuition for optimization-driven performance gains
CIFAR-10 is used deliberately: it is small enough for fast iteration, yet rich enough to expose real training dynamics.
Possible extensions:
- Deeper or wider residual stacks
- Ablations: constant LR vs OneCycleLR
- Error analysis via sample predictions
- Comparison against `torchvision` baselines
This implementation is based on the ideas introduced in:
He, K., Zhang, X., Ren, S., and Sun, J. (2015). Deep Residual Learning for Image Recognition. arXiv:1512.03385.
All mistakes and design tradeoffs in this codebase are my own.
MIT