Master's Degree in Data Science - University of Milano-Bicocca
A comprehensive collection of assignments, lecture notes, and projects from the Supervised Learning course
This repository contains all coursework, implementations, and research from the Supervised Learning course, including:
- Weekly Assignments: Hands-on exercises covering fundamental ML concepts
- Lecture Notes: Detailed notes and code examples from class sessions
- Final Project: TinyNet - A custom CNN for food classification with <1M parameters
A complete deep learning project tackling a challenging 251-class food image classification task with strict architectural constraints.
Key Achievements:
- Custom CNN architecture with exactly 999,675 parameters (< 1M constraint)
- 45.33% validation accuracy on 251 food categories
- Self-supervised pre-training for improved convergence
- Automated hyperparameter optimization with Optuna
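The parameter budget can be checked by hand: a 2-D convolution with a k×k kernel has k·k·C_in·C_out weights plus C_out biases. The sketch below counts parameters for a small stack of layers and compares the total against the 1M limit. The channel widths and the helper names `conv2d_params` and `linear_params` are hypothetical and chosen only for illustration; they are not TinyNet's actual configuration.

```python
# Illustrative only: the layer widths below are hypothetical and do not
# reproduce TinyNet's real architecture.
PARAM_BUDGET = 1_000_000


def conv2d_params(c_in: int, c_out: int, kernel: int, bias: bool = True) -> int:
    """Weights (kernel * kernel * c_in * c_out) plus optional per-channel biases."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def linear_params(n_in: int, n_out: int, bias: bool = True) -> int:
    """Weights (n_in * n_out) plus optional per-output biases."""
    return n_in * n_out + (n_out if bias else 0)


# Each tuple is (c_in, c_out, kernel) for one hypothetical conv layer.
hypothetical_convs = [(3, 32, 3), (32, 64, 3), (64, 128, 3), (128, 256, 3)]

total = sum(conv2d_params(*layer) for layer in hypothetical_convs)
total += linear_params(256, 251)  # classifier head over the 251 food classes

print(f"total parameters: {total:,}")
print("within budget:", total < PARAM_BUDGET)
```

By the same arithmetic, the reported 999,675 parameters leave a margin of 325 under the limit.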
Techniques Implemented:
- Convolutional Neural Networks (CNNs) with GELU activations
- Self-Supervised Learning (SSL) via image reconstruction
- Hyperparameter tuning with pruning strategies
- Advanced data augmentation preserving food characteristics
- Transfer learning from pre-trained encoder
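For readers unfamiliar with GELU, the following standard-library sketch evaluates the exact form, GELU(x) = x·Φ(x) with Φ the standard normal CDF, next to the tanh approximation given in the GELU paper, with ReLU for comparison. It is a scalar illustration only; the project itself presumably relies on the framework's built-in activation.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU from the original paper."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def relu(x: float) -> float:
    return max(0.0, x)


for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  relu={relu(x):.4f}  "
          f"gelu={gelu(x):.4f}  gelu_tanh={gelu_tanh(x):.4f}")
```

Unlike ReLU, GELU is smooth and lets small negative inputs pass with a small negative output instead of clamping them to zero.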
Project Location: Final_Project/
Documentation:
- Comprehensive README - Setup, usage, and results
- Architecture Details - In-depth technical breakdown
- Project Report (PDF) - Full academic paper
The Assignments/ directory contains weekly exercises covering:
- Linear Regression - Least squares, regularization (Ridge, Lasso)
- Logistic Regression - Binary and multi-class classification
- Support Vector Machines - Kernel methods, margin optimization
- Decision Trees - CART, pruning, ensemble methods
- Neural Networks - Backpropagation, activation functions
- Deep Learning - CNNs, batch normalization, dropout
- Model Selection - Cross-validation, hyperparameter tuning
- Ensemble Methods - Bagging, boosting, random forests
- Dimensionality Reduction - PCA, feature selection
- Evaluation Metrics - Confusion matrix, ROC curves, F1-score
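As one concrete instance of the model-selection topics listed above, here is a minimal sketch of k-fold cross-validation index splitting using only the standard library. The function name `k_fold_indices` is hypothetical and this is not code taken from the assignments; in practice scikit-learn's `KFold` covers the same ground.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for i, (train_idx, val_idx) in enumerate(folds):
    print(f"fold {i}: train={len(train_idx)} val={len(val_idx)}")
```

Averaging a validation metric over the folds gives a less noisy estimate than a single train/validation split, at the cost of training k models.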
Each assignment includes:
- Problem statements
- Implementation in Python/PyTorch
- Analysis and results
- Visualizations
The Lessons_notes/ directory contains organized notes from each lecture:
Lessons_notes/
├── L01/ - Introduction to Supervised Learning
├── L02/ - Linear Models
├── L03/ - Regularization Techniques
├── L04/ - Classification Fundamentals
├── L05/ - Support Vector Machines
├── L06/ - Kernel Methods
├── L07/ - Decision Trees
├── L08/ - Ensemble Methods
├── L09/ - Neural Networks Basics
├── L10/ - Deep Learning
├── L11/ - Convolutional Networks
├── L12/ - Advanced CNN Architectures
└── L13/ - Self-Supervised Learning
Notes include:
- Theoretical concepts with mathematical derivations
- Code implementations and examples
- Visualizations and diagrams
- References to key papers
- PyTorch: Deep learning framework for neural network implementation
- scikit-learn: Classical ML algorithms and utilities
- NumPy: Numerical computing
- Pandas: Data manipulation and analysis
- Matplotlib/Seaborn: Data visualization
- Optuna: Hyperparameter optimization framework
- TorchMetrics: Evaluation metrics for PyTorch
- OpenCV: Image processing for computer vision
- TensorBoard: Training visualization and monitoring
MSc_Supervised_Learning/
│
├── Final_Project/                 # Main project - TinyNet
│   ├── README.md                  # Comprehensive documentation
│   ├── ARCHITECTURE.md            # Technical architecture details
│   ├── main.py                    # Main training script
│   ├── htuning.py                 # Hyperparameter tuning
│   ├── pickles/                   # Training metrics and results
│   └── Supervised_Learning__Final_project_.pdf
│
├── Assignments/                   # Weekly coursework
│   ├── Assignment_01/
│   ├── Assignment_02/
│   └── ...
│
├── Lessons_notes/                 # Lecture materials
│   ├── L01/ through L13/
│   └── Additional resources
│
├── .gitignore
└── README.md                      # This file
- Python 3.8+
- CUDA-capable GPU (recommended for the Final Project)

- Clone the repository
git clone <repository-url>
cd MSc_Supervised_Learning

- Set up a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate

- Install dependencies
For the Final Project:
cd Final_Project
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install pandas numpy matplotlib seaborn pillow scikit-learn
pip install tqdm optuna torchsummary torchmetrics tensorboard opencv-python

For assignments:
pip install numpy pandas scikit-learn matplotlib seaborn jupyter

cd Final_Project
# Train TinyNet from scratch
python main.py
# Run hyperparameter optimization
python htuning.py
# View results in TensorBoard
tensorboard --logdir=runs/

Detailed instructions are available in Final_Project/README.md.
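Pruning during hyperparameter search means stopping unpromising trials before they finish training. As a rough illustration of the idea, the sketch below applies a median stopping rule to made-up validation-accuracy curves. It is not the code in htuning.py and does not use Optuna's API. Optuna ships pruners such as `MedianPruner`, but which pruner the project uses is not stated here, so the rule, the `should_prune` helper, and the curves are all assumptions.

```python
from statistics import median
from typing import Dict, List, Optional


def should_prune(
    step: int,
    value: float,
    history: Dict[int, List[float]],
    warmup_steps: int = 1,
) -> bool:
    """Prune when the current value is below the median of the values
    that earlier trials reported at the same step."""
    previous = history.get(step, [])
    if step < warmup_steps or not previous:
        return False
    return value < median(previous)


# Hypothetical validation-accuracy curves, one list per trial.
curves = [
    [0.10, 0.20, 0.30, 0.35],
    [0.12, 0.25, 0.33, 0.40],
    [0.05, 0.08, 0.10, 0.11],  # weak trial that the rule should stop early
]

history: Dict[int, List[float]] = {}
stopped_at: List[Optional[int]] = []

for curve in curves:
    pruned_step = None
    for step, acc in enumerate(curve):
        if should_prune(step, acc, history):
            pruned_step = step
            break
        history.setdefault(step, []).append(acc)
    stopped_at.append(pruned_step)

for trial, step in enumerate(stopped_at):
    status = "completed" if step is None else f"pruned at step {step}"
    print(f"trial {trial}: {status}")
```

The warm-up step keeps the rule from discarding a trial on its very first, noisiest measurement.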
| Metric | Value | Context |
|---|---|---|
| Validation Accuracy | 45.33% | 251 food categories |
| F1-Score (micro) | 0.4533 | Equals accuracy in single-label multi-class classification |
| Model Parameters | 999,675 | < 1M constraint met |
| Training Time | ~3 hours | RTX 3080, 150 epochs |
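The micro-averaged F1-score in the table matches the validation accuracy (0.4533 and 45.33%). In single-label multi-class classification this is expected: every misclassified sample counts as one false positive for the predicted class and one false negative for the true class, so micro precision, micro recall, and accuracy coincide. The sketch below checks this on a made-up example. The labels and helper names are hypothetical; the project's dependency list suggests TorchMetrics is used for the real evaluation.

```python
from typing import Sequence


def micro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            tp += int(p == label and t == label)
            fp += int(p == label and t != label)
            fn += int(p != label and t == label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels, chosen only to exercise the functions.
y_true = ["pizza", "sushi", "pasta", "pizza", "salad", "sushi"]
y_pred = ["pizza", "pasta", "pasta", "salad", "salad", "sushi"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"micro F1 = {micro_f1(y_true, y_pred):.4f}")
```

Macro-averaged F1, which averages per-class scores, would be the more informative companion metric if class balance is a concern.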
| Configuration | Accuracy | Notes |
|---|---|---|
| TinyNet + SSL | 45.33% | Best overall |
| TinyNet Baseline | 45.31% | Strong baseline |
| Tuned + SSL | 43.93% | Faster convergence |
| Tuned Only | 43.83% | Different optimum |
Through this course and project, I developed expertise in:
- Classical ML: Strong foundation in traditional supervised learning algorithms
- Deep Learning: Hands-on experience with CNN architectures and training
- Model Optimization: Hyperparameter tuning, regularization, and convergence strategies
- Research Skills: Literature review, experimentation, and technical writing
- Software Engineering: Clean code, version control, and reproducible research
- Problem Solving: Working within constraints, debugging, and iterative improvement
- Lecture slides and notes (included in Lessons_notes/)
- Recommended textbooks:
  - Pattern Recognition and Machine Learning - Bishop
  - Deep Learning - Goodfellow, Bengio, Courville
  - Hands-On Machine Learning - Géron
- Krizhevsky et al. (2012) - AlexNet
- Simonyan & Zisserman (2015) - VGG
- Ronneberger et al. (2015) - U-Net
- Hendrycks & Gimpel (2023) - GELU
- Akiba et al. (2019) - Optuna
See Final_Project/README.md for complete bibliography.
Students: Mirko Morello (920601), Andrea Borghesi (916202)
Institution: University of Milano-Bicocca
Program: MSc in Data Science
Course: Supervised Learning
Academic Year: 2024-2025
This repository contains academic coursework and is intended for educational purposes. Please respect academic integrity policies if referencing this work.
- Instructors: For comprehensive course materials and guidance
- Teaching Assistants: For support during assignments
- PyTorch Community: For excellent documentation and examples
- Optuna Team: For powerful hyperparameter optimization tools
If you found this repository helpful, please consider giving it a star!
For questions about the Final Project, see the project README (Final_Project/README.md).