Repositories list

• disco
  DISCO is a code-free and installation-free browser platform that allows any non-technical user to collaboratively train machine learning models without sharing any private data.
  TypeScript · Updated Jan 3, 2026
• Benchmarking Optimizers for LLM Pretraining
  Python · Updated Dec 30, 2025
• Python · Updated Dec 17, 2025
• ML_course
  EPFL Machine Learning Course, Fall 2025
  Jupyter Notebook · Updated Dec 15, 2025
• Official implementation of "Gradient-Normalized Smoothness for Optimization with Approximate Hessians"
  Jupyter Notebook · Updated Nov 9, 2025
• nanoGPT-like codebase for LLM training
  Python · Updated Nov 7, 2025
• CoMiGS
  Python · Updated Sep 24, 2025
• TiMoE
  A time-aware language modeling framework
  Python · Updated Aug 31, 2025
• EPFL Course - Optimization for Machine Learning - CS-439
  Jupyter Notebook · Updated Jul 8, 2025
• Code for the paper "Enhancing Multilingual LLM Pretraining with Model-Based Data Selection"
  Python · Updated May 16, 2025
• Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
  Python · Updated Oct 30, 2024
• powersgd
  Practical low-rank gradient compression for distributed optimization (https://arxiv.org/abs/1905.13727); a toy sketch of the compression step follows this list.
  Python · Updated Oct 29, 2024
• CoBo
  Python · Updated Oct 22, 2024
• Explores on-device, self-supervised collaborative fine-tuning of large language models with limited local data, using Low-Rank Adaptation (LoRA). Introduces three trust-weighted gradient aggregation schemes: weight-similarity-based, prediction-similarity-based, and validation-performance-based; a toy aggregation sketch follows this list.
  Python · Updated Sep 2, 2024
• SGD with compressed gradients and error feedback (https://arxiv.org/abs/1901.09847); a minimal sketch follows this list.
  Jupyter Notebook · Updated Jul 25, 2024
• REQ
  Python · Updated Jun 10, 2024
• CoTFormer
  Python · Updated May 22, 2024
• Python · Updated May 22, 2024
• Python · Updated Apr 18, 2024
• Python · Updated Apr 16, 2024
• DoGE
  Codebase for the ICML submission "DOGE: Domain Reweighting with Generalization Estimation"; a toy reweighting sketch follows this list.
  Updated Feb 4, 2024
• Landmark Attention: Random-Access Infinite Context Length for Transformers
  Python · Updated Dec 20, 2023
• pam
  Python · Updated Dec 9, 2023
• Python · Updated Aug 18, 2023
• optML-pku
  Summer school materials
  Updated Aug 4, 2023
• Code for "Multi-Head Attention: Collaborate Instead of Concatenate"
  Python · Updated Jun 12, 2023
• Jupyter Notebook · Updated Jun 2, 2023
• Difficulty-guided text summarization
  Python · Updated May 22, 2023
• relaysgd
  Code for the paper "RelaySum for Decentralized Deep Learning on Heterogeneous Data"
  Jupyter Notebook · Updated Apr 21, 2023
• Tools for experimenting with and using run:ai; the aim is for these to be small, self-contained utilities used by multiple people.
  Python · Updated Mar 16, 2023
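
For powersgd, here is a minimal single-worker sketch of rank-r power-iteration gradient compression with error feedback, assuming the warm-started scheme described in arXiv:1905.13727. All names are illustrative rather than the repo's API, and the distributed all-reduce steps are only indicated in comments.

```python
# Toy sketch of rank-r power-iteration gradient compression (PowerSGD-style).
import numpy as np

def power_compress(grad: np.ndarray, q: np.ndarray):
    """Approximate grad (m x n) by p @ q_new.T using one power-iteration step."""
    p = grad @ q                 # (m, r); in the distributed setting, all-reduced
    p, _ = np.linalg.qr(p)       # orthonormalise columns
    q_new = grad.T @ p           # (n, r); also all-reduced across workers
    return p, q_new

rng = np.random.default_rng(0)
grad = rng.standard_normal((256, 128))   # a weight-matrix gradient
q = rng.standard_normal((128, 4))        # rank-4 factor, warm-started across steps
p, q = power_compress(grad, q)
approx = p @ q.T                         # decompressed low-rank update
residual = grad - approx                 # error feedback: fold into the next gradient
```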
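For the collaborative LoRA fine-tuning repo, this sketches only the weight-similarity trust scheme among the three it describes: each peer's flattened LoRA update is weighted by its cosine similarity to our own. Function and variable names are hypothetical, not the repo's API.

```python
# Hypothetical weight-similarity trust aggregation for flattened LoRA updates.
import numpy as np

def trust_weighted_aggregate(own: np.ndarray, peers: list[np.ndarray]) -> np.ndarray:
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    trust = np.array([max(cos(own, p), 0.0) for p in peers])  # clip negative trust
    trust /= trust.sum() + 1e-12                              # convex combination
    return sum(t * p for t, p in zip(trust, peers))

rng = np.random.default_rng(0)
own = rng.standard_normal(64)                   # our flattened LoRA delta
peers = [own + 0.1 * rng.standard_normal(64),   # a like-minded peer
         rng.standard_normal(64)]               # an unrelated peer
agg = trust_weighted_aggregate(own, peers)      # the similar peer dominates
```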
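For the error-feedback repo, a minimal sketch of compressed SGD with error feedback in the spirit of arXiv:1901.09847, using a scaled-sign compressor on a toy quadratic; all names are illustrative.

```python
# Minimal sketch of SGD with a scaled-sign compressor and error feedback.
import numpy as np

def ef_sgd_step(w, grad, memory, lr=0.1):
    corrected = lr * grad + memory          # re-inject last step's compression error
    compressed = np.mean(np.abs(corrected)) * np.sign(corrected)  # scaled sign
    memory = corrected - compressed         # remember what the compressor dropped
    return w - compressed, memory

rng = np.random.default_rng(0)
w = rng.standard_normal(10)
memory = np.zeros_like(w)
for _ in range(200):
    grad = 2.0 * w                          # gradient of the toy objective ||w||^2
    w, memory = ef_sgd_step(w, grad, memory)
print(np.linalg.norm(w))                    # shrinks toward zero over the run
```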
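For DoGE, a loose, hypothetical sketch of domain reweighting by gradient alignment: domains whose gradients align with a target "generalization" gradient get upweighted multiplicatively. The paper's actual generalization estimator differs; every name here is illustrative.

```python
# Hypothetical multiplicative-weights domain reweighting by gradient alignment.
import numpy as np

def reweight_domains(weights, domain_grads, target_grad, step=0.5):
    align = np.array([g @ target_grad for g in domain_grads])  # alignment scores
    logits = np.log(weights) + step * align
    new = np.exp(logits - logits.max())                        # stable softmax
    return new / new.sum()

rng = np.random.default_rng(0)
target = rng.standard_normal(16)                 # proxy generalization gradient
grads = [target + 0.1 * rng.standard_normal(16), # aligned domain
         rng.standard_normal(16),                # unrelated domain
         -target]                                # harmful domain
w = np.full(3, 1.0 / 3.0)
for _ in range(5):
    w = reweight_domains(w, grads, target)       # mass shifts to the aligned domain
```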