A collection of my machine learning mini-projects exploring deep learning and data science methods, including:

| Project | Folder | Link |
|---|---|---|
| 1. Autoencoders for dimensionality reduction and data imputation | autoencoder_scRNA-seq | 🔗 Link |
| 2. Convolutional Neural Networks (CNNs) and transfer learning for image classification tasks based on chest X-rays | CNN_and_TransferLearning_Xray | 🔗 Link |
| 3. Survival analysis with clinical and gene expression data | survival_analysis_multiple_myeloma | 🔗 Link |
| 4. Minimal federated learning workflow that computes a weighted mean of per-site gene variances to identify the most variable genes² | federated_learning_minimal | 🔗 Link |
| 5. Variational autoencoder (VAE) approach to mitigate batch effects in scRNA-seq using federated learning simulations² | federated_learning_scRNA-seq | 🔗 Link |
| 6. Minimal Retrieval-Augmented Generation (RAG) with sentence-transformers and Google's FLAN-T5, also applied to bioinformatics¹ | RAG_minimal | 🔗 Link |
| 7. Bayesian State Space Model¹ | SSM_minimal | 🔗 Link |
| 8. Variational autoencoder (VAE), a minimal BERT language model (transformer), semi-supervised NMF, and regression-based methods (lasso, ridge regression, elastic net) for cell type deconvolution | VAE_NMF_Transformer_regression_cfDNA | 🔗 Link |
| 9. LLM-powered SPARQL Bioinformatics Assistant - uses a language model to turn biology questions into SPARQL queries, run them on UniProt/OMA/Bgee, and explain the results¹ | llm-biodata | 🔗 Link |
| 10. Graph neural networks (GNNs) for spatial transcriptomics | GNN_spatialomics | 🔗 Link |
| 12. Introduction to Bayesian A/B testing with a beta-binomial model (PyMC)³ | Bayesian_inference_ABtesting_PyMC | 🔗 Link |

¹Implemented as part of the workshops at the PyData 2025 conference in Berlin

²Implemented as a result of the Swiss Institute of Bioinformatics (SIB) workshop "Federated Learning in Bioinformatics"

³Implemented as part of the workshops at the PyData Global 2025 conference
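
The idea behind project 4 (a weighted mean of per-site gene variances) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code in `federated_learning_minimal`: the function name `federated_top_variable_genes`, the toy matrices, and the choice of sample counts as weights are all assumptions. Each simulated site shares only its per-gene variances and its sample count, never the raw expression matrix. Note that a weighted mean of per-site variances ignores differences between site means, so it is not identical to the variance of the pooled data.

```python
import numpy as np


def federated_top_variable_genes(site_matrices, n_top=2):
    """Rank genes by a sample-count-weighted mean of per-site variances.

    site_matrices: list of arrays shaped (samples, genes), one per site.
    Hypothetical helper for illustration only.
    """
    # Local step: each site computes per-gene sample variance on its own data.
    local_stats = [(X.var(axis=0, ddof=1), X.shape[0]) for X in site_matrices]

    # Aggregation step: only variances and sample counts leave the sites.
    variances = np.stack([v for v, _ in local_stats])
    weights = np.array([n for _, n in local_stats], dtype=float)
    global_var = np.average(variances, axis=0, weights=weights)

    # Indices of the n_top genes with the largest aggregated variance.
    top = np.argsort(global_var)[::-1][:n_top]
    return global_var, top


# Toy data (assumed): two sites, three genes, rows are samples.
site_a = np.array([[1.0, 0.0, 0.0],
                   [2.0, 0.0, 10.0],
                   [3.0, 0.0, 20.0]])
site_b = np.array([[1.0, 5.0, 0.0],
                   [3.0, 5.0, 2.0]])

global_var, top = federated_top_variable_genes([site_a, site_b], n_top=2)
print(global_var, top)
```

Weighting by sample count lets larger sites contribute proportionally more; other weighting schemes are possible and may suit unbalanced cohorts better.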