Overview

Personal re-implementations of well-known Machine Learning architectures, layers, algorithms and more. The re-implementations may be simplified and inefficient: the goal is to learn and practice the core concepts without digging into too many details.

Current re-implementations include the following:

Reimplementations

DDPM - Denoising Diffusion Probabilistic Models

Implementation of the "Denoising Diffusion Probabilistic Models" paper. I use MNIST and FashionMNIST dataset as toy examples. The model used is a custom U-Net like architecture with the use of positional embeddings. Pre-trained models for both datasets (20 epochs only) are provided in the when using Git Large File System. Check out the Blog for a step-by-step explanation.

GNNS - Graph Neural Networks

Implementation of convolutional and attentional Graph Neural Networks (GNNs), taking inspiration from Petar Veličković's survey "Everything is Connected: Graph Neural Networks". The implemented GNNs differ in the way messages are passed.

I use these GNNs for image classification (graph property prediction), converting the MNIST digits into connected graphs: each pixel is a node, connected to its spatially neighbouring pixels.

⚠️ The implementation is inefficient for sparsely connected graphs, as all possible connections are considered before being masked by the adjacency matrix (quadratic complexity in the number of nodes). The goal is simply to get familiar with GNNs; the dense message-passing style is sketched below.
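A minimal sketch of that dense style (an assumption about the approach, not the repository's exact code): messages for all node pairs are computed, then masked by the adjacency matrix.

```python
# Dense graph convolution: O(N^2) aggregation over all node pairs.
import torch
import torch.nn as nn

class DenseGraphConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h, adj):
        # h: (B, N, in_dim) node features; adj: (B, N, N) adjacency (with self-loops)
        msgs = self.linear(h)                         # transform every node once
        deg = adj.sum(-1, keepdim=True).clamp(min=1)  # node degrees for mean aggregation
        return torch.relu(adj @ msgs / deg)           # masked aggregation over all pairs

# Usage: a 28x28 MNIST image becomes a graph with N = 784 nodes.
h = torch.randn(2, 784, 16)
adj = (torch.rand(2, 784, 784) > 0.99).float()        # random sparse adjacency, for demo
out = DenseGraphConv(16, 32)(h, adj)                  # (2, 784, 32)
```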

From this from-scratch implementation it is noticeable that the convolutional GNN converges faster than the more expressive attentional GNN on this toy dataset, which favours the inductive bias provided by convolution.

GPT - Generative Pre-trained Transformer

Decoder-only implementation of a GPT-style model, based on the transformer architecture from the "Attention Is All You Need" paper. I train the transformer on 1984, the novel by George Orwell, with a next-character prediction objective (sketched after the samples below). Samples generated by the model are stored in a file.

Samples obtained from a small transformer (depth 6, width 384, 6 heads) can be found under /gpt/generated.txt. Here are a few:

################ SAMPLE 6 ################
Winston's heart brostless, then she got up with
a trays that dark was governed upon. They were little because of what they
could give him a day afraid of the Ninth Three-Year Plenty went out. The
doors had looked into his arms. On the evenings he murmuries

################ SAMPLE 16 ################
g.

'But this iden't boy,' he said, lave he said. 'The Party, didn't mean
it got into the barmty of Newspeak.'--and then he safer anything victim round
anything, as I'm reading to be anything but. They can't be take they can't
shoe, an year ago:

'Ah, five

################ SAMPLE 18 ################
Gothern's earlier stations and elusions
against planned the steps. The next purpose were interested. The same of
the tongue of the Revolution is were reality. The programmans that he had
stopped involving the Spies intercounted as phrase.
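As referenced above, here is a minimal sketch of the next-character prediction objective (hypothetical names, not the repository's code): a character-level vocabulary is built from the text, and the model is trained to predict each next character with cross-entropy.

```python
# Next-character prediction: inputs and targets are offset by one character.
import torch
import torch.nn.functional as F

text = "It was a bright cold day in April, and the clocks were striking thirteen."
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}            # char -> integer id
data = torch.tensor([stoi[c] for c in text])

block_size = 32                                       # context length (assumed)
ix = torch.randint(0, len(data) - block_size - 1, (4,))
x = torch.stack([data[i : i + block_size] for i in ix])          # inputs
y = torch.stack([data[i + 1 : i + block_size + 1] for i in ix])  # targets, shifted by one

# With any model mapping (B, T) ids -> (B, T, vocab) logits, the loss is:
logits = torch.randn(4, block_size, len(chars))       # stand-in for model output
loss = F.cross_entropy(logits.view(-1, len(chars)), y.view(-1))
```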

NF - Normalizing Flows

Implementation of the "Density estimation using Real NVP" paper. I re-implement and use 30 Affine Coupling layers to create a normalizing flow that can generate MNIST digits. The generated digits come with associated log probabilities, which tell which images are the most likely according to the model.

PPO - Proximal Policy Optimization

Implementation of the famous "Proximal Policy Optimization Algorithms" paper. I implement the basic PPO algorithm from scratch in PyTorch, using Weights & Biases to log the loss terms and the average reward across iterations.
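The core of PPO is the clipped surrogate objective; here is a minimal sketch of that loss term (illustrative, not the repository's code), where the ratio is pi_new(a|s) / pi_old(a|s).

```python
# PPO clipped surrogate objective.
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, eps=0.2):
    ratio = (new_log_probs - old_log_probs).exp()     # pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - eps, 1 + eps) * advantages
    return -torch.min(unclipped, clipped).mean()      # negate: maximize the surrogate

# Usage with dummy batch values:
loss = ppo_clip_loss(torch.randn(64), torch.randn(64), torch.randn(64))
```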

ViT - Vision Transformers

Implementation of the "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" paper. The MNIST dataset is used as a toy example for classification task. Blog .

License

The code is released with the MIT license.
