Decoders Explained

Python PyTorch

Decoder diagram

Understand the transformer architecture by learning about decoders with detailed explanations on the architecture and a mini-project

How to use

This repository is not meant to be used for its code alone, but as a summary of the different classes and papers you can find on the internet. It is a complete yet detailed guide to understanding the basics of how decoders work within the Transformer architecture, and how they can be used as a standalone architecture for certain tasks.

You will find:

  1. An explanations.ipynb notebook containing all the information about decoders and their code implementation.

  2. A model.py file containing the whole implementation in a single file.

If you are using this repository on its own to learn about decoders, I recommend first checking my other repository, [Encoders_Explained].
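As a taste of what the notebook covers, here is a minimal, framework-free sketch of masked (causal) scaled dot-product self-attention, the core operation of a decoder block. It uses NumPy rather than PyTorch for brevity; all names and dimensions are illustrative and are not taken from this repository's code.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Masked scaled dot-product self-attention (decoder-style).

    x: (seq_len, d_model) input embeddings.
    w_q, w_k, w_v: (d_model, d_k) projection matrices.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)  # (seq_len, seq_len) attention logits
    # Causal mask: position i may only attend to positions j <= i,
    # so future tokens are hidden during autoregressive decoding.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Row-wise softmax over the unmasked positions.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
d_model, d_k, seq_len = 8, 4, 5
x = rng.standard_normal((seq_len, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_k)) for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (5, 4)
```

Note that because of the causal mask, the first output position depends only on the first input token; a full decoder block would wrap this in multiple heads, add cross-attention (when paired with an encoder), residual connections, layer normalization, and a feed-forward network, as detailed in the notebook.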

References

Original Paper

  • Vaswani, A., et al. (2017). "Attention Is All You Need". arXiv:1706.03762. [Paper]

Video Resources

  • Hugging Face. (2022). "Transformer: decodeur". [YouTube]
  • Machine Learning Studio. "A Dive Into Multihead Attention, Self-Attention and Cross-Attention". [YouTube]
  • Machine Learning Studio. "Self-Attention Using Scaled Dot-Product Approach". [YouTube]

Found this helpful?

If this repository helped you understand decoders, consider giving it a star!
