tfs-mt
Transformer from scratch for Machine Translation


▶️ Getting started · 📖 Documentation · 🤗 Hugging Face · 🎬 Demo


This project implements the Transformer architecture from scratch, with Machine Translation as the use case. It is intended primarily as an educational resource and as a functional implementation of the architecture and the training/inference logic.

Getting Started

From pip

pip install tfs-mt

From source

Prerequisites

The source build relies on uv for environment and dependency management, so install it before running the steps below.

Steps

git clone https://github.com/Giovo17/tfs-mt.git
cd tfs-mt

uv sync

cp .env.example .env
# Edit .env file with your configuration

Usage

Training

To start training the model with the default configuration:

uv run src/train.py
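
At its core, training an encoder-decoder Transformer for translation is a teacher-forced loop: the decoder receives the target shifted right and is trained to predict the target shifted left. The sketch below illustrates one such step in PyTorch style; model, src, tgt, and pad_id are illustrative names, not tfs-mt's actual API.

import torch
import torch.nn.functional as F

def train_step(model, optimizer, src, tgt, pad_id):
    # Decoder input is the target shifted right; the label is shifted left,
    # so the model predicts token t+1 from tokens <= t (teacher forcing).
    decoder_in, labels = tgt[:, :-1], tgt[:, 1:]
    logits = model(src, decoder_in)          # (batch, seq, vocab)
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        labels.reshape(-1),
        ignore_index=pad_id,                 # don't penalize padding positions
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()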

Inference

To run inference using the trained model from the Hugging Face repository:

uv run src/inference.py
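
Inference itself is an autoregressive loop: the source sentence is encoded once, then the decoder extends the target prefix one token at a time until it emits end-of-sequence. A minimal greedy-decoding sketch (PyTorch style; model.encode, model.decode, bos_id, and eos_id are illustrative names, not the package's actual API):

import torch

@torch.no_grad()
def greedy_decode(model, src, bos_id, eos_id, max_len=128):
    # Encode the source once; reuse the encoder memory at every step.
    memory = model.encode(src)
    ys = torch.full((src.size(0), 1), bos_id, dtype=torch.long)
    for _ in range(max_len - 1):
        logits = model.decode(ys, memory)    # (batch, seq, vocab)
        next_tok = logits[:, -1].argmax(-1, keepdim=True)
        ys = torch.cat([ys, next_tok], dim=1)
        if (next_tok == eos_id).all():       # every sentence finished
            break
    return ys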

Configuration

All project parameters are configured in src/tfs_mt/configs/config.yml. Key configuration groups include:

  • Model Architecture: model size preset, dropout, GloVe embedding initialization, ...
  • Training: optimizer, learning rate scheduler, number of epochs, ...
  • Data: dataset, dataloader, tokenizer, ...
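
Because the configuration lives in a single YAML file, it can also be inspected or loaded programmatically. A minimal sketch, assuming only that PyYAML is available; the keys shown are hypothetical examples of the three groups above, not the repository's actual schema:

import yaml

with open("src/tfs_mt/configs/config.yml") as f:
    cfg = yaml.safe_load(f)

# Hypothetical keys, for illustration only:
# cfg["model"]["dropout"], cfg["training"]["optimizer"], cfg["data"]["tokenizer"]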

Architecture

For a detailed explanation of the architecture and design choices, please refer to the Architecture Documentation.

Model Sizes

The project supports various model configurations to suit different computational resources:

Parameter       Nano      Small     Base      Original
Encoder Layers  4         6         8         6
Decoder Layers  4         6         8         6
d_model         50        100       300       512
Num Heads       4         6         8         8
d_ff            200       400       800       2048
Norm Type       PostNorm  PostNorm  PostNorm  PostNorm
Dropout         0.1       0.1       0.1       0.1
GloVe Dim       50d       100d      300d      -
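
As a rough way to compare these presets, note that the parameter count of a standard Transformer layer is dominated by the attention projections (about 4·d_model²) and the feed-forward block (about 2·d_model·d_ff). The back-of-the-envelope estimate below ignores embeddings, biases, layer norms, and the decoder's cross-attention, so it undercounts, but it shows the relative scale of the four configurations:

def approx_layer_params(d_model, d_ff):
    # Q, K, V and output projections: 4 * d_model^2
    # two feed-forward matrices:      2 * d_model * d_ff
    return 4 * d_model**2 + 2 * d_model * d_ff

# (layers per stack, d_model, d_ff) taken from the table above
sizes = {"Nano": (4, 50, 200), "Small": (6, 100, 400),
         "Base": (8, 300, 800), "Original": (6, 512, 2048)}
for name, (layers, d_model, d_ff) in sizes.items():
    total = 2 * layers * approx_layer_params(d_model, d_ff)  # encoder + decoder
    print(f"{name}: ~{total / 1e6:.2f}M parameters (excl. embeddings)")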

Documentation

Full documentation is available at https://giovo17.github.io/tfs-mt/.

Citation

If you use tfs-mt in your research or project, please cite:

@software{Spadaro_tfs-mt,
  author = {Spadaro, Giovanni},
  license = {MIT},
  title = {{tfs-mt}},
  url = {https://github.com/Giovo17/tfs-mt}
}