This project implements a Character-Level Recurrent Neural Network (RNN) using PyTorch to solve a sequence generation task. The model is trained to learn the statistical patterns of names from a corpus (names.txt) and then generate new, unique, and plausible-sounding names one character at a time.
The project demonstrates fundamental concepts of sequence modeling, including character tokenization, fixed-length padding, and temperature-based sampling for text generation.
The core of the project is a single-layer RNN model:
- Embedding Layer: Maps each input character index into a dense vector space.
- RNN Layer: A standard PyTorch nn.RNN layer (easily swappable with nn.LSTM or nn.GRU) processes the input sequence and captures sequential dependencies.
- Linear Layer (Output): Maps the hidden state of the RNN at each time step to one score (logit) per vocabulary entry; a softmax over these scores gives the probability distribution over the character set.
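The three layers above can be sketched as follows. This is a minimal illustration rather than the contents of model.py: the layer sizes, the attribute names, and the `batch_first=True` layout are assumptions chosen for the example.

```python
import torch
import torch.nn as nn


class CharRNN(nn.Module):
    """Embedding -> RNN -> Linear. Sizes and attribute names are illustrative."""

    def __init__(self, vocab_size, embed_size=32, hidden_size=64):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        # nn.GRU is a drop-in replacement here; nn.LSTM also works, but its
        # state is an (h, c) tuple rather than a single tensor.
        self.rnn = nn.RNN(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, hidden=None):
        # x: (batch, seq_len) tensor of integer character indices
        emb = self.embedding(x)              # (batch, seq_len, embed_size)
        out, hidden = self.rnn(emb, hidden)  # (batch, seq_len, hidden_size)
        logits = self.fc(out)                # (batch, seq_len, vocab_size)
        return logits, hidden


model = CharRNN(vocab_size=30)
dummy = torch.randint(0, 30, (4, 10))   # 4 fake sequences of 10 characters
logits, hidden = model(dummy)
probs = torch.softmax(logits, dim=-1)   # per-step distribution over characters
```

Returning raw logits and applying the softmax outside the model keeps the same forward pass usable for both training (cross-entropy on logits) and sampling.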
The model is trained on sequences augmented with special tokens (<, >, _) representing Start-of-Sequence (SOS), End-of-Sequence (EOS), and Padding (PAD).
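As a rough sketch of how names might be wrapped in these tokens and padded into a rectangular batch: the function names `build_vocab` and `encode_batch`, the index order of the special tokens, and padding to the longest name in the batch are all assumptions for illustration, not necessarily what dataset.py does. The sketch also assumes no name contains `<`, `>`, or `_`.

```python
SOS, EOS, PAD = "<", ">", "_"


def build_vocab(names):
    """Return (stoi, itos); special tokens take the first indices (assumed order)."""
    chars = sorted(set("".join(names)))
    itos = [PAD, SOS, EOS] + chars
    stoi = {ch: i for i, ch in enumerate(itos)}
    return stoi, itos


def encode_batch(names, stoi):
    """Wrap each name in SOS/EOS and right-pad with PAD to a common length."""
    max_len = max(len(n) for n in names) + 2  # +2 for SOS and EOS
    batch = []
    for name in names:
        tokens = [SOS] + list(name) + [EOS]
        tokens += [PAD] * (max_len - len(tokens))
        batch.append([stoi[t] for t in tokens])
    return batch


names = ["ann", "bo"]
stoi, itos = build_vocab(names)
batch = encode_batch(names, stoi)
# batch[0] encodes "<ann>", batch[1] encodes "<bo>_"
```

The resulting list of equal-length rows can be passed directly to `torch.tensor` to obtain the index matrix the model consumes.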
The project follows a modular structure, separating concerns into dedicated files:
rnn_names_project/
├── data/
│ └── names.txt # The dataset file containing names (one per line).
├── config.py # Configuration constants (hidden size, batch size, learning rate, paths).
├── dataset.py # Logic for text processing, vocabulary building, and converting batches of names into PyTorch tensors (matrix padding).
├── model.py # Definition of the CharRNN neural network class.
├── train.py # The main script for launching the training loop and saving model artifacts.
└── generate.py # Script for loading the trained model and generating new names using temperature sampling.
Install the required libraries:
    pip install torch numpy
Ensure the names.txt file is placed inside the data/ directory as specified in config.py.
Run the training script to learn the name patterns:
    python train.py
This script will:
- Load names and build the character vocabulary.
- Train the CharRNN for the number of steps defined in config.py.
- Save the model weights (char_rnn_model.pth) and the vocabulary object (vocab.pt) required for inference.
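A single training step for next-character prediction might look like the sketch below. The README does not specify the loss or optimizer, so cross-entropy with the PAD index ignored and the Adam optimizer are assumptions, as are the tiny layer sizes, the learning rate, and the hard-coded example batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

PAD_IDX = 0      # assumed index of the padding token
VOCAB_SIZE = 7   # assumed toy vocabulary size

# Tiny stand-in for the Embedding -> RNN -> Linear stack
embedding = nn.Embedding(VOCAB_SIZE, 16)
rnn = nn.RNN(16, 32, batch_first=True)
head = nn.Linear(32, VOCAB_SIZE)
params = (
    list(embedding.parameters())
    + list(rnn.parameters())
    + list(head.parameters())
)

optimizer = torch.optim.Adam(params, lr=1e-2)
# PAD positions contribute nothing to the loss
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD_IDX)

# Two encoded names: SOS, characters, EOS, then padding
batch = torch.tensor([[1, 3, 5, 5, 2], [1, 4, 6, 2, 0]])
# The target at each step is the next character in the sequence
inputs, targets = batch[:, :-1], batch[:, 1:]


def train_step():
    optimizer.zero_grad()
    out, _ = rnn(embedding(inputs))
    logits = head(out)  # (batch, seq_len, vocab)
    loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1))
    loss.backward()
    optimizer.step()
    return loss.item()


losses = [train_step() for _ in range(50)]
```

Shifting the batch by one position to form `inputs` and `targets` is what turns a plain list of names into a supervised next-character task.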
Use the generate.py script to sample new names from the trained model:
    python generate.py
The script demonstrates name generation at different temperatures (which control the randomness of the output: lower values give more conservative names, higher values more varied ones) and with an optional prefix that seeds the generation process.
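The effect of temperature can be illustrated without the model itself. The sketch below is plain Python and uses hypothetical helper names (`temperature_probs`, `sample_index`); generate.py presumably performs the equivalent operation on the model's output logits, but its exact code is not shown here.

```python
import math
import random


def temperature_probs(logits, temperature=1.0):
    """Softmax of logits / temperature (hypothetical helper)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature=1.0, rng=random):
    """Draw one character index from the temperature-scaled distribution."""
    probs = temperature_probs(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.0]                    # made-up scores for 3 characters
sharp = temperature_probs(logits, 0.1)      # nearly all mass on index 0
flat = temperature_probs(logits, 100.0)     # close to uniform
idx = sample_index(logits, temperature=0.8, rng=random.Random(0))
```

In a full generation loop, the sampled index would be fed back into the model as the next input until EOS is produced. With a prefix, the prefix characters would first be run through the model to set the hidden state before sampling begins.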