A lightweight statistical language model for question-answering and text generation.
NarrowMind combines statistical n-gram modeling with modern language modeling techniques (temperature sampling, top-k sampling, TF-IDF) to provide fast, memory-efficient language understanding without neural networks.
- Question Answering: Answers questions in which a question word (who, what, where, when, why, how) acts as a wildcard
- Text Generation: Generates contextually relevant continuations
- Multi-gram Ensemble: Weighted combination of bigrams and trigrams
- TF-IDF Semantic Search: Finds semantically similar content
- Temperature & Top-k Sampling: The same sampling controls used by GPT-style models, for controlled randomness
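
As a rough illustration of the multi-gram ensemble idea, the sketch below counts bigrams and trigrams and scores each candidate next word as a weighted sum of its bigram and trigram relative frequencies. The `NgramCounts` type, the `add_weighted` helper, and the weights passed in are hypothetical names and values chosen for this example; they are not taken from NarrowMind's source.

```rust
use std::collections::HashMap;

/// Next-word counts keyed by the preceding one or two words.
/// (Hypothetical type for illustration, not NarrowMind's actual structure.)
#[derive(Default)]
struct NgramCounts {
    bigrams: HashMap<String, HashMap<String, u32>>,
    trigrams: HashMap<(String, String), HashMap<String, u32>>,
}

impl NgramCounts {
    /// Counts every bigram and trigram in whitespace-tokenized, lowercased text.
    fn train(&mut self, text: &str) {
        let tokens: Vec<String> = text.split_whitespace().map(|t| t.to_lowercase()).collect();
        for w in tokens.windows(2) {
            let next = self.bigrams.entry(w[0].clone()).or_default();
            *next.entry(w[1].clone()).or_default() += 1;
        }
        for w in tokens.windows(3) {
            let next = self.trigrams.entry((w[0].clone(), w[1].clone())).or_default();
            *next.entry(w[2].clone()).or_default() += 1;
        }
    }

    /// Scores each candidate as
    /// `w_bi * P(word | prev1) + w_tri * P(word | prev2, prev1)`.
    /// The weights are supplied by the caller; the values are an assumption.
    fn ensemble_scores(&self, prev2: &str, prev1: &str, w_bi: f64, w_tri: f64) -> HashMap<String, f64> {
        let mut scores = HashMap::new();
        let key = (prev2.to_lowercase(), prev1.to_lowercase());
        add_weighted(&mut scores, self.bigrams.get(&key.1), w_bi);
        add_weighted(&mut scores, self.trigrams.get(&key), w_tri);
        scores
    }
}

/// Adds `weight * relative_frequency` for every word in `counts`.
fn add_weighted(scores: &mut HashMap<String, f64>, counts: Option<&HashMap<String, u32>>, weight: f64) {
    if let Some(counts) = counts {
        let total: u32 = counts.values().sum();
        for (word, &count) in counts {
            *scores.entry(word.clone()).or_insert(0.0) += weight * f64::from(count) / f64::from(total);
        }
    }
}
```

Giving the trigram term a larger weight favors continuations seen in the longer context, while the bigram term still contributes candidates when the trigram context is rare or unseen.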
- Place your training text in `input.txt`
- Run `cargo run`
- Ask questions using question words as wildcards:

      > who was getting ready
      > what did mia realize
      > quit

graph LR
A[Training Data] --> B[Tokenization]
B --> C[N-gram Stats]
B --> D[TF-IDF Vectors]
B --> E[Sentence Index]
F[User Query] --> G{Direct Match?}
G -->|Yes| H[Answer]
G -->|No| I[Power Set + TF-IDF Search]
I --> J[Top Sentences]
J --> K[Generate Candidates]
K --> L[TF-IDF Boost]
L --> M[Temperature + Top-k]
M --> H
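
The TF-IDF search step in the diagram can be sketched as follows: each sentence and the query become sparse TF-IDF vectors, and sentences are ranked by cosine similarity to the query. This is a minimal sketch under stated assumptions, not NarrowMind's implementation; the function names, the tokenizer, and the smoothed IDF formula `ln((1 + N) / (1 + df)) + 1` are choices made for this example.

```rust
use std::collections::{HashMap, HashSet};

/// Lowercases and splits on non-alphanumeric characters (an assumed tokenizer).
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Smoothed inverse document frequency per term: ln((1 + N) / (1 + df)) + 1.
fn idf_table(sentences: &[Vec<String>]) -> HashMap<String, f64> {
    let n = sentences.len() as f64;
    let mut df: HashMap<String, f64> = HashMap::new();
    for sentence in sentences {
        let unique: HashSet<&String> = sentence.iter().collect();
        for term in unique {
            *df.entry(term.clone()).or_insert(0.0) += 1.0;
        }
    }
    df.into_iter()
        .map(|(term, d)| (term, ((1.0 + n) / (1.0 + d)).ln() + 1.0))
        .collect()
}

/// Sparse TF-IDF vector: raw term count times IDF; unknown terms are skipped.
fn tfidf_vector(tokens: &[String], idf: &HashMap<String, f64>) -> HashMap<String, f64> {
    let mut v: HashMap<String, f64> = HashMap::new();
    for t in tokens {
        if let Some(&w) = idf.get(t) {
            *v.entry(t.clone()).or_insert(0.0) += w;
        }
    }
    v
}

/// Cosine similarity between two sparse vectors; 0.0 if either is empty.
fn cosine(a: &HashMap<String, f64>, b: &HashMap<String, f64>) -> f64 {
    let dot: f64 = a.iter().filter_map(|(t, x)| b.get(t).map(|y| x * y)).sum();
    let norm = |v: &HashMap<String, f64>| v.values().map(|x| x * x).sum::<f64>().sqrt();
    let denom = norm(a) * norm(b);
    if denom == 0.0 { 0.0 } else { dot / denom }
}

/// Returns `(sentence index, similarity)` pairs, most similar first.
fn rank_sentences(sentences: &[&str], query: &str) -> Vec<(usize, f64)> {
    let tokenized: Vec<Vec<String>> = sentences.iter().map(|s| tokenize(s)).collect();
    let idf = idf_table(&tokenized);
    let q = tfidf_vector(&tokenize(query), &idf);
    let mut ranked: Vec<(usize, f64)> = tokenized
        .iter()
        .enumerate()
        .map(|(i, toks)| (i, cosine(&q, &tfidf_vector(toks, &idf))))
        .collect();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked
}
```

Note that TF-IDF similarity is driven by shared terms, so a sentence with no words in common with the query scores 0.0 regardless of its meaning.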
Algorithm Order:
- Direct Pattern Matching - Fast exact text search
- Power Set Matching - Finds sentences matching word combinations
- TF-IDF Similarity - Semantic vector search (fallback)
- Multi-gram Ensemble - Combines bigrams & trigrams with weights
- TF-IDF Relevance Boost - 1.0-3.5x multiplier for contextually relevant words
- Temperature & Top-k Sampling - Controlled randomness
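
The last two steps above can be sketched as one function: scale each candidate's score by a relevance boost, keep the top-k candidates, then sample with a temperature. Several details here are assumptions made for illustration: the linear mapping from a relevance value in [0, 1] to the 1.0-3.5x range, the names `relevance_boost` and `sample_next_word`, and passing the uniform random draw `u` as an argument (so the sketch needs no crate and stays deterministic to test). In the real program the draw would presumably come from the `rand` crate.

```rust
use std::collections::HashMap;

/// Maps a relevance value in [0, 1] onto a 1.0x to 3.5x multiplier.
/// The linear mapping is an assumption; only the range comes from the README.
fn relevance_boost(relevance: f64) -> f64 {
    1.0 + 2.5 * relevance.clamp(0.0, 1.0)
}

/// Picks a next word from `candidates` (word, base score).
/// `u` is a uniform random draw in [0, 1) supplied by the caller.
fn sample_next_word(
    candidates: &[(&str, f64)],
    relevance: &HashMap<&str, f64>,
    temperature: f64,
    top_k: usize,
    u: f64,
) -> Option<String> {
    // 1. Apply the relevance boost and drop non-positive scores.
    let mut scored: Vec<(&str, f64)> = candidates
        .iter()
        .map(|&(word, score)| {
            let r = relevance.get(word).copied().unwrap_or(0.0);
            (word, score * relevance_boost(r))
        })
        .filter(|&(_, s)| s > 0.0)
        .collect();

    // 2. Top-k: keep only the k highest-scoring candidates.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);
    if scored.is_empty() {
        return None;
    }

    // 3. Temperature: weight_i = (score_i / max_score)^(1 / T).
    //    Lower T sharpens the distribution toward the best candidate.
    let t = temperature.max(1e-6);
    let max_log = scored[0].1.ln();
    let weights: Vec<f64> = scored
        .iter()
        .map(|&(_, s)| ((s.ln() - max_log) / t).exp())
        .collect();
    let total: f64 = weights.iter().sum();

    // 4. Inverse-CDF sampling using the caller's uniform draw.
    let target = u.clamp(0.0, 1.0) * total;
    let mut cumulative: f64 = 0.0;
    for (&(word, _), w) in scored.iter().zip(&weights) {
        cumulative += *w;
        if target < cumulative {
            return Some(word.to_string());
        }
    }
    scored.last().map(|&(word, _)| word.to_string())
}
```

With `top_k = 1` the function always returns the highest-scoring candidate, and as the temperature approaches zero the result converges to the same choice for almost every draw.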
Training: "Mia was getting ready for school. She realized she forgot her homework."
Query: "who was getting ready"
Response: "Mia was getting ready for school."
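
One way the direct-match step could produce this response is sketched below: drop the question words from the query, then return the first training sentence that contains the remaining words as a contiguous phrase. This is a minimal sketch, assuming a case-insensitive substring match; `split_sentences` and `direct_match` are hypothetical helpers, not functions from NarrowMind.

```rust
/// Question words treated as wildcards, as listed in the feature overview.
const QUESTION_WORDS: [&str; 6] = ["who", "what", "where", "when", "why", "how"];

/// Splits text into sentences, keeping the terminating punctuation.
fn split_sentences(text: &str) -> Vec<&str> {
    text.split_inclusive(|c: char| matches!(c, '.' | '!' | '?'))
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .collect()
}

/// Returns the first sentence that contains the query's non-wildcard words
/// as a contiguous, case-insensitive phrase, or `None` if there is no match.
fn direct_match<'a>(sentences: &[&'a str], query: &str) -> Option<&'a str> {
    let pattern: Vec<String> = query
        .split_whitespace()
        .map(|w| w.to_lowercase())
        .filter(|w| !QUESTION_WORDS.contains(&w.as_str()))
        .collect();
    if pattern.is_empty() {
        return None; // the query held only wildcards, nothing to match on
    }
    let needle = pattern.join(" ");
    sentences
        .iter()
        .copied()
        .find(|s| s.to_lowercase().contains(&needle))
}
```

For the query `who was getting ready`, the phrase `was getting ready` is found verbatim in the first sentence. A query such as `what did mia realize` has no verbatim match in this training text, so it would fall through to the later stages.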
let mut model = LanguageModel::new(3);
model.set_temperature(0.8); // Lower = more deterministic
model.set_top_k(20); // Limit to top 20 candidates
model.train(&training_data);

- Rust 1.70+
- Training data in `input.txt`
- Dependencies: `rand` crate only
| Feature | NarrowMind | GPT |
|---|---|---|
| Architecture | Statistical n-grams | Neural network |
| Memory | ~MBs | ~GBs to TBs |
| Speed | Instant | Slower |
| GPU Required | No | Yes |
| Temperature/Top-k | ✅ Yes | ✅ Yes |
| Semantic Search | ✅ TF-IDF | ✅ Embeddings |
NarrowMind: Think small, understand deeply. 🧠