Production-ready sentiment analysis API powered by a fine-tuned DistilBERT model. Analyze customer reviews at scale with 90.2% accuracy on the test set.
- HuggingFace Expertise: Model fine-tuning, dataset creation, model hub publishing
- Transformer Models: Fine-tuning DistilBERT for domain-specific tasks
- Production ML: Model optimization, API deployment, monitoring
- MLOps: Training pipelines, model versioning, A/B testing setup
- API Development: FastAPI with async support, batch processing
- Documentation: Comprehensive model cards, API docs, deployment guides
graph LR
A[Customer Review] --> B[FastAPI Endpoint]
B --> C[Preprocessing]
C --> D[DistilBERT Model]
D --> E[Post-processing]
E --> F[JSON Response]
G[Batch Reviews] --> H[Async Processing]
H --> D
style D fill:#ffe1e1
style F fill:#e1ffe1
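The flow in the diagram can be sketched end to end with a stubbed model. This is a minimal illustration, not the project's code: the `LABELS` order, the whitespace rules in `preprocess`, and `fake_model` (a stand-in that returns fixed logits instead of running DistilBERT) are all assumptions. Only the response fields mirror the JSON response shown in this README.

```python
import math
import time

# Assumed label order; the real model's id-to-label mapping may differ.
LABELS = ("positive", "negative", "neutral")


def preprocess(text):
    """Collapse runs of whitespace and trim both ends (assumed rules)."""
    return " ".join(text.split())


def fake_model(text):
    """Stand-in for DistilBERT: returns fixed logits, one per label."""
    return [3.0, -0.5, -0.5]


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def postprocess(logits, elapsed_ms):
    """Pick the most likely label and build the response dictionary."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {
        "sentiment": LABELS[best],
        "confidence": round(probs[best], 2),
        "scores": {label: round(p, 2) for label, p in zip(LABELS, probs)},
        "processing_time_ms": round(elapsed_ms),
    }


def predict(text, model=fake_model):
    """Run preprocessing, the model, and post-processing, timing the model call."""
    start = time.perf_counter()
    logits = model(preprocess(text))
    elapsed_ms = (time.perf_counter() - start) * 1000
    return postprocess(logits, elapsed_ms)
```

Swapping `fake_model` for a callable that wraps the real model keeps the rest of the flow unchanged, which is the main reason to separate the stages this way.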
- Fast Inference: < 50ms response time
- Batch Processing: Handle multiple reviews efficiently
- High Accuracy: 90.2% on test set
- Confidence Scores: Get prediction confidence
- Async Support: Non-blocking requests
- Comprehensive Logging: Track all predictions
- Docker Ready: One-command deployment
- Optimized: Quantized version available (4x smaller)
- Public: Published on HuggingFace Hub
- Well-Documented: Complete model card
- Tested: 90+ unit and integration tests
- Flexible: Easy to fine-tune on your data
Interactive Demo on HuggingFace Spaces
Docker (Recommended)
git clone https://github.com/IberaSoft/sentiment-analysis-api.git
cd sentiment-analysis-api
docker-compose up -d
# Test
curl -X POST "http://localhost:8000/api/v1/predict" \
-H "Content-Type: application/json" \
-d '{"text": "This product is amazing!"}'
Local Development
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000
Visit http://localhost:8000/docs for interactive API documentation.
from transformers import pipeline
classifier = pipeline(
"sentiment-analysis",
model="IberaSoft/customer-sentiment-analyzer"
)
result = classifier("This product exceeded my expectations!")
print(result) # [{'label': 'positive', 'score': 0.9823}]
Base URL: http://localhost:8000/api/v1
Interactive Docs: Visit /docs for Swagger UI
# Single prediction
curl -X POST "http://localhost:8000/api/v1/predict" \
-H "Content-Type: application/json" \
-d '{"text": "Great product!"}'
# Response
{
"sentiment": "positive",
"confidence": 0.94,
"scores": {"positive": 0.94, "negative": 0.03, "neutral": 0.03},
"processing_time_ms": 35
}
- POST /predict - Analyze a single text
- POST /predict/batch - Analyze multiple texts (max 100)
- GET /model/info - Model information and metrics
- GET /health - Health check
Full API documentation: See docs/API.md
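For callers that prefer Python over curl, the single-prediction request can be built with the standard library alone. This is a sketch under stated assumptions: `build_predict_request`, `predict`, and `chunk` are hypothetical helper names, and the base URL is the local one given above. The request body for `/predict/batch` is not shown in this README, so the sketch only splits texts into groups that respect the 100-item limit and leaves the batch payload to docs/API.md.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # local base URL from this README
MAX_BATCH = 100  # batch limit stated in the endpoint list


def build_predict_request(text):
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, timeout=10.0):
    """Send one review to /predict; requires the API to be running locally."""
    request = build_predict_request(text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def chunk(texts, size=MAX_BATCH):
    """Split a list of texts into lists of at most `size` items."""
    return [texts[i:i + size] for i in range(0, len(texts), size)]
```

With the server up, `predict("Great product!")` should return a dictionary shaped like the JSON response above. Keeping request construction separate from sending makes the payload easy to inspect without a live server.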
| Metric | Score |
|---|---|
| Accuracy | 90.2% |
| F1 Score | 0.89 |
| Precision | 0.90 |
| Recall | 0.89 |
| Inference Time | 35ms (CPU) |
| Actual \ Predicted | Pos | Neu | Neg |
|---|---|---|---|
| Pos | 728 | 45 | 27 |
| Neu | 38 | 430 | 32 |
| Neg | 22 | 48 | 630 |
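The summary metrics can be recomputed from the confusion matrix. The sketch below assumes rows are actual classes and columns are predicted classes, as the layout suggests, and uses support-weighted averaging, which is an assumption because the README does not say how precision, recall, and F1 were averaged. As printed, the matrix gives 1788 correct out of 2000 (89.4% accuracy), with weighted precision, recall, and F1 rounding to 0.90, 0.89, and 0.89. The 89.4% is slightly below the 90.2% in the metrics table, so the two figures may come from different evaluation runs.

```python
LABELS = ("Pos", "Neu", "Neg")

# Rows are actual classes, columns are predicted classes (copied from the README).
MATRIX = [
    [728, 45, 27],
    [38, 430, 32],
    [22, 48, 630],
]


def accuracy(matrix):
    """Fraction of samples on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class(matrix):
    """Precision, recall, F1, and support for each class."""
    stats = {}
    for i, label in enumerate(LABELS):
        true_pos = matrix[i][i]
        actual = sum(matrix[i])                    # row sum
        predicted = sum(row[i] for row in matrix)  # column sum
        precision = true_pos / predicted
        recall = true_pos / actual
        f1 = 2 * precision * recall / (precision + recall)
        stats[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": actual,
        }
    return stats


def weighted(stats, key):
    """Average a per-class metric, weighting each class by its support."""
    total = sum(s["support"] for s in stats.values())
    return sum(s[key] * s["support"] for s in stats.values()) / total
```

Per-class values are worth a look as well: the neutral class has the lowest recall (430 of 500), which matches the usual difficulty of separating neutral from mildly positive or negative reviews.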
| Batch Size | Throughput (req/s) | Latency P95 (ms) |
|---|---|---|
| 1 | 28 | 45 |
| 8 | 89 | 120 |
| 32 | 156 | 280 |
Tested on Intel i7-11700K
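Throughput and P95 latency of the kind reported in the table can be measured with a short timing loop. This is a generic sketch, not the project's scripts/benchmark.py: `percentile` and `benchmark` are hypothetical helpers, the nearest-rank percentile definition is an assumption, and the loop is sequential, so it does not model concurrent clients.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def benchmark(fn, payload, runs=50):
    """Call fn(payload) `runs` times and report calls per second and P95 latency in ms."""
    latencies_ms = []
    start = time.perf_counter()
    for _ in range(runs):
        call_start = time.perf_counter()
        fn(payload)
        latencies_ms.append((time.perf_counter() - call_start) * 1000)
    elapsed = time.perf_counter() - start
    return {
        "throughput": runs / elapsed,
        "p95_ms": percentile(latencies_ms, 95),
    }
```

Passing a function that sends one request to the API, with a payload of 1, 8, or 32 reviews, would produce rows comparable in shape to the table, though the absolute numbers depend on hardware and on how requests are counted.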
flowchart TB
subgraph Client
A[Web/Mobile App]
B[Backend Service]
end
subgraph API Layer
C[FastAPI Server]
D[Request Validation]
E[Response Formatting]
end
subgraph ML Layer
F[Preprocessing]
G[DistilBERT Model]
H[Postprocessing]
end
subgraph Storage
I[Model Cache]
J[Logs]
end
A --> C
B --> C
C --> D
D --> F
F --> G
G --> H
H --> E
E --> C
G <--> I
C --> J
style G fill:#ffe1e1
style C fill:#e3f2fd
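Two pieces of the diagram, request validation and the model cache, can be illustrated with the standard library. The project lists Pydantic for validation, so the dataclass below is only a stand-in. `PredictRequest`, `MAX_CHARS`, and `get_model` are hypothetical names, the 5000-character limit is an invented example value, and the loader returns a placeholder object rather than loading DistilBERT.

```python
from dataclasses import dataclass
from functools import lru_cache

MAX_CHARS = 5000  # hypothetical limit, not taken from the project


@dataclass(frozen=True)
class PredictRequest:
    """Stand-in for a Pydantic request model with a single text field."""

    text: str

    def __post_init__(self):
        # Reject missing, non-string, or whitespace-only input before inference.
        if not isinstance(self.text, str) or not self.text.strip():
            raise ValueError("text must be a non-empty string")
        if len(self.text) > MAX_CHARS:
            raise ValueError(f"text must be at most {MAX_CHARS} characters")


@lru_cache(maxsize=1)
def get_model():
    """Create the model once; later calls return the same cached object."""
    return object()  # placeholder for the loaded model
```

Caching the loader means the expensive load happens on the first request only (or at startup, if `get_model()` is called there), and every later request reuses the same instance.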
| Component | Technology | Purpose |
|---|---|---|
| ML Framework | HuggingFace Transformers | Model training & inference |
| Base Model | DistilBERT | Pre-trained transformer |
| API Framework | FastAPI | REST API server |
| Web Server | Uvicorn | ASGI server |
| Validation | Pydantic | Request/response validation |
| Testing | Pytest | Unit & integration tests |
| Load Testing | Locust | Performance testing |
| Containerization | Docker | Deployment |
| CI/CD | GitHub Actions | Automated testing & deployment |
| Monitoring | Prometheus | Metrics collection |
sentiment-analysis-api/
├── app/
│   ├── main.py                    # FastAPI application
│   ├── config.py                  # Configuration
│   ├── api/
│   │   └── endpoints/
│   │       ├── predict.py         # Prediction endpoints
│   │       ├── batch.py           # Batch processing
│   │       └── health.py          # Health checks
│   ├── core/
│   │   ├── model.py               # Model loading & inference
│   │   ├── preprocessing.py       # Text preprocessing
│   │   └── cache.py               # Response caching
│   ├── schemas/
│   │   ├── request.py             # Request models
│   │   └── response.py            # Response models
│   └── utils/
│       ├── logger.py              # Logging configuration
│       └── metrics.py             # Prometheus metrics
│
├── training/
│   ├── prepare_dataset.py         # Dataset preparation
│   ├── train.py                   # Model training
│   ├── evaluate.py                # Model evaluation
│   ├── optimize.py                # Model optimization
│   └── configs/
│       └── training_config.yaml
│
├── tests/
│   ├── unit/
│   │   ├── test_preprocessing.py
│   │   ├── test_model.py
│   │   └── test_api.py
│   ├── integration/
│   │   └── test_end_to_end.py
│   └── load/
│       └── locustfile.py
│
├── scripts/
│   ├── download_data.py
│   ├── upload_to_hf.py
│   └── benchmark.py
│
├── notebooks/
│   ├── 01_data_exploration.ipynb
│   ├── 02_model_training.ipynb
│   └── 03_error_analysis.ipynb
│
├── docs/
│   ├── API.md                     # API reference
│   ├── TRAINING.md                # Model training guide
│   ├── DEPLOYMENT.md              # Deployment options
│   ├── SPACES_GUIDE.md            # HuggingFace Spaces setup
│   ├── HF_TOKEN_GUIDE.md          # Token setup guide
│   └── TROUBLESHOOTING.md         # Common issues & solutions
│
├── .github/
│   └── workflows/
│       ├── test.yml
│       └── deploy.yml
│
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── requirements-dev.txt
├── .env.example
├── README.md
└── LICENSE
git clone https://github.com/IberaSoft/sentiment-analysis-api.git
cd sentiment-analysis-api
python -m venv venv
source venv/bin/activate
pip install -r requirements-dev.txt
pre-commit install
cd training
python prepare_dataset.py --output-dir ./data
python train.py --config configs/training_config.yaml
python evaluate.py --model-dir ./models/customer-sentiment-v1
python ../scripts/upload_to_hf.py --model-dir ./models/customer-sentiment-v1
Full training guide: See docs/TRAINING.md
pytest tests/ -v # All tests
pytest tests/ --cov=app --cov-report=html # With coverage
docker-compose up -d
- Fork this repository
- Create Space on HuggingFace
- Connect GitHub repo
- Auto-deploys!
Spaces guide: See docs/SPACES_GUIDE.md
Supports AWS, GCP, Azure, DigitalOcean, and more.
Full deployment guide: See docs/DEPLOYMENT.md
Metrics: Available at /metrics (Prometheus format)
Logging: Structured JSON logs for easy parsing
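Structured JSON logging can be set up with the standard `logging` module. This is a sketch, not the contents of app/utils/logger.py: `JsonFormatter`, `get_logger`, the logger name, and the choice of extra fields (borrowed from the API response shown in this README) are assumptions.

```python
import json
import logging

# Extra fields copied into the log line when present (assumed set).
EXTRA_FIELDS = ("sentiment", "confidence", "processing_time_ms")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        for key in EXTRA_FIELDS:
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


def get_logger(name="sentiment_api"):
    """Return a logger that writes JSON lines to stderr, configured only once."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

A call such as `get_logger().info("prediction", extra={"sentiment": "positive", "confidence": 0.94})` then emits one JSON object per line, which log collectors can parse without custom patterns.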
Fine-tune on your own data:
python training/train.py \
--base-model IberaSoft/customer-sentiment-analyzer \
--dataset your-username/your-dataset \
--output-dir ./models/custom-model
See docs/TRAINING.md for details.
- API Reference - Complete API documentation
- Training Guide - Train and fine-tune the model
- Deployment Guide - Deploy to production
- Spaces Guide - HuggingFace Spaces setup
- Token Guide - HuggingFace token setup
- Troubleshooting - Common issues & solutions
- Model on HuggingFace
- Dataset on HuggingFace
- Live Demo
- Multi-language support (Spanish, French, German)
- Aspect-based sentiment analysis
- Confidence calibration improvements
- Real-time model updates
- ONNX optimization
- Model distillation (smaller, faster)
- GPU batch processing
- Response streaming
- Multi-model ensemble
- Active learning pipeline
- A/B testing framework
- Explainability (SHAP, LIME)
- Multi-tenancy support
- Custom model training UI
- Advanced analytics dashboard
- SLA monitoring
Contributions welcome! See CONTRIBUTING.md for guidelines.
Ways to contribute:
- Report bugs
- Suggest features
- Improve documentation
- Add tests
- Improve UI/UX
This project is licensed under the MIT License - see LICENSE for details.
- HuggingFace for Transformers library and model hub
- FastAPI team for excellent framework
- DistilBERT authors for the efficient base model
- Community for feedback and contributions
Project Links:
Built with ❤️ by an aspiring AI/ML Engineer
Try the live demo: HuggingFace Spaces