🎭 Customer Sentiment Analysis API

Production-ready sentiment analysis API powered by a fine-tuned DistilBERT model. Analyze customer reviews at scale with 90.2% accuracy on the test set.

🎯 What I learned from this project:

✅ HuggingFace Expertise: Model fine-tuning, dataset creation, model hub publishing
✅ Transformer Models: Fine-tuning DistilBERT for domain-specific tasks
✅ Production ML: Model optimization, API deployment, monitoring
✅ MLOps: Training pipelines, model versioning, A/B testing setup
✅ API Development: FastAPI with async support, batch processing
✅ Documentation: Comprehensive model cards, API docs, deployment guides

🌟 Features

graph LR
    A[Customer Review] --> B[FastAPI Endpoint]
    B --> C[Preprocessing]
    C --> D[DistilBERT Model]
    D --> E[Post-processing]
    E --> F[JSON Response]
    
    G[Batch Reviews] --> H[Async Processing]
    H --> D
    
    style D fill:#ffe1e1
    style F fill:#e1ffe1
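
The post-processing step in the diagram is not described further in this README. As a rough sketch, assuming the model emits one raw logit per class and that the API turns them into scores with a softmax, that step could look like the following. The `LABELS` order and the `postprocess` helper are hypothetical and not taken from the repository.

```python
import math

# Hypothetical label order; the real model's id-to-label mapping may differ.
LABELS = ("positive", "negative", "neutral")


def postprocess(logits):
    """Turn raw class logits into a label, a confidence, and per-class scores."""
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    scores = {label: e / total for label, e in zip(LABELS, exps)}
    sentiment = max(scores, key=scores.get)
    return {
        "sentiment": sentiment,
        "confidence": scores[sentiment],
        "scores": scores,
    }


# Made-up logits, for illustration only.
result = postprocess([3.2, -0.4, -0.1])
print(result["sentiment"], round(result["confidence"], 2))
```

The returned dictionary mirrors the `sentiment`, `confidence`, and `scores` fields of the JSON response, with the confidence simply being the highest class score.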

API Capabilities

🚀 Fast Inference: < 50ms response time for single requests (45ms P95 on CPU)
📊 Batch Processing: Handle multiple reviews efficiently
🎯 High Accuracy: 90.2% on test set
📈 Confidence Scores: Get prediction confidence
🔄 Async Support: Non-blocking requests
📝 Comprehensive Logging: Track all predictions
🐳 Docker Ready: One-command deployment

Model Features

⚡ Optimized: Quantized version available (4x smaller)
🌍 Public: Published on HuggingFace Hub
📚 Well-Documented: Complete model card
🧪 Tested: 90+ unit and integration tests
🔧 Flexible: Easy to fine-tune on your data

🚀 Quick Start

Try the Live Demo

🎮 Interactive Demo on HuggingFace Spaces

Run Locally

Docker (Recommended)

git clone https://github.com/IberaSoft/sentiment-analysis-api.git
cd sentiment-analysis-api
docker-compose up -d

# Test
curl -X POST "http://localhost:8000/api/v1/predict" \
  -H "Content-Type: application/json" \
  -d '{"text": "This product is amazing!"}'

Local Development

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000

Visit http://localhost:8000/docs for interactive API documentation.

Use Model Directly

from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="IberaSoft/customer-sentiment-analyzer"
)

result = classifier("This product exceeded my expectations!")
print(result)  # [{'label': 'positive', 'score': 0.9823}]

📡 API Overview

Base URL: http://localhost:8000/api/v1

Interactive Docs: Visit http://localhost:8000/docs for Swagger UI

Quick Example

# Single prediction
curl -X POST "http://localhost:8000/api/v1/predict" \
  -H "Content-Type: application/json" \
  -d '{"text": "Great product!"}'

# Response
{
  "sentiment": "positive",
  "confidence": 0.94,
  "scores": {"positive": 0.94, "negative": 0.03, "neutral": 0.03},
  "processing_time_ms": 35
}
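
The same call can be made from Python. The sketch below uses only the standard library and assumes the request and response shapes shown above; the helper names (`build_predict_request`, `parse_prediction`, `predict`) are illustrative and not part of the project.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # base URL from this README


def build_predict_request(text, base_url=BASE_URL):
    """Build the POST request for /predict without sending it."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_prediction(body):
    """Decode a /predict response body and return (sentiment, confidence)."""
    data = json.loads(body)
    return data["sentiment"], float(data["confidence"])


def predict(text, timeout=5.0):
    """Send one review to a running server and return (sentiment, confidence)."""
    request = build_predict_request(text)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return parse_prediction(resp.read())


# Parse the sample response shown above; no server is needed for this part.
sample = (
    b'{"sentiment": "positive", "confidence": 0.94, '
    b'"scores": {"positive": 0.94, "negative": 0.03, "neutral": 0.03}, '
    b'"processing_time_ms": 35}'
)
print(parse_prediction(sample))
```

With the server running locally, `predict("Great product!")` sends the request; the parsing step is kept separate so it can be checked without a live endpoint.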

Main Endpoints

POST /predict - Analyze single text
POST /predict/batch - Analyze multiple texts (max 100)
GET /model/info - Model information and metrics
GET /health - Health check

Full API documentation: See docs/API.md
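
Because the batch endpoint accepts at most 100 texts, larger workloads have to be split on the client side. This README does not show the request body for /predict/batch, so the sketch below covers only the splitting step; `chunk_texts` is a hypothetical helper.

```python
def chunk_texts(texts, max_batch=100):
    """Yield consecutive slices of `texts`, each at most `max_batch` long."""
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    for start in range(0, len(texts), max_batch):
        yield texts[start:start + max_batch]


reviews = [f"review {i}" for i in range(250)]
batches = list(chunk_texts(reviews))
print([len(b) for b in batches])  # [100, 100, 50]
```

Each slice can then be sent as one batch request, using the body format described in docs/API.md.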

📊 Model Performance

Metrics

Metric	Score
Accuracy	90.2%
F1 Score	0.89
Precision	0.90
Recall	0.89
Inference Time	35ms (CPU)

Confusion Matrix

                Predicted
              Pos  Neu  Neg
Actual Pos  [ 728   45   27 ]
       Neu  [  38  430   32 ]
       Neg  [  22   48  630 ]
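
As a cross-check, accuracy and per-class precision and recall can be recomputed from the matrix. This is a plain-Python sketch over the counts printed above; it does not use the project's evaluation script.

```python
# Rows are actual classes, columns are predicted classes (Pos, Neu, Neg),
# copied from the confusion matrix above.
LABELS = ("Pos", "Neu", "Neg")
MATRIX = [
    [728, 45, 27],
    [38, 430, 32],
    [22, 48, 630],
]


def accuracy(matrix):
    """Fraction of samples on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class(matrix):
    """Return {label: (precision, recall)} for each class."""
    out = {}
    for i, label in enumerate(LABELS):
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        out[label] = (matrix[i][i] / predicted, matrix[i][i] / actual)
    return out


print(f"accuracy = {accuracy(MATRIX):.3f}")
for label, (p, r) in per_class(MATRIX).items():
    print(f"{label}: precision = {p:.3f}, recall = {r:.3f}")
```

On the matrix as printed, 1788 of 2000 samples are on the diagonal, giving an accuracy of 0.894. That is slightly below the 90.2% in the metrics table, so the two may come from different evaluation runs.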

Benchmark Results

Batch Size	Throughput (req/s)	Latency P95 (ms)
1	28	45
8	89	120
32	156	280

Tested on an Intel i7-11700K CPU.
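
For reference, a P95 latency figure can be derived from raw timings as sketched below. This uses the nearest-rank definition of a percentile, which is an assumption; the project's benchmark script may compute it differently. The latency values are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up latencies in milliseconds, for illustration only.
latencies_ms = [31, 33, 35, 34, 36, 32, 38, 41, 35, 44]
print(percentile(latencies_ms, 95))
```

With only ten samples the P95 is simply the slowest request, which is why benchmark runs usually collect many more timings before reporting tail latency.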

🏗️ Architecture

flowchart TB
    subgraph Client
        A[Web/Mobile App]
        B[Backend Service]
    end
    
    subgraph API Layer
        C[FastAPI Server]
        D[Request Validation]
        E[Response Formatting]
    end
    
    subgraph ML Layer
        F[Preprocessing]
        G[DistilBERT Model]
        H[Postprocessing]
    end
    
    subgraph Storage
        I[Model Cache]
        J[Logs]
    end
    
    A --> C
    B --> C
    C --> D
    D --> F
    F --> G
    G --> H
    H --> E
    E --> C
    G <--> I
    C --> J
    
    style G fill:#ffe1e1
    style C fill:#e3f2fd

🛠️ Tech Stack

Component	Technology	Purpose
ML Framework	HuggingFace Transformers	Model training & inference
Base Model	DistilBERT	Pre-trained transformer
API Framework	FastAPI	REST API server
Web Server	Uvicorn	ASGI server
Validation	Pydantic	Request/response validation
Testing	Pytest	Unit & integration tests
Load Testing	Locust	Performance testing
Containerization	Docker	Deployment
CI/CD	GitHub Actions	Automated testing & deployment
Monitoring	Prometheus	Metrics collection

📁 Project Structure

sentiment-analysis-api/
├── app/
│   ├── main.py                  # FastAPI application
│   ├── config.py                # Configuration
│   ├── api/
│   │   └── endpoints/
│   │       ├── predict.py       # Prediction endpoints
│   │       ├── batch.py         # Batch processing
│   │       └── health.py        # Health checks
│   ├── core/
│   │   ├── model.py             # Model loading & inference
│   │   ├── preprocessing.py     # Text preprocessing
│   │   └── cache.py             # Response caching
│   ├── schemas/
│   │   ├── request.py           # Request models
│   │   └── response.py          # Response models
│   └── utils/
│       ├── logger.py            # Logging configuration
│       └── metrics.py           # Prometheus metrics
│
├── training/
│   ├── prepare_dataset.py       # Dataset preparation
│   ├── train.py                 # Model training
│   ├── evaluate.py              # Model evaluation
│   ├── optimize.py              # Model optimization
│   └── configs/
│       └── training_config.yaml
│
├── tests/
│   ├── unit/
│   │   ├── test_preprocessing.py
│   │   ├── test_model.py
│   │   └── test_api.py
│   ├── integration/
│   │   └── test_end_to_end.py
│   └── load/
│       └── locustfile.py
│
├── scripts/
│   ├── download_data.py
│   ├── upload_to_hf.py
│   └── benchmark.py
│
├── notebooks/
│   ├── 01_data_exploration.ipynb
│   ├── 02_model_training.ipynb
│   └── 03_error_analysis.ipynb
│
├── docs/
│   ├── API.md                  # API reference
│   ├── TRAINING.md             # Model training guide
│   ├── DEPLOYMENT.md           # Deployment options
│   ├── SPACES_GUIDE.md         # HuggingFace Spaces setup
│   ├── HF_TOKEN_GUIDE.md       # Token setup guide
│   └── TROUBLESHOOTING.md      # Common issues & solutions
│
├── .github/
│   └── workflows/
│       ├── test.yml
│       └── deploy.yml
│
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── requirements-dev.txt
├── .env.example
├── README.md
└── LICENSE

💻 Development

Setup

git clone https://github.com/IberaSoft/sentiment-analysis-api.git
cd sentiment-analysis-api
python -m venv venv
source venv/bin/activate
pip install -r requirements-dev.txt
pre-commit install

Train Model

cd training
python prepare_dataset.py --output-dir ./data
python train.py --config configs/training_config.yaml

python evaluate.py --model-dir ./models/customer-sentiment-v1
python ../scripts/upload_to_hf.py --model-dir ./models/customer-sentiment-v1

Full training guide: See docs/TRAINING.md

Run Tests

pytest tests/ -v                          # All tests
pytest tests/ --cov=app --cov-report=html # With coverage

🚀 Deployment

Docker

docker-compose up -d

HuggingFace Spaces

1. Fork this repository
2. Create a Space on HuggingFace
3. Connect the GitHub repo
4. The Space deploys automatically

Spaces guide: See docs/SPACES_GUIDE.md

Cloud Deployment

Supports AWS, GCP, Azure, DigitalOcean, and more.

Full deployment guide: See docs/DEPLOYMENT.md

📈 Monitoring

Metrics: Available at /metrics (Prometheus format)

Logging: Structured JSON logs for easy parsing
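
To illustrate what structured JSON logging can look like, here is a minimal sketch built on the standard `logging` module. The field names, the `JsonFormatter` class, and the `sentiment_api` logger name are assumptions; the project's own configuration lives in app/utils/logger.py and may differ.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object on one line."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical extra fields attached through `extra=` at the call site.
        for key in ("sentiment", "confidence", "processing_time_ms"):
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("sentiment_api")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("prediction", extra={"sentiment": "positive", "confidence": 0.94})
```

One JSON object per line keeps the output easy to parse with standard log collectors.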

🔧 Customization

Fine-tune on your own data:

python training/train.py \
  --base-model IberaSoft/customer-sentiment-analyzer \
  --dataset your-username/your-dataset \
  --output-dir ./models/custom-model

See docs/TRAINING.md for details.

📚 Documentation

API Reference - Complete API documentation
Training Guide - Train and fine-tune the model
Deployment Guide - Deploy to production
Spaces Guide - HuggingFace Spaces setup
Token Guide - HuggingFace token setup
Troubleshooting - Common issues & solutions

🗺️ Roadmap

v1.1 - Enhanced Features (Next)

Multi-language support (Spanish, French, German)
Aspect-based sentiment analysis
Confidence calibration improvements
Real-time model updates

v1.2 - Performance (Planned)

ONNX optimization
Model distillation (smaller, faster)
GPU batch processing
Response streaming

v2.0 - Advanced (Future)

Multi-model ensemble
Active learning pipeline
A/B testing framework
Explainability (SHAP, LIME)

v3.0 - Enterprise (Future)

Multi-tenancy support
Custom model training UI
Advanced analytics dashboard
SLA monitoring

🤝 Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.

Ways to contribute:

🐛 Report bugs
💡 Suggest features
📝 Improve documentation
🧪 Add tests
🎨 Improve UI/UX

📝 License

This project is licensed under the MIT License - see LICENSE for details.

🙏 Acknowledgments

HuggingFace for the Transformers library and model hub
FastAPI team for an excellent framework
DistilBERT authors for the efficient base model
The community for feedback and contributions

Project Links:

🤗 Model
📊 Dataset
🎮 Demo

⭐ Star this repo if you find it useful!

Built with ❤️ by an aspiring AI/ML Engineer

Try the live demo: HuggingFace Spaces
