🎭 Customer Sentiment Analysis API

Production-ready sentiment analysis API powered by fine-tuned DistilBERT. Analyze customer reviews at scale with 90%+ accuracy.


🎯 What I learned from this project:

  • ✅ HuggingFace Expertise: Model fine-tuning, dataset creation, model hub publishing
  • ✅ Transformer Models: Fine-tuning DistilBERT for domain-specific tasks
  • ✅ Production ML: Model optimization, API deployment, monitoring
  • ✅ MLOps: Training pipelines, model versioning, A/B testing setup
  • ✅ API Development: FastAPI with async support, batch processing
  • ✅ Documentation: Comprehensive model cards, API docs, deployment guides

🌟 Features

graph LR
    A[Customer Review] --> B[FastAPI Endpoint]
    B --> C[Preprocessing]
    C --> D[DistilBERT Model]
    D --> E[Post-processing]
    E --> F[JSON Response]
    
    G[Batch Reviews] --> H[Async Processing]
    H --> D
    
    style D fill:#ffe1e1
    style F fill:#e1ffe1
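The batch path in the diagram (Batch Reviews, then Async Processing, then the model) can be sketched with the standard library. This is a minimal sketch under assumptions, not the project's implementation: `fake_classify` is a hypothetical stand-in for the DistilBERT call, and `classify_batch` and `max_concurrency` are illustrative names. It relies on `asyncio.to_thread` (Python 3.9+) so that blocking inference does not stall the event loop.

```python
import asyncio
from typing import Callable, Dict, List


def fake_classify(text: str) -> Dict[str, str]:
    """Hypothetical stand-in for the model call; NOT the project's classifier."""
    label = "positive" if "great" in text.lower() else "neutral"
    return {"text": text, "sentiment": label}


async def classify_batch(
    texts: List[str],
    classify: Callable[[str], Dict[str, str]] = fake_classify,
    max_concurrency: int = 4,
) -> List[Dict[str, str]]:
    """Run a blocking classifier over many texts without blocking the event loop."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(text: str) -> Dict[str, str]:
        # Limit how many worker threads run the classifier at once.
        async with semaphore:
            return await asyncio.to_thread(classify, text)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(t) for t in texts))


results = asyncio.run(classify_batch(["Great product!", "It arrived on Tuesday."]))
print([r["sentiment"] for r in results])
```

A real service would pass the model's predict function as `classify`; for GPU inference, a single batched forward pass is usually preferable to per-text threads.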

API Capabilities

  • 🚀 Fast Inference: < 50ms response time
  • 📊 Batch Processing: Handle multiple reviews efficiently
  • 🎯 High Accuracy: 90.2% on test set
  • 📈 Confidence Scores: Get prediction confidence
  • 🔄 Async Support: Non-blocking requests
  • 📝 Comprehensive Logging: Track all predictions
  • 🐳 Docker Ready: One-command deployment

Model Features

  • ⚡ Optimized: Quantized version available (4x smaller)
  • 🌐 Public: Published on HuggingFace Hub
  • 📚 Well-Documented: Complete model card
  • 🧪 Tested: 90+ unit and integration tests
  • 🔧 Flexible: Easy to fine-tune on your data
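The "4x smaller" figure is what you would expect when 32-bit float weights are stored as 8-bit integers. The arithmetic below is a rough sketch only: the README does not state which quantization scheme is used, and the parameter count is an approximate figure for a DistilBERT base model, not a measurement of this model.

```python
# Approximate parameter count for a DistilBERT base model (assumption, not measured).
DISTILBERT_PARAMS = 66_000_000


def model_size_mb(num_params: int, bytes_per_param: int) -> float:
    """Estimate weight storage in megabytes, ignoring file-format overhead."""
    return num_params * bytes_per_param / 1_000_000


fp32_mb = model_size_mb(DISTILBERT_PARAMS, 4)  # float32: 4 bytes per weight
int8_mb = model_size_mb(DISTILBERT_PARAMS, 1)  # int8: 1 byte per weight
ratio = fp32_mb / int8_mb

print(f"fp32: {fp32_mb:.0f} MB, int8: {int8_mb:.0f} MB, ratio: {ratio:.0f}x")
```

In practice some layers often stay in float32, so the real reduction can be somewhat less than the ideal ratio.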

🚀 Quick Start

Try the Live Demo

🎮 Interactive Demo on HuggingFace Spaces

Run Locally

Docker (Recommended)

git clone https://github.com/IberaSoft/sentiment-analysis-api.git
cd sentiment-analysis-api
docker-compose up -d

# Test
curl -X POST "http://localhost:8000/api/v1/predict" \
  -H "Content-Type: application/json" \
  -d '{"text": "This product is amazing!"}'
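The same test request can be sent from Python with only the standard library. This is a minimal sketch that mirrors the curl call above; `build_predict_request` and `predict` are illustrative helper names, not part of the project.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/v1/predict"


def build_predict_request(text: str, url: str = API_URL) -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text: str, url: str = API_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(build_predict_request(text, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network; predict() does.
request = build_predict_request("This product is amazing!")
print(request.get_method(), request.full_url)
```

With the container running, `predict("This product is amazing!")` returns the decoded response as a dictionary.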

Local Development

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000

Visit http://localhost:8000/docs for interactive API documentation.

Use Model Directly

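from transformers import pipeline

# Load the fine-tuned model from the HuggingFace Hub (downloaded and cached on first use)
classifier = pipeline(
    "sentiment-analysis",
    model="IberaSoft/customer-sentiment-analyzer"
)

# The pipeline returns one dict per input with the predicted label and its score
result = classifier("This product exceeded my expectations!")
print(result)  # e.g. [{'label': 'positive', 'score': 0.9823}]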

📡 API Overview

Base URL: http://localhost:8000/api/v1

Interactive Docs: Visit /docs for Swagger UI

Quick Example

# Single prediction
curl -X POST "http://localhost:8000/api/v1/predict" \
  -H "Content-Type: application/json" \
  -d '{"text": "Great product!"}'

# Response
{
  "sentiment": "positive",
  "confidence": 0.94,
  "scores": {"positive": 0.94, "negative": 0.03, "neutral": 0.03},
  "processing_time_ms": 35
}
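On the client side, the response shown above can be parsed into a typed object. This is a minimal sketch using dataclasses: the field names come from the sample response, while the `Prediction` class, `parse_prediction`, and the consistency check are assumptions for illustration, not the project's schema.

```python
import json
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Prediction:
    sentiment: str
    confidence: float
    scores: Dict[str, float]
    processing_time_ms: float


def parse_prediction(raw: str) -> Prediction:
    """Parse a /predict response body and check that it is internally consistent."""
    data = json.loads(raw)
    prediction = Prediction(
        sentiment=data["sentiment"],
        confidence=float(data["confidence"]),
        scores={label: float(score) for label, score in data["scores"].items()},
        processing_time_ms=float(data["processing_time_ms"]),
    )
    # Assumed invariant: the reported sentiment is the highest-scoring label.
    top_label = max(prediction.scores, key=prediction.scores.get)
    if top_label != prediction.sentiment:
        raise ValueError(f"sentiment {prediction.sentiment!r} is not the top score {top_label!r}")
    return prediction


SAMPLE = (
    '{"sentiment": "positive", "confidence": 0.94, '
    '"scores": {"positive": 0.94, "negative": 0.03, "neutral": 0.03}, '
    '"processing_time_ms": 35}'
)
prediction = parse_prediction(SAMPLE)
print(prediction.sentiment, prediction.confidence)
```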

Main Endpoints

  • POST /predict - Analyze a single text
  • POST /predict/batch - Analyze multiple texts (max 100)
  • GET /model/info - Model information and metrics
  • GET /health - Health check

Full API documentation: See docs/API.md
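Because the batch endpoint accepts at most 100 texts, a client with a larger workload needs to split it into chunks first. The sketch below shows one way to do that; `chunk_texts` is an illustrative helper, and the `"texts"` payload key is an assumption, since the request schema is only given in docs/API.md.

```python
from typing import Iterator, List

MAX_BATCH_SIZE = 100  # limit stated for POST /predict/batch


def chunk_texts(texts: List[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` texts, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


reviews = [f"review {i}" for i in range(250)]
batches = list(chunk_texts(reviews))

# Hypothetical request bodies; the "texts" key is an assumption, check docs/API.md.
payloads = [{"texts": batch} for batch in batches]
print([len(batch) for batch in batches])
```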

📊 Model Performance

Metrics

| Metric | Score |
|---|---|
| Accuracy | 90.2% |
| F1 Score | 0.89 |
| Precision | 0.90 |
| Recall | 0.89 |
| Inference Time | 35ms (CPU) |

Confusion Matrix

                Predicted
              Pos  Neu  Neg
Actual Pos  [ 728   45   27 ]
       Neu  [  38  430   32 ]
       Neg  [  22   48  630 ]
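Accuracy, precision, recall, and F1 can be recomputed from the confusion matrix with plain Python. This is a minimal sketch, not the project's `evaluate.py`, and the helper names are illustrative. Computed from the matrix as printed, accuracy comes to 1788/2000 = 89.4%, slightly below the 90.2% in the Metrics table, so the two figures may come from different evaluation runs.

```python
from typing import Dict, List

LABELS = ["positive", "neutral", "negative"]

# Rows are actual classes, columns are predicted classes (Pos, Neu, Neg).
CONFUSION = [
    [728, 45, 27],
    [38, 430, 32],
    [22, 48, 630],
]


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class(matrix: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Precision, recall, and F1 for each class."""
    report = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        actual = sum(matrix[i])                    # row sum: samples of this class
        predicted = sum(row[i] for row in matrix)  # column sum: predictions of this class
        precision = true_pos / predicted
        recall = true_pos / actual
        f1 = 2 * precision * recall / (precision + recall)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


overall_accuracy = accuracy(CONFUSION)
report = per_class(CONFUSION, LABELS)
macro_f1 = sum(m["f1"] for m in report.values()) / len(report)
print(f"accuracy={overall_accuracy:.3f} macro_f1={macro_f1:.3f}")
```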

Benchmark Results

| Batch Size | Throughput (req/s) | Latency P95 (ms) |
|---|---|---|
| 1 | 28 | 45 |
| 8 | 89 | 120 |
| 32 | 156 | 280 |

Tested on Intel i7-11700K

🏗️ Architecture

flowchart TB
    subgraph Client
        A[Web/Mobile App]
        B[Backend Service]
    end
    
    subgraph API Layer
        C[FastAPI Server]
        D[Request Validation]
        E[Response Formatting]
    end
    
    subgraph ML Layer
        F[Preprocessing]
        G[DistilBERT Model]
        H[Postprocessing]
    end
    
    subgraph Storage
        I[Model Cache]
        J[Logs]
    end
    
    A --> C
    B --> C
    C --> D
    D --> F
    F --> G
    G --> H
    H --> E
    E --> C
    G <--> I
    C --> J
    
    style G fill:#ffe1e1
    style C fill:#e3f2fd
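The postprocessing step in the ML layer typically turns raw model logits into the `sentiment`, `confidence`, and `scores` fields of the response. The following is a minimal sketch of that idea using a numerically stable softmax; the label order, the rounding, and the function names are assumptions, not taken from the project's code.

```python
import math
from typing import Dict, List, Sequence

# Assumed label order; the real order depends on the model's configuration.
LABELS = ("negative", "neutral", "positive")


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert logits to probabilities, subtracting the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def postprocess(logits: Sequence[float], labels: Sequence[str] = LABELS) -> Dict[str, object]:
    """Map logits to a response-shaped dict with the top label and per-class scores."""
    probabilities = softmax(logits)
    scores = {label: round(p, 4) for label, p in zip(labels, probabilities)}
    sentiment = max(scores, key=scores.get)
    return {"sentiment": sentiment, "confidence": scores[sentiment], "scores": scores}


result = postprocess([-1.2, -0.8, 2.5])
print(result)
```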

🛠️ Tech Stack

| Component | Technology | Purpose |
|---|---|---|
| ML Framework | HuggingFace Transformers | Model training & inference |
| Base Model | DistilBERT | Pre-trained transformer |
| API Framework | FastAPI | REST API server |
| Web Server | Uvicorn | ASGI server |
| Validation | Pydantic | Request/response validation |
| Testing | Pytest | Unit & integration tests |
| Load Testing | Locust | Performance testing |
| Containerization | Docker | Deployment |
| CI/CD | GitHub Actions | Automated testing & deployment |
| Monitoring | Prometheus | Metrics collection |

📁 Project Structure

sentiment-analysis-api/
├── app/
│   ├── main.py                  # FastAPI application
│   ├── config.py                # Configuration
│   ├── api/
│   │   └── endpoints/
│   │       ├── predict.py       # Prediction endpoints
│   │       ├── batch.py         # Batch processing
│   │       └── health.py        # Health checks
│   ├── core/
│   │   ├── model.py             # Model loading & inference
│   │   ├── preprocessing.py     # Text preprocessing
│   │   └── cache.py             # Response caching
│   ├── schemas/
│   │   ├── request.py           # Request models
│   │   └── response.py          # Response models
│   └── utils/
│       ├── logger.py            # Logging configuration
│       └── metrics.py           # Prometheus metrics
│
├── training/
│   ├── prepare_dataset.py       # Dataset preparation
│   ├── train.py                 # Model training
│   ├── evaluate.py              # Model evaluation
│   ├── optimize.py              # Model optimization
│   └── configs/
│       └── training_config.yaml
│
├── tests/
│   ├── unit/
│   │   ├── test_preprocessing.py
│   │   ├── test_model.py
│   │   └── test_api.py
│   ├── integration/
│   │   └── test_end_to_end.py
│   └── load/
│       └── locustfile.py
│
├── scripts/
│   ├── download_data.py
│   ├── upload_to_hf.py
│   └── benchmark.py
│
├── notebooks/
│   ├── 01_data_exploration.ipynb
│   ├── 02_model_training.ipynb
│   └── 03_error_analysis.ipynb
│
├── docs/
│   ├── API.md                   # API reference
│   ├── TRAINING.md              # Model training guide
│   ├── DEPLOYMENT.md            # Deployment options
│   ├── SPACES_GUIDE.md          # HuggingFace Spaces setup
│   ├── HF_TOKEN_GUIDE.md        # Token setup guide
│   └── TROUBLESHOOTING.md       # Common issues & solutions
│
├── .github/
│   └── workflows/
│       ├── test.yml
│       └── deploy.yml
│
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── requirements-dev.txt
├── .env.example
├── README.md
└── LICENSE

💻 Development

Setup

git clone https://github.com/IberaSoft/sentiment-analysis-api.git
cd sentiment-analysis-api
python -m venv venv
source venv/bin/activate
pip install -r requirements-dev.txt
pre-commit install

Train Model

cd training
python prepare_dataset.py --output-dir ./data
python train.py --config configs/training_config.yaml

python evaluate.py --model-dir ./models/customer-sentiment-v1
python ../scripts/upload_to_hf.py --model-dir ./models/customer-sentiment-v1

Full training guide: See docs/TRAINING.md

Run Tests

pytest tests/ -v                          # All tests
pytest tests/ --cov=app --cov-report=html # With coverage

🚀 Deployment

Docker

docker-compose up -d

HuggingFace Spaces

  1. Fork this repository
  2. Create a Space on HuggingFace
  3. Connect the GitHub repo to the Space
  4. The Space deploys automatically

Spaces guide: See docs/SPACES_GUIDE.md

Cloud Deployment

Supports AWS, GCP, Azure, DigitalOcean, and more.

Full deployment guide: See docs/DEPLOYMENT.md

📈 Monitoring

Metrics: Available at /metrics (Prometheus format)

Logging: Structured JSON logs for easy parsing
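Structured JSON logging can be illustrated with the standard `logging` module. This is a minimal sketch under assumptions, not the project's `logger.py`: the `JsonFormatter` class, the `fields` key passed through `extra`, and the logger name are all hypothetical.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Merge structured fields passed as extra={"fields": {...}} (hypothetical convention).
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


# Write to an in-memory stream here so the output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("sentiment_api_example")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("prediction", extra={"fields": {"sentiment": "positive", "confidence": 0.94}})
entry = json.loads(stream.getvalue())
print(entry["level"], entry["message"], entry["sentiment"])
```

In a running service the handler would normally write to stdout, so that the container runtime or a log shipper can collect one JSON object per line.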

🔧 Customization

Fine-tune on your own data:

python training/train.py \
  --base-model IberaSoft/customer-sentiment-analyzer \
  --dataset your-username/your-dataset \
  --output-dir ./models/custom-model

See docs/TRAINING.md for details.

🗺️ Roadmap

v1.1 - Enhanced Features (Next)

  • Multi-language support (Spanish, French, German)
  • Aspect-based sentiment analysis
  • Confidence calibration improvements
  • Real-time model updates

v1.2 - Performance (Planned)

  • ONNX optimization
  • Model distillation (smaller, faster)
  • GPU batch processing
  • Response streaming

v2.0 - Advanced (Future)

  • Multi-model ensemble
  • Active learning pipeline
  • A/B testing framework
  • Explainability (SHAP, LIME)

v3.0 - Enterprise (Future)

  • Multi-tenancy support
  • Custom model training UI
  • Advanced analytics dashboard
  • SLA monitoring

🤝 Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.

Ways to contribute:

  • 🐛 Report bugs
  • 💡 Suggest features
  • 📝 Improve documentation
  • 🧪 Add tests
  • 🎨 Improve UI/UX

📝 License

This project is licensed under the MIT License - see LICENSE for details.

🙏 Acknowledgments

  • HuggingFace for the Transformers library and model hub
  • The FastAPI team for an excellent framework
  • The DistilBERT authors for the efficient base model
  • The community for feedback and contributions



⭐ Star this repo if you find it useful!

Built with ❤️ by an aspiring AI/ML Engineer


Try the live demo: HuggingFace Spaces
