
Federal Student Loan AI Assistant


Tech stack: Python · TypeScript · Next.js · FastAPI · Docker · LangChain · LangGraph · OpenAI · Qdrant · RAGAS · Cohere


AIE7 Certification Challenge - Advanced RAG system for federal student loan customer service

An intelligent assistant that combines official federal loan policies with real customer experiences to provide comprehensive guidance on student loan questions, repayment options, forgiveness programs, and servicer issues.

Visuals

Screenshots: App Frontend Page · Frontend Chat Interface

🚀 New: Complete Docker Orchestration

Get the entire RAG system running with a single command! All services (Vector DB + Backend API + Jupyter + Frontend) are now fully containerized with automated service management, health checks, and persistent volumes.


Written report can be found here

Loom video can be found here



🚀 Quick Start

Get the entire RAG system running in 2 simple steps:

1. Environment Setup

# Copy environment template and add your API keys
cp .env-example .env

# Edit .env file with your required API keys:
# OPENAI_API_KEY=your_key_here (Required)
# COHERE_API_KEY=your_key_here (Required)  
# TAVILY_API_KEY=your_key_here (Optional - for external search)
# LANGCHAIN_API_KEY=your_key_here (Optional - for tracing)

2. Start All Services with Docker

# 🚀 Interactive Menu (Default) - Choose your startup mode
./start-services.sh

# The script presents 4 startup options:
# 1. 🚀 Full startup (recommended)
#    • Stops existing containers
#    • Cleans up: dangling images, build cache
#    • Rebuilds and starts all services
#    • Includes: Backend + Jupyter + Frontend + Qdrant
#
# 2. ⚡ Quick restart (development)
#    • Skips Docker cleanup (faster)
#    • Rebuilds and starts all services
#    • Best for active development
#
# 3. 🔬 Backend + Jupyter only
#    • Skips frontend service
#    • Ideal for notebook experiments
#    • Faster startup
#
# 4. 🎯 Custom configuration
#    • Choose individual options interactively
#    • Skip cleanup? Skip frontend?

# ⚡ Non-Interactive Mode (for automation/scripts)
./start-services.sh --mode=full              # Full startup with cleanup
./start-services.sh --mode=quick             # Quick restart (no cleanup)
./start-services.sh --mode=backend           # Backend + Jupyter only
./start-services.sh --non-interactive        # Non-interactive full startup

# 🔧 Additional flags
./start-services.sh --skip-cleanup           # Custom: skip cleanup
./start-services.sh --no-frontend            # Custom: skip frontend
./start-services.sh --help                   # Show all options

# Alternative: Manual Docker Compose
docker compose up --build -d

🎉 Single Command Deployment! All services start automatically with:

  • ✅ Service Dependencies - Proper startup ordering
  • ✅ Health Checks - Automated service validation
  • ✅ Data Persistence - Volumes for cache and data
  • ✅ Network Isolation - Dedicated Docker network
  • ✅ Multi-stage Builds - Optimized container images
  • ✅ Automatic Cleanup - Docker image/cache management

Services Available: Frontend at http://localhost:3000, Backend API at http://localhost:8000 (docs at /docs), Jupyter Lab at http://localhost:8888, and the Qdrant dashboard at http://localhost:6333/dashboard (see the Port Usage table below).

⏹️ Stop All Services

# 🛑 Interactive Menu (Default) - Choose your stop method
./stop-services.sh

# The script presents 4 stop options:
# 1. 🛑 Standard stop (recommended for daily use)
#    • Stops all containers
#    • Cleans up: dangling images, build cache
#    • Preserves: stopped containers, volumes, used images
#
# 2. ⏸️  Quick pause (fastest restart)
#    • Stops containers only
#    • No cleanup performed
#    • Next startup will be faster
#
# 3. 🔧 Deep cleanup (reclaim disk space)
#    • Stops and removes containers
#    • Cleans up: dangling images, build cache
#    • Preserves: volumes (your data)
#
# 4. 💣 Nuclear reset (⚠️  DATA LOSS WARNING)
#    • Removes containers AND volumes
#    • ⚠️  DELETES: Vector DB, cache, notebooks
#    • Use only when starting completely fresh

# ⚡ Non-Interactive Mode (for automation/scripts)
./stop-services.sh --mode=standard           # Standard stop with cleanup
./stop-services.sh --mode=quick              # Quick pause (no cleanup)
./stop-services.sh --mode=deep               # Deep cleanup
./stop-services.sh --mode=nuclear            # Nuclear reset ⚠️
./stop-services.sh --non-interactive         # Non-interactive standard

# 🔧 Legacy flags (backward compatible)
./stop-services.sh --skip-cleanup            # Maps to --mode=quick
./stop-services.sh --remove                  # Maps to --mode=deep
./stop-services.sh --clean                   # Maps to --mode=nuclear
./stop-services.sh --help                    # Show all options

# 🔧 Alternative: Direct Docker Compose
docker compose down                   # Stop and remove containers (keep volumes)

📊 Initialization Progress

The system loads a hybrid dataset (749 documents → 2,172 chunks → vector embeddings):

  • ⏱️ Startup Time: 60-90 seconds for full RAG agent initialization
  • 📈 Progress Monitoring: Watch logs via docker compose logs -f backend
  • 🎯 Ready Indicator: Backend health endpoint returns "status": "healthy"
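
If you want to script the wait instead of watching logs, a minimal readiness poll could look like the sketch below. The /health path and the requests library are assumptions here; adjust to whatever route the FastAPI app actually exposes.

# Minimal readiness poll -- assumes the backend exposes a /health route on port 8000.
import time
import requests

def wait_for_backend(url="http://localhost:8000/health", timeout=120):
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            payload = requests.get(url, timeout=5).json()
            if payload.get("status") == "healthy":
                print("Backend ready:", payload)
                return True
        except requests.RequestException:
            pass  # backend still initializing the RAG agent
        time.sleep(5)
    return False

wait_for_backend()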

🔧 Service Management

# View logs for all services
docker compose logs -f

# View logs for specific service
docker compose logs -f backend
docker compose logs -f jupyter
docker compose logs -f frontend
docker compose logs -f qdrant

# Check service health status
docker compose ps

# Restart specific service
docker compose restart backend

# 🧹 Service Management with Interactive Menus
./start-services.sh              # Interactive startup menu (4 options)
./stop-services.sh               # Interactive stop menu (4 options)
./start-services.sh --help       # Show all startup options
./stop-services.sh --help        # Show all shutdown options

# Non-interactive examples (for scripts/automation):
./start-services.sh --mode=quick       # Quick restart for development
./stop-services.sh --mode=quick        # Fast stop without cleanup

# Scale services (if needed)
docker compose up --scale backend=2 -d

3. Open & Use the System

Once all services are running, you can access the web interface at http://localhost:3000, the interactive API docs at http://localhost:8000/docs, Jupyter Lab at http://localhost:8888, and the Qdrant dashboard at http://localhost:6333/dashboard.

🎭 Frontend Features

The web interface includes advanced persona-based interactions:

  • 👥 Multi-Persona Support: Student, Parent, Financial Counselor, Loan Servicer roles
  • 📝 Context-Aware Questions: Pre-built question templates per persona
  • 💬 Session Management: Persistent chat sessions across role changes
  • 📊 Performance Transparency: Response times, token usage, and source relevance scores
  • 🎨 Professional UI: Clean design with role-specific styling and tooltips
  • ⚡ Real-time Responses: Live streaming of responses with cancel functionality

4. Running RAG Experiments

Option 1: Using Docker (Recommended)

# Jupyter is already running at http://localhost:8888
# Open the notebooks directly in your browser

Option 2: Local Development

# Install dependencies locally
uv sync

# Start Jupyter from project root
uv run jupyter lab

Once inside Jupyter Lab, open the main evaluation notebook from the notebooks/ directory (see the Research Notebooks section below).

✨ Core Features

🎯 AI-Powered Student Loan Expertise

  • Federal Loan Expert - Grounded in official policies + real customer complaints
  • Multi-Persona Interface - Role-based interactions (Student, Parent, Counselor, etc.)
  • Context-Aware Responses - Understands user focus and provides targeted guidance
  • Source Transparency - Shows relevance scores and document sources for all answers
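
As a rough illustration of where those relevance scores come from, the sketch below queries the Qdrant collection directly through LangChain. The collection name, embedding model, and local URL are assumptions, not the project's actual configuration.

from langchain_openai import OpenAIEmbeddings
from langchain_qdrant import QdrantVectorStore

# Collection name, embedding model, and URL are illustrative assumptions.
store = QdrantVectorStore.from_existing_collection(
    collection_name="student_loan_docs",
    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
    url="http://localhost:6333",
)

# Each hit carries a similarity score, which is what the UI surfaces as "source relevance".
for doc, score in store.similarity_search_with_score("Do I qualify for PSLF?", k=4):
    print(f"{score:.3f}  {doc.metadata.get('source', 'unknown')}")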

πŸ” Advanced RAG Architecture

  • Hybrid Dataset - PDF policies + CSV complaints for comprehensive knowledge
  • Multiple Retrieval Methods - Naive, Multi-Query, Parent-Document, Contextual Compression
  • Agent Orchestration - LangGraph-based tool selection and workflow management (sketched after this list)
  • Performance Evaluation - RAGAS metrics with comprehensive benchmarking
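
A stripped-down sketch of what a LangGraph StateGraph skeleton looks like. The state fields and the single placeholder node are illustrative only; the project's actual agent wires real retriever tools into the graph.

from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class AgentState(TypedDict):
    question: str
    answer: str

def answer_node(state: AgentState) -> dict:
    # A real node would select a retriever tool and call the LLM; this stub
    # keeps the example self-contained.
    return {"answer": f"(placeholder answer for: {state['question']})"}

graph = StateGraph(AgentState)
graph.add_node("answer", answer_node)
graph.add_edge(START, "answer")
graph.add_edge("answer", END)
agent = graph.compile()

print(agent.invoke({"question": "Am I eligible for PSLF?", "answer": ""}))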

🚀 Production-Ready Deployment

  • Complete Docker Orchestration - Multi-service containerization with health checks
  • One-Command Deployment - Automated service management and startup
  • Scalable Architecture - Horizontal backend scaling (docker compose --scale) with load balancing support
  • Real-time Monitoring - Qdrant dashboard, health endpoints, and comprehensive logging

💬 Enhanced User Experience

  • Interactive Chat Interface - Clean, responsive web interface with session management
  • Role-Based Personas - Tailored question templates and response styles
  • Performance Metrics - Real-time response times, token usage, and source tracking
  • RESTful API - Production-ready /ask endpoint with comprehensive metrics

πŸ“ Project Structure

β”œβ”€β”€ πŸ“ Core Application
β”‚   β”œβ”€β”€ src/backend/              # FastAPI server with RAG endpoint
β”‚   β”œβ”€β”€ src/core/                 # RAG retrieval implementations
β”‚   β”œβ”€β”€ src/agents/               # LangGraph agent orchestration
β”‚   └── src/utils/                # Utilities and helper functions
β”œβ”€β”€ 🎨 Frontend
β”‚   └── frontend/                 # Next.js chat interface  
β”œβ”€β”€ πŸ“Š Data & Analysis
β”‚   β”œβ”€β”€ data/                     # Federal loan PDFs + complaints CSV
β”‚   β”œβ”€β”€ notebooks/                # Jupyter research & evaluation
β”‚   β”œβ”€β”€ golden-masters/           # Generated test datasets
β”‚   └── metrics/                  # Performance evaluation results
β”œβ”€β”€ 🐳 Docker Infrastructure
β”‚   β”œβ”€β”€ docker-compose.yml        # Multi-service orchestration
β”‚   β”œβ”€β”€ start-services.sh         # Automated deployment script
β”‚   β”œβ”€β”€ stop-services.sh          # Graceful shutdown script
β”‚   └── setup.sh                  # Development setup utilities
β”œβ”€β”€ πŸ“š Documentation
β”‚   β”œβ”€β”€ docs/                     # Project documentation
β”‚   β”œβ”€β”€ README.md                 # Main project documentation
β”‚   └── CLAUDE.md                 # Development guidelines
└── βš™οΈ Configuration
    β”œβ”€β”€ .env-example              # Environment variables template
    β”œβ”€β”€ pyproject.toml            # Python dependencies (uv)
    └── uv.lock                   # Locked dependency versions

🔗 API Usage

POST /ask - Ask any federal student loan question

{
  "question": "What are income-driven repayment plans?",
  "max_response_length": 2000
}

Response includes:

  • Generated answer with contextual sources and relevance scores
  • Comprehensive performance metrics (response time, tokens used, retrieval method)
  • Source document transparency with relevance scoring
  • Tool usage tracking and agent decision logs
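
A minimal client call, assuming the backend is reachable on localhost:8000. Only the question and max_response_length fields are taken from the request body above; the exact response schema depends on the backend, so the example just prints the raw JSON.

import requests

# Illustrative client call against the /ask endpoint.
resp = requests.post(
    "http://localhost:8000/ask",
    json={"question": "What are income-driven repayment plans?", "max_response_length": 2000},
    timeout=120,
)
resp.raise_for_status()
print(resp.json())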

🛠 Development

This project implements cutting-edge RAG techniques with comprehensive evaluation:

🧪 Advanced RAG Research

  • Hybrid Dataset: Official policies + real customer scenarios (4,547 → 825 → 480 quality-filtered records)
  • Multiple Retrievers: Naive (best performer), Multi-Query, Parent-Document, Contextual Compression
  • Agent Framework: LangGraph with StateGraph orchestration and tool selection
  • Evaluation Pipeline: RAGAS metrics with 6 core measurements (all on a higher-is-better scale; example below)
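
For reference, a RAGAS evaluation call in its simplest form looks roughly like this. The toy single-row dataset and the four metrics shown are illustrative (the project's notebooks evaluate six metrics against the golden-master datasets), and column names vary slightly between RAGAS versions.

from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, context_precision, context_recall, faithfulness

# Toy single-row dataset purely for illustration -- real runs use the golden-master datasets.
rows = {
    "question": ["What is Public Service Loan Forgiveness?"],
    "answer": ["PSLF forgives the remaining Direct Loan balance after 120 qualifying payments."],
    "contexts": [["PSLF forgives Direct Loans after 120 qualifying monthly payments under a qualifying repayment plan."]],
    "ground_truth": ["PSLF forgives remaining Direct Loan balances after 120 qualifying payments."],
}

result = evaluate(
    Dataset.from_dict(rows),
    metrics=[faithfulness, answer_relevancy, context_precision, context_recall],
)
print(result)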

📊 Performance Analysis

  • Retrieval Ranking: Comprehensive benchmarking across all methods
  • Visualization Tools: Heatmap generation for metric pattern analysis (see the sketch after this list)
  • Golden Master datasets: Cached evaluation datasets to avoid regeneration
  • Performance Tracking: Response times, token usage, and retrieval quality metrics
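
The heatmap idea can be reproduced in a few lines with seaborn and matplotlib (both are in the dependency list). The scores below are made-up placeholders for illustration, not real evaluation results; substitute the values produced in metrics/.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Placeholder scores only -- replace with real RAGAS results from metrics/.
scores = pd.DataFrame(
    {"faithfulness": [0.91, 0.84], "context_recall": [0.88, 0.79]},
    index=["naive", "multi_query"],
)
sns.heatmap(scores, annot=True, vmin=0, vmax=1, cmap="viridis")
plt.title("Retriever vs. RAGAS metric")
plt.tight_layout()
plt.savefig("retriever_metric_heatmap.png")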

🔬 Research Notebooks

  • Agentic RAG Evaluation: Main notebook for agent-based experiments
  • Retriever Comparison: Traditional retrieval method benchmarking
  • Performance Visualization: Heatmap and metric analysis tools

Component Documentation

Docker Architecture

πŸ—οΈ Multi-Service Architecture:

  • Qdrant (Vector Database) - Persistent storage for embeddings
  • Backend (FastAPI + RAG Agent) - Python API with LangGraph orchestration
  • Jupyter (Analysis Environment) - Notebook server for experiments
  • Frontend (Next.js) - React-based chat interface

🔧 Advanced Features:

  • Multi-stage Docker builds with uv for optimized Python dependencies
  • Service health checks with automatic restart policies
  • Persistent volumes for data, cache, and Jupyter notebooks
  • Network isolation with dedicated Docker network
  • Environment-based configuration for local/production deployment

🚀 Deployment Options:

# 🌟 Interactive Deployment (Recommended)
./start-services.sh                    # Interactive menu with 4 startup modes
./stop-services.sh                     # Interactive menu with 4 stop methods

# ⚡ Non-Interactive Deployment (for automation)
./start-services.sh --mode=full        # Full startup with cleanup
./start-services.sh --mode=quick       # Quick restart (no cleanup)
./start-services.sh --mode=backend     # Backend + Jupyter only
./start-services.sh --mode=custom --skip-cleanup --no-frontend  # Custom flags
./start-services.sh --help             # See all options

# 🛑 Non-Interactive Shutdown
./stop-services.sh --mode=standard     # Standard stop with cleanup
./stop-services.sh --mode=quick        # Quick pause (no cleanup)
./stop-services.sh --mode=deep         # Deep cleanup (remove containers)
./stop-services.sh --mode=nuclear      # Nuclear reset ⚠️ DELETES ALL DATA
./stop-services.sh --help              # See all options

# Manual: Individual services
docker compose up qdrant backend jupyter frontend -d

# Development: Backend + Qdrant only
docker compose up qdrant backend -d

📋 Requirements

System Requirements

  • Docker (20.10+) - For containerized deployment
  • Docker Compose (2.0+) - For multi-container orchestration
  • Git (2.25+) - For cloning the repository
  • Modern Browser - Chrome, Firefox, Safari, or Edge
  • Memory: 4GB+ RAM (8GB+ recommended for optimal performance)
  • Storage: 3GB+ free space (for images, data, and dependencies)

Development Requirements (if not using Docker)

Backend Requirements

  • Python 3.11+ recommended (tested with 3.11; 3.8 is the minimum supported)
  • System Dependencies (for native installation):
    • gcc and g++ (build tools)
    • curl (for health checks)
    • build-essential (Linux/WSL)

Frontend Requirements

  • Node.js 18+ with npm
  • Next.js 14.2+ (React framework)
  • TypeScript 5.1+ (for type safety)

API Keys

Create a .env file in the project root with:

# Required API Keys
OPENAI_API_KEY=your_openai_key_here          # For LLM and embeddings
COHERE_API_KEY=your_cohere_key_here          # For reranking functionality

# Optional API Keys
TAVILY_API_KEY=your_tavily_key_here          # For external search (optional)
LANGCHAIN_API_KEY=your_langsmith_key_here    # For tracing/monitoring (optional)
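
A small sanity check you can drop into a script or notebook to fail fast when a required key is missing (python-dotenv is already in the backend dependency list):

import os
from dotenv import load_dotenv

# Fail fast if a required key is missing from .env.
load_dotenv()
missing = [key for key in ("OPENAI_API_KEY", "COHERE_API_KEY") if not os.getenv(key)]
if missing:
    raise SystemExit(f"Missing required API keys in .env: {', '.join(missing)}")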

General environment setup dependencies

Core Tools

  • Git (2.25+) - Version control and repository cloning
  • Docker (20.10+) - Container runtime and orchestration
  • Docker Compose (2.0+) - Multi-container application management
  • curl or wget - For API testing and health checks

Native environment setup dependencies

Python Dependencies (Backend)

The backend requires 50+ Python packages including:

  • AI/ML: langchain, langgraph, openai, cohere, ragas
  • Vector DB: qdrant-client, langchain-qdrant
  • Document Processing: pypdf2, pymupdf, unstructured
  • Web Framework: fastapi, uvicorn, pydantic
  • Data Science: numpy, pandas, matplotlib, seaborn
  • Search: tavily-python, rank-bm25
  • Utilities: python-dotenv, joblib, tqdm

Node.js Dependencies (Frontend)

  • React 18.2+ with Next.js framework
  • UI Components: lucide-react (icons)
  • Styling: tailwindcss, autoprefixer, postcss
  • Development: TypeScript, ESLint, development server

Hardware Recommendations

  • Memory: 4GB+ RAM (8GB+ recommended for better performance)
  • Storage: 2GB+ free space for dependencies and data
  • CPU: Multi-core processor recommended for faster RAG processing
  • Network: Stable internet connection for API calls

Port Usage

Service             Port   URL                               Purpose
Qdrant              6333   http://localhost:6333/dashboard   Vector database dashboard
Backend API         8000   http://localhost:8000             RAG API endpoints
API Documentation   8000   http://localhost:8000/docs        OpenAPI/Swagger docs
Jupyter Lab         8888   http://localhost:8888             Notebook environment
Frontend            3000   http://localhost:3000             Web interface

🔒 Network Configuration:

  • All services run on isolated Docker network student-loan-network
  • Only necessary ports exposed to host machine
  • Internal service communication via Docker DNS
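
Inside the Docker network, backend code can reach Qdrant by service name. A minimal sketch, assuming the default service hostname qdrant and reusing the QDRANT_URL override mentioned in the scaling section:

import os
from qdrant_client import QdrantClient

# The service name "qdrant" resolves via Docker DNS inside the project network;
# QDRANT_URL (used for external clusters) overrides it when set.
client = QdrantClient(url=os.getenv("QDRANT_URL", "http://qdrant:6333"))
print(client.get_collections())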

Data Requirements & Architecture

📊 Hybrid Dataset Pipeline:

📄 PDF Documents (4 files, ~4MB)
     ↓ DirectoryLoader + PyMuPDFLoader
   269 PDF pages
     ↓ RecursiveCharacterTextSplitter (750 chars)
   615 PDF chunks

📊 CSV Complaints (~12MB)
     ↓ CSVLoader + Quality Filtering
4,547 raw → 825 loaded → 480 filtered (58% retention)
     ↓ RecursiveCharacterTextSplitter (750 chars)
 1,557 CSV chunks

     ↓ Combined Hybrid Dataset
 2,172 total chunks → OpenAI Embeddings → Qdrant Vector Store
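
The PDF half of this pipeline maps onto a few lines of LangChain. A sketch of the idea, with the chunk overlap value assumed since only the 750-character chunk size is documented:

from langchain_community.document_loaders import DirectoryLoader, PyMuPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Chunk overlap is an assumption -- only the 750-character chunk size is documented above.
pages = DirectoryLoader("data/", glob="**/*.pdf", loader_cls=PyMuPDFLoader).load()
splitter = RecursiveCharacterTextSplitter(chunk_size=750, chunk_overlap=100)
pdf_chunks = splitter.split_documents(pages)
print(f"{len(pages)} pages -> {len(pdf_chunks)} chunks")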

🎯 Quality Filtering (Complaints):

  • ❌ Narratives < 100 characters (34 removed)
  • ❌ Excessive redaction (>5 XXXX tokens) (311 removed)
  • ❌ Empty/None/N/A content (0 removed)
  • ✅ Final retention: 58.2% (480/825 loaded records)
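
For illustration, the filter rules above amount to a predicate like the following; how the complaint narrative column is accessed in the CFPB CSV is an assumption here.

# Illustrative re-implementation of the filter rules above.
def keep_complaint(narrative) -> bool:
    text = "" if narrative is None else str(narrative).strip()
    if text.lower() in {"", "none", "n/a"}:    # empty / None / N/A content
        return False
    if len(text) < 100:                        # too short to be useful context
        return False
    if text.count("XXXX") > 5:                 # excessively redacted
        return False
    return True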

💾 Storage & Memory:

  • Vector embeddings: ~39.2MB in Qdrant (in-memory)
  • Docker volumes: Persistent cache, data, and notebooks
  • Total footprint: ~2GB including all container images

🚀 Performance & Scaling

⚡ Response Times (Typical):

  • RAG Query Processing: 3-8 seconds
  • Backend Initialization: 60-90 seconds
  • Container Startup: 10-30 seconds per service

📈 Throughput Capabilities:

  • Concurrent Users: 10-50 (single backend instance)
  • Vector Search: Sub-second retrieval from 2,172 chunks
  • Memory Usage: ~500MB per backend container

🔧 Scaling & Maintenance:

# Scale backend for higher throughput
docker compose up --scale backend=3 -d

# Production: External Qdrant cluster
export QDRANT_URL=http://your-qdrant-cluster:6333
./start-services.sh

# 🧹 Docker disk space management with interactive menus
./start-services.sh              # Interactive menu: choose Full (with cleanup) or Quick (no cleanup)
./stop-services.sh               # Interactive menu: 4 cleanup levels

# Non-interactive cleanup options
./start-services.sh --mode=full        # Cleans before starting
./start-services.sh --mode=quick       # Skip cleanup (faster)
./stop-services.sh --mode=standard     # Cleans dangling images/cache
./stop-services.sh --mode=quick        # No cleanup (fastest)
./stop-services.sh --mode=deep         # Aggressive cleanup
./stop-services.sh --mode=nuclear      # Full cleanup ⚠️ DELETES ALL DATA

# Manual cleanup if needed
docker system prune -f           # Remove unused images/containers
docker builder prune -f          # Clear build cache

# 📊 Monitor disk usage
docker system df                     # Show Docker disk usage
docker images | grep student-loan    # Show project images

Ready to help students navigate federal loan complexities with AI-powered guidance! 🎓

