AI-Powered Conversational Database Assistant with Hybrid Memory System
Transform natural language questions into SQL queries with AI-powered insights, conversation memory, and intelligent optimization suggestions.
- 🧠 RAG-Powered SQL Generation - Semantic schema retrieval using ChromaDB vector database
- 💬 Hybrid Memory System - Redis for short-term + Mem0/Qdrant for long-term semantic memory
- 🔄 Conversation Context - Understands follow-up questions like "filter them to California"
- 📊 AI-Driven Insights - GPT-4 powered data analysis and business recommendations
- ⚡ Query Optimization - Automatic performance suggestions and indexing recommendations
- 📖 Beginner-Friendly Explanations - Plain English SQL explanations
- 🎨 Modern UI - Beautiful React interface with real-time results
- Two-Tier Memory Architecture: Redis (fast, recent) + Mem0/Qdrant (semantic, permanent)
- RAG Implementation: Retrieves only relevant table schemas using semantic search
- Multi-Step AI Pipeline: SQL generation → Execution → Analysis → Insights
- Production-Ready: Error handling, logging, session management, TTL caching
```
┌─────────────────────────────────────────────────┐
│                Frontend (React)                 │
│           Natural Language Interface            │
└────────────────────────┬────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────┐
│                 FastAPI Backend                 │
│  ┌───────────────────────────────────────────┐  │
│  │ 1. Memory Retrieval (Redis + Mem0)        │  │
│  │    • Short-term: Recent conversation      │  │
│  │    • Long-term: Semantic search           │  │
│  └───────────────────┬───────────────────────┘  │
│                      ▼                          │
│  ┌───────────────────────────────────────────┐  │
│  │ 2. Schema Retrieval (RAG/ChromaDB)        │  │
│  │    • Semantic search for tables           │  │
│  │    • Top-k relevant schemas               │  │
│  └───────────────────┬───────────────────────┘  │
│                      ▼                          │
│  ┌───────────────────────────────────────────┐  │
│  │ 3. SQL Generation (GPT-4 + LangChain)     │  │
│  │    • Context-aware query creation         │  │
│  │    • SQLite syntax optimization           │  │
│  └───────────────────┬───────────────────────┘  │
│                      ▼                          │
│  ┌───────────────────────────────────────────┐  │
│  │ 4. Query Execution (SQLite)               │  │
│  │    • Parameterized queries                │  │
│  │    • Performance timing                   │  │
│  └───────────────────┬───────────────────────┘  │
│                      ▼                          │
│  ┌───────────────────────────────────────────┐  │
│  │ 5. AI Analysis (GPT-4)                    │  │
│  │    • Explanation generation               │  │
│  │    • Optimization suggestions             │  │
│  │    • Business insights                    │  │
│  └───────────────────┬───────────────────────┘  │
│                      ▼                          │
│  ┌───────────────────────────────────────────┐  │
│  │ 6. Memory Storage (Redis + Mem0)          │  │
│  │    • Store for future reference           │  │
│  │    • TTL management                       │  │
│  └───────────────────────────────────────────┘  │
└─────────────────────────────────────────────────┘
```
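The six numbered steps can be read as one orchestration function. The sketch below is illustrative, not the project's actual code: each step is injected as a callable stand-in so the control flow is visible on its own, and all names (`answer_question`, `PipelineResult`, the step parameters) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class PipelineResult:
    sql: str
    rows: list
    explanation: str

def answer_question(
    question: str,
    retrieve_memory: Callable[[str], str],     # step 1: Redis + Mem0 context
    retrieve_schemas: Callable[[str], str],    # step 2: ChromaDB RAG lookup
    generate_sql: Callable[[str], str],        # step 3: LLM SQL generation
    execute_sql: Callable[[str], list],        # step 4: SQLite execution
    analyze: Callable[[str, list], str],       # step 5: LLM analysis
    store_memory: Callable[[str, str], None],  # step 6: write back to memory
) -> PipelineResult:
    context = retrieve_memory(question)
    schemas = retrieve_schemas(question)
    # The prompt combines conversation memory, retrieved schemas, and the question
    prompt = f"Context:\n{context}\n\nSchemas:\n{schemas}\n\nQuestion: {question}"
    sql = generate_sql(prompt)
    rows = execute_sql(sql)
    explanation = analyze(sql, rows)
    store_memory(question, sql)
    return PipelineResult(sql=sql, rows=rows, explanation=explanation)
```

Because every step is a plain callable, each stage can be swapped for a stub in tests without touching the others.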
- FastAPI - Modern Python web framework
- LangChain - LLM orchestration and RAG implementation
- OpenAI GPT-4 - SQL generation and analysis
- ChromaDB - Vector database for schema embeddings
- Redis - Fast in-memory short-term conversation cache
- Mem0 - Intelligent long-term memory with semantic search
- Qdrant - Vector database backend for Mem0
- SQLite - Sample retail database
- React 18 - Modern UI library
- TypeScript - Type-safe JavaScript
- CSS-in-JS - Styled components
- text-embedding-3-large - Schema embeddings (OpenAI)
- text-embedding-3-small - Memory embeddings (OpenAI)
- gpt-4 - SQL generation and analysis
- gpt-4o-mini - Memory extraction
- Python 3.11+
- Node.js 18+
- Redis (via Homebrew or Docker)
- OpenAI API Key
```bash
git clone https://github.com/yourusername/sql-query-buddy.git
cd sql-query-buddy

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Create .env file
cat > .env << EOF
OPENAI_API_KEY=your_openai_api_key_here
EOF
```

```bash
# Embed database schemas into ChromaDB
python backend/rag/embed_schema.py
```

Expected output:
```
📥 Loading schema.sql...
🔪 Splitting schema into table chunks...
📋 Found 4 tables to embed.
✅ Embedded table: customers
✅ Embedded table: products
✅ Embedded table: orders
✅ Embedded table: order_items
🎉 All schema embeddings stored successfully!
```
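Before anything is embedded, the script has to split `schema.sql` into one chunk per table so that each `CREATE TABLE` statement becomes its own vector. A minimal, dependency-free sketch of that chunking step (the function name and regex are illustrative, not the project's exact code, and the regex assumes simple statements without nested parentheses):

```python
import re

def split_schema(schema_sql: str) -> dict[str, str]:
    """Split a schema dump into {table_name: create_statement} chunks."""
    chunks = {}
    # Match each "CREATE TABLE name ( ... );" statement, case-insensitively
    pattern = re.compile(
        r"CREATE TABLE\s+(\w+)\s*\((.*?)\);", re.IGNORECASE | re.DOTALL
    )
    for match in pattern.finditer(schema_sql):
        chunks[match.group(1)] = match.group(0)
    return chunks

schema = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total_amount REAL);
"""
print(sorted(split_schema(schema)))  # each table becomes its own embedding chunk
```

Each chunk would then be handed to the embedding model and stored in ChromaDB with the table name as metadata.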
```bash
# Using Homebrew (Mac)
brew install redis
redis-server

# Using Docker
docker run -d -p 6379:6379 redis:latest
```

```bash
uvicorn backend.main:app --reload
```

Backend runs on: http://localhost:8000
```bash
# Open new terminal
cd frontend

# Install dependencies
npm install

# Start development server
npm start
```

Frontend runs on: http://localhost:3000
User: "Show me the top 5 customers by total purchase amount"

Response:
- ✅ SQL: `SELECT c.name, SUM(o.total_amount) as total...`
- ✅ Results: 5 rows
- ✅ Explanation: "This query finds the top 5 customers..."
- ✅ Insights: "Alice Chen is your top customer with $5,000..."
- ✅ Optimization: "Consider adding an index on orders.customer_id..."
User: "Now filter them to California only"

Response:
- ✅ SQL: `WITH top_customers AS (SELECT...) WHERE region='California'`
- ✅ Understands that "them" refers to the previous top 5 customers
- ✅ Uses Redis short-term + Mem0 long-term memory
User: "Which product category made the most revenue last month?"

Response:
- ✅ Automatically retrieves the products, orders, and order_items schemas
- ✅ Generates a proper JOIN query
- ✅ Provides a revenue breakdown and trends
```
POST /rag/query
Content-Type: application/json
```

Request body:

```
{
  "question": "Show me the top 5 customers",
  "session_id": "default",
  "user_id": "anonymous"
}
```

Response:

```
{
  "sql": "SELECT ...",
  "results": [...],
  "insights": "Key findings...",
  "explanation": "This query...",
  "optimization": "Performance tips...",
  "execution_time_ms": 15.42,
  "memory_context": {
    "short_term": "Recent conversation...",
    "long_term": "Relevant past context...",
    "combined": "Full context..."
  }
}
```

```
GET /rag/memory/stats?session_id=default&user_id=anonymous
```

Response:

```
{
  "redis": {
    "recent_exchanges": 3,
    "expires_in_seconds": 3421
  },
  "mem0": {
    "total_memories": 5
  },
  "total": 8
}
```

Other memory endpoints:

```
DELETE /rag/memory/redis/{session_id}
DELETE /rag/memory/mem0/{user_id}
GET /rag/memory/all/{user_id}
```

The project includes a sample retail database with:
- customers - Customer information (id, name, email, region)
- products - Product catalog (id, name, category, price, stock)
- orders - Order records (id, customer_id, order_date, total_amount)
- order_items - Order line items (id, order_id, product_id, quantity, price)
- 10+ customers across different regions
- 15+ products in various categories
- 20+ orders with multiple line items
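To see what the assistant's generated SQL looks like against this schema, here is a self-contained `sqlite3` sketch of the "top 5 customers by total purchase amount" example. The seed rows below are made up for illustration; they are not the shipped sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                     order_date TEXT, total_amount REAL);
INSERT INTO customers VALUES (1, 'Alice Chen', 'California'),
                             (2, 'Bob Diaz', 'Texas'),
                             (3, 'Cara Singh', 'California');
INSERT INTO orders VALUES (1, 1, '2024-01-05', 3000.0),
                          (2, 1, '2024-02-10', 2000.0),
                          (3, 2, '2024-02-11', 1500.0),
                          (4, 3, '2024-03-01', 800.0);
""")

# The kind of SQL the assistant generates for
# "Show me the top 5 customers by total purchase amount"
rows = conn.execute("""
    SELECT c.name, SUM(o.total_amount) AS total
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY total DESC
    LIMIT 5
""").fetchall()
print(rows)  # highest-spending customer first
```

The follow-up "filter them to California only" would wrap this query in a CTE and add a `WHERE c.region = 'California'` predicate, which is why the conversation memory matters.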
- Open Postman
- Click Import β Raw Text
- Paste this collection:
```json
{
  "info": {
    "name": "SQL Query Buddy API",
    "schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json"
  },
  "item": [
    {
      "name": "Query - Basic",
      "request": {
        "method": "POST",
        "header": [{ "key": "Content-Type", "value": "application/json" }],
        "body": {
          "mode": "raw",
          "raw": "{\n \"question\": \"Show me the top 5 customers by total purchase amount\",\n \"session_id\": \"test-session\",\n \"user_id\": \"test-user\"\n}"
        },
        "url": "http://localhost:8000/rag/query"
      }
    },
    {
      "name": "Query - Follow-up",
      "request": {
        "method": "POST",
        "header": [{ "key": "Content-Type", "value": "application/json" }],
        "body": {
          "mode": "raw",
          "raw": "{\n \"question\": \"Now filter them to California only\",\n \"session_id\": \"test-session\",\n \"user_id\": \"test-user\"\n}"
        },
        "url": "http://localhost:8000/rag/query"
      }
    },
    {
      "name": "Memory Stats",
      "request": {
        "method": "GET",
        "url": "http://localhost:8000/rag/memory/stats?session_id=test-session&user_id=test-user"
      }
    },
    {
      "name": "Get All Memories",
      "request": {
        "method": "GET",
        "url": "http://localhost:8000/rag/memory/all/test-user"
      }
    },
    {
      "name": "Clear Redis Memory",
      "request": {
        "method": "DELETE",
        "url": "http://localhost:8000/rag/memory/redis/test-session"
      }
    }
  ]
}
```

- Test Basic Query - Run "Query - Basic"
- Check Memory - Run "Memory Stats" (should show 1 exchange)
- Test Context Memory - Run "Query - Follow-up"
- Verify Memory - Run "Memory Stats" (should show 2 exchanges)
- View Memories - Run "Get All Memories"
- Clean Up - Run "Clear Redis Memory"
Short-term (Redis):
- Stores last 10 conversation exchanges per session
- TTL: 1 hour
- Use case: Immediate follow-up questions
Long-term (Mem0/Qdrant):
- Semantic memory with automatic extraction
- Persistent storage with vector embeddings
- Use case: Historical patterns, user preferences
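The short-term tier's contract (last 10 exchanges per session, 1-hour TTL) can be sketched with a plain dict standing in for Redis so the example runs anywhere. The real manager would issue Redis list and expiry commands and delegate long-term storage to Mem0; the class and method names below are illustrative, not the project's actual `redis_mem0_memory.py` API.

```python
import time
from collections import deque

class ShortTermMemory:
    """Dict-backed stand-in for the Redis tier: last N exchanges, rolling TTL."""

    def __init__(self, max_exchanges: int = 10, ttl_seconds: int = 3600):
        self.max_exchanges = max_exchanges
        self.ttl_seconds = ttl_seconds
        # session_id -> (expiry timestamp, bounded deque of exchanges)
        self._sessions: dict[str, tuple[float, deque]] = {}

    def add(self, session_id: str, question: str, sql: str) -> None:
        expires_at = time.time() + self.ttl_seconds  # refresh TTL on every write
        _, exchanges = self._sessions.get(
            session_id, (0.0, deque(maxlen=self.max_exchanges))
        )
        exchanges.append({"question": question, "sql": sql})
        self._sessions[session_id] = (expires_at, exchanges)

    def recent(self, session_id: str) -> list:
        entry = self._sessions.get(session_id)
        if entry is None or entry[0] < time.time():  # unknown or expired
            self._sessions.pop(session_id, None)
            return []
        return list(entry[1])
```

The bounded deque mirrors Redis `LPUSH` + `LTRIM`, and the per-session expiry mirrors `EXPIRE`, which is why a session's context silently vanishes an hour after its last exchange.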
```python
# Semantic search for relevant schemas
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
# Only retrieves the top 3 most relevant tables,
# which reduces token usage and improves accuracy
```

```
User Question
  ↓ Memory Retrieval (Redis + Mem0)
  ↓ Schema Retrieval (ChromaDB RAG)
  ↓ LLM Prompt Construction
  ↓ GPT-4 SQL Generation
  ↓ Query Execution
  ↓ Multi-step Analysis
  ↓ Memory Storage
```
- API Keys: Never commit `.env` files
- SQL Injection: Uses parameterized queries
- Input Validation: Pydantic models validate all inputs
- Rate Limiting: Redis-based rate limiting ready
- Error Handling: Comprehensive exception handling
- Support for multiple database types (PostgreSQL, MySQL)
- Query history visualization
- Export results to CSV/Excel
- Collaborative query sharing
- Advanced analytics dashboard
- Natural language to database schema generation
- Multi-tenant support
- API authentication (OAuth2/JWT)
```
sql-query-buddy/
├── backend/
│   ├── db/
│   │   ├── retail.db              # SQLite database
│   │   └── schema.sql             # Database schema
│   ├── rag/
│   │   ├── router.py              # Main API endpoints
│   │   ├── redis_mem0_memory.py   # Hybrid memory manager
│   │   ├── embed_schema.py        # Schema embedding script
│   │   └── vectorstore/           # ChromaDB storage
│   └── main.py                    # FastAPI app
├── frontend/
│   ├── src/
│   │   ├── RagQuery.tsx           # Main UI component
│   │   └── App.tsx                # App entry point
│   └── package.json
├── qdrant_storage/                # Mem0 vector database
├── requirements.txt
├── .env.example
└── README.md
```
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI - GPT-4 and embedding models
- LangChain - RAG orchestration framework
- Redis - High-performance in-memory database
- Mem0 - Intelligent memory management
- ChromaDB - Vector database for embeddings
- FastAPI - Modern Python web framework
Senay Yakut
- GitHub: https://github.com/SenayYakut
- LinkedIn: https://www.linkedin.com/in/senaykt/
- Email: senaykt@gmail.com
If you find this project helpful, please give it a star! ⭐
Built with ❤️ using AI, RAG, and Modern Web Technologies


