- CrewAI (v1.6.1+) - Orchestrates specialized AI agents for SQL generation, query optimization, schema analysis, and database interaction
- LangGraph (v1.0.6+) - State machine framework for building agentic workflows, used for database discovery and multi-step reasoning
- LangChain Community (v0.4.1+) - LLM integration and prompt management
- LangChain Ollama (v1.0.0+) - Local LLM integration with Ollama
- Ollama - Local LLM inference engine (supports Qwen 2.5, Llama, GPT-OSS, and more)
- Supported Models:
- qwen2.5:7b - Fast & efficient (4.7GB)
- qwen2.5:14b - Balanced performance (9.0GB)
- gpt-oss:20b - Most capable (13GB)
- Kedro (v1.1.1) - Data pipeline framework for orchestrating ML workflows
- FastAPI (v0.123.4+) - High-performance Python API framework
- Uvicorn (v0.38.0+) - ASGI server for FastAPI
- Python-dotenv (v1.2.1+) - Environment configuration
- Python-multipart (v0.0.20+) - File upload support
- PyMySQL (v1.1.2+) - Pure Python MySQL client
- mysql-connector-python (v9.5.0+) - Official MySQL connector
- psycopg2 - PostgreSQL adapter (optional)
- sqlite3 - Built-in SQLite support
- Express.js (v4.19.2+) - Web application framework
- Mongoose (v8.3.2+) - MongoDB object modeling
- bcrypt (v5.1.1+) - Password hashing
- jsonwebtoken (v9.0.2+) - JWT authentication
- express-session (v1.18.0+) - Session management
- cors (v2.8.5+) - CORS middleware
- helmet (v7.1.0+) - Security headers
- morgan (v1.10.0+) - HTTP request logger
- React (v18.2.0) - UI framework
- Vite (v5.3.1+) - Build tool and dev server
- React Router DOM (v6.23.1+) - Client-side routing
- Axios (v1.7.2+) - HTTP client
- TailwindCSS (v3.4.4+) - Utility-first CSS framework
- DaisyUI (v4.12.2+) - Tailwind component library
- React Icons (v5.2.1+) - Icon library
- React Markdown (v9.0.1+) - Markdown rendering
- React Syntax Highlighter (v15.5.0+) - Code syntax highlighting
- Prism.js (v1.29.0+) - Syntax highlighting theme
- Nivo (v0.87.0+) - Data visualization library
- D3.js (v7.9.0+) - Data-driven visualizations
- Moment.js (v2.30.1+) - Date manipulation
- Socket.io Client (v4.7.5+) - Real-time bidirectional communication
- scikit-learn (v1.5.1+) - Machine learning library
- seaborn (v0.12.1+) - Statistical data visualization
- Jupyter Lab (v3.0+) - Interactive development environment
- Kedro-viz (v6.7.0+) - Pipeline visualization
- uv - Fast Python package installer
- nodemon (v3.1.2+) - Node.js auto-restart utility
- ESLint (v8.57.0+) - JavaScript linter
- Autoprefixer (v10.4.19+) - CSS vendor prefixing
- PostCSS (v8.4.38+) - CSS transformation tool
SQL BigBrother is a multi-agent, AI-powered SQL optimization system built with Kedro 1.1.1. It uses CrewAI to orchestrate specialized agents for SQL query generation, optimization, and analysis, and LangGraph for stateful agentic workflows such as database discovery. The platform runs on local Ollama models to provide enterprise-grade database intelligence entirely offline, with no external API keys or cloud dependencies.
- 🤖 Local AI Processing: Uses Ollama for local LLM inference (no external API keys required)
- 🔍 Database Discovery Agent: Automatic detection of PostgreSQL, MySQL, and SQLite databases using LangGraph agents
- 📊 Schema Upload & Analysis: Upload SQL schema files and get AI-generated insights
- 💬 Interactive Chat Interface: Ask questions about your database schema
- 🔍 SQL Query Generation: Generate optimized SQL queries based on natural language requests
- 📈 Query Analysis: Get explanations and optimizations for existing SQL queries
- 🔐 Authentication System: Secure user authentication and chat history
sql-bigbrother/
├── src/sql_bigbrother/
│ ├── core/ # Core application components
│ │ ├── api/ # FastAPI backend services
│ │ ├── auth/ # Node.js authentication service
│ │ ├── backup/ # Backup utilities
│ │ ├── frontend/ # React frontend application
│ │ └── https/ # HTTP test files
│ ├── pipelines/
│ │ └── sql_processing/ # Kedro SQL processing pipeline
│ └── api/ # FastAPI integration (symlink to core/api)
├── conf/ # Kedro configuration files
├── data/ # Data storage
└── run_server.py # FastAPI server startup script
Note: Download the full video here for better quality.
- Install Ollama: Download and install Ollama
- Pull AI Models: Install required models
# Install recommended models (choose based on your system resources)
ollama pull qwen2.5:7b   # Fastest, 4.7GB
ollama pull qwen2.5:14b  # Balanced, 9.0GB
ollama pull gpt-oss:20b  # Most capable, 13GB
- Start Ollama Service:
ollama serve
-
Setup Environment: Run the setup script to install dependencies:
chmod +x setup.sh
./setup.sh
-
Start Required Services:
# Start Ollama service (required for AI)
ollama serve &

# Start MySQL service (required for database functionality)
brew services start mysql    # macOS
# sudo systemctl start mysql # Linux

# Option 1: Use the startup script
chmod +x start_all.sh
./start_all.sh

# Option 2: Start services manually
# Terminal 1: FastAPI Server with proper environment
OPENAI_API_KEY="sk-dummy-key-for-ollama-usage" \
CREWAI_LLM_PROVIDER="ollama" \
OLLAMA_BASE_URL="http://localhost:11434" \
uv run python run_server.py

# Terminal 2: Authentication Server
cd src/sql_bigbrother/core/auth && node server.js

# Terminal 3: Frontend
cd src/sql_bigbrother/core/frontend && npm run dev
-
Access the application:
- Frontend: http://localhost:5176 (or check terminal for actual port)
- API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- Auth Service: http://localhost:2405
When the server starts, it automatically:
- Discovers all local databases: PostgreSQL, MySQL, and SQLite installations
- Auto-initializes chat: If SQLite databases are found, automatically extracts and loads the first one's schema
- Generates welcome introduction: An AI agent creates a personalized welcome message explaining the loaded database
The discovered databases and schema are immediately available for querying—no manual upload required!
curl http://localhost:8000/health
# Response:
{
"status": "healthy",
"service": "SQL BigBrother",
"databases_discovered": 3,
"chat_initialized": true
}

# Get the auto-generated welcome message and loaded schema
curl http://localhost:8000/chat/init
# Expected Response:
{
"title": "Cinema Database",
"introduction": "Welcome! 👋 I've automatically loaded the Cinema Database schema for you...",
"recommends": ["Question 1", "Question 2", "Question 3", "Question 4"],
"sql_content": "CREATE TABLE...",
"auto_initialized": true,
"discovered_databases": {
"databases": [...],
"summary": "..."
}
}

curl http://localhost:8000/databases
# Expected Response:
{
"databases": [
{
"type": "postgresql",
"status": "available",
"output": "psql (PostgreSQL) 15.3"
},
{
"type": "mysql",
"status": "available",
"output": "mysql Ver 8.0.33"
}
],
"os_type": "darwin",
"commands_executed": ["psql --version: SUCCESS", ...],
"summary": "{'total_found': 2, 'databases_by_type': {'postgresql': 1, 'mysql': 1}}",
"discovery_timestamp": "2026-01-14T10:30:00"
}

curl -X POST http://localhost:8000/databases/rediscover
# Triggers a fresh database discovery scan

curl -X 'POST' \
'http://localhost:8000/ask-chat' \
-H 'accept: application/json' \
-H 'Content-Type: application/x-www-form-urlencoded' \
-d 'question=Get all users with their email addresses&schema=CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(255), email VARCHAR(255));&model=qwen2.5:7b'
# Expected Response:
{
"query": "SELECT id, name, email FROM users;",
"explain": "",
"rows": [...],
"columns": [...],
"available_databases": [...] # Includes discovered databases
}

curl -X 'POST' \
'http://localhost:8000/init-chat' \
-H 'accept: application/json' \
-F 'file=@path/to/schema.sql'
# Expected Response:
{
"title": "Generated Schema Title",
"recommends": ["Recommended question 1", "Recommended question 2"],
"sql_content": "Original schema content",
"discovered_databases": {
"count": 3,
"databases": [...],
"summary": "Database discovery summary"
}
}

# Automatically extract and process schema from a discovered database
curl -X 'POST' \
'http://localhost:8000/auto-schema' \
-H 'accept: application/json' \
-H 'Content-Type: application/x-www-form-urlencoded' \
-d 'db_index=0'
# Expected Response:
{
"title": "Auto-Generated Schema Title",
"recommends": ["Recommended question 1", "Recommended question 2"],
"sql_content": "Extracted schema content",
"auto_generated": true,
"source_database": {
"type": "sqlite",
"path": "/path/to/database.db"
}
}

# For SQLite
curl -X 'POST' \
'http://localhost:8000/extract-schema' \
-H 'accept: application/json' \
-H 'Content-Type: application/x-www-form-urlencoded' \
-d 'db_type=sqlite&path=/path/to/database.db'
# For PostgreSQL
curl -X 'POST' \
'http://localhost:8000/extract-schema' \
-H 'accept: application/json' \
-H 'Content-Type: application/x-www-form-urlencoded' \
-d 'db_type=postgresql&host=localhost&port=5432&database=mydb&username=user&password=pass'
# For MySQL
curl -X 'POST' \
'http://localhost:8000/extract-schema' \
-H 'accept: application/json' \
-H 'Content-Type: application/x-www-form-urlencoded' \
-d 'db_type=mysql&host=localhost&port=3306&database=mydb&username=user&password=pass'

- Navigate to the frontend application at http://localhost:5176
- Go to the Chat page
- Click "Discover Databases" to see all locally discovered databases
- Select a database and click "Auto-Generate Schema"
- The system will automatically extract and process the schema
- Navigate to the frontend application at http://localhost:5176
- Go to the Chat page and click the Schema tab
- Click "Click here to upload schema"
- Upload a SQL schema file (samples available in data/01_raw/sample_schemas/)
- Once schema is uploaded or auto-generated, you'll get AI-generated title and recommended questions
- Ask natural language questions about your database
- Request SQL queries based on your requirements
- Get optimized SQL queries with explanations
- View query execution details and performance insights
- Save chat conversations for future reference
The system automatically detects available models. Check what's installed:
ollama list

Supported model formats:
- qwen2.5:7b (recommended for development)
- qwen2.5:14b (balanced performance/quality)
- qwen3:14b (newer version)
- qwen3:30b (high quality, resource intensive)
- gpt-oss:20b (alternative high-quality model)
Before running the application, you need to configure Firebase for authentication and data storage:
-
Create a Firebase project at Firebase Console
-
Copy the example configuration file:
cd src/sql_bigbrother/core/frontend/src/firebase
cp config.example.js config.js

- Update Firebase configuration in src/sql_bigbrother/core/frontend/src/firebase/config.js. Replace the configuration with your Firebase project credentials:
const firebaseConfig = {
  apiKey: "YOUR_API_KEY",
  authDomain: "YOUR_AUTH_DOMAIN",
  projectId: "YOUR_PROJECT_ID",
  storageBucket: "YOUR_STORAGE_BUCKET",
  messagingSenderId: "YOUR_MESSAGING_SENDER_ID",
  appId: "YOUR_APP_ID"
};
-
Update environment variables (optional - if using a .env file):

# Create .env file in frontend directory
cd src/sql_bigbrother/core/frontend

# Add Firebase configuration
VITE_FIREBASE_API_KEY=your_api_key
VITE_FIREBASE_AUTH_DOMAIN=your_auth_domain
VITE_FIREBASE_PROJECT_ID=your_project_id
VITE_FIREBASE_STORAGE_BUCKET=your_storage_bucket
VITE_FIREBASE_MESSAGING_SENDER_ID=your_messaging_sender_id
VITE_FIREBASE_APP_ID=your_app_id
Note: The config.js file is gitignored to prevent accidental credential commits. Always use config.example.js as a template.
The system automatically detects available Ollama models and uses the best one available:
- qwen2.5:7b: Fast, suitable for most tasks
- qwen2.5:14b: Better quality, more resource intensive
- gpt-oss:20b: Highest quality, requires significant resources
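A minimal sketch of how that auto-detection could be implemented against Ollama's /api/tags endpoint; the preference order and the pick_model helper are illustrative assumptions, not the project's actual selection logic:

# Pick the best installed model by querying the local Ollama service.
import json
import urllib.request

PREFERRED = ["gpt-oss:20b", "qwen2.5:14b", "qwen2.5:7b"]  # best first (assumed order)

def pick_model(base_url: str = "http://localhost:11434") -> str:
    # /api/tags returns {"models": [{"name": "qwen2.5:7b", ...}, ...]}
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        installed = {m["name"] for m in json.load(resp)["models"]}
    for name in PREFERRED:
        if name in installed:
            return name
    raise RuntimeError("No supported Ollama model installed; run `ollama pull qwen2.5:7b`")

print(pick_model())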
If you prefer using external AI services, you can set environment variables:
export GROQ_API_KEY="your_groq_api_key"
export OPENAI_API_KEY="your_openai_api_key"

- Don't remove any lines from the .gitignore file we provide
- Make sure your results can be reproduced by following a data engineering convention
- Don't commit data to your repository
- Don't commit any credentials or your local configuration to your repository. Keep all your credentials and local configuration in conf/local/
┌─────────────────────────────────────────────────────────────────┐
│ Frontend (React) │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │
│ │Dashboard │ │ Queries │ │ Agents │ │ Connections │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └──────┬───────┘ │
└───────┼─────────────┼─────────────┼────────────────┼───────────┘
│ │ │ │
└─────────────┴─────────────┴────────────────┘
│
┌───────────▼────────────┐
│ FastAPI Backend │
│ (Python Endpoints) │
└───────────┬────────────┘
│
┌───────────────────┼────────────────────┐
│ │ │
┌───────▼────────┐ ┌───────▼────────┐ ┌───────▼────────┐
│ LLM Service │ │ Agent Manager │ │ DB Connections │
│ (Ollama/Local) │ │ (CrewAI) │ │ Service │
└───────┬────────┘ └───────┬────────┘ └───────┬────────┘
│ │ │
│ ┌───────▼────────┐ │
│ │ Agent Executor │ │
│ └───────┬────────┘ │
│ │ │
└───────────────────┼────────────────────┘
│
┌───────────▼────────────┐
│ Database Layer │
│ (PostgreSQL/MySQL) │
└────────────────────────┘
- FastAPI Server (port 8000): Main API server with Kedro pipeline integration
- Authentication Service (port 2405): Node.js server handling user auth and chat history
- Ollama Service (port 11434): Local AI model inference
- React Application: Modern web interface built with Vite and Tailwind CSS
- Real-time Chat: Interactive chat interface for SQL queries and schema analysis
- CrewAI Agents: Specialized AI agents for different tasks (SQL generation, analysis, recommendations)
- Local Processing: All AI processing happens locally via Ollama
- Schema Analysis: Automatic schema parsing and intelligent question generation
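For reference, a local call through langchain-ollama (listed in the stack above) looks roughly like this; the prompt is illustrative only:

# Invoke a local Ollama model via langchain-ollama.
from langchain_ollama import ChatOllama

llm = ChatOllama(model="qwen2.5:7b", base_url="http://localhost:11434", temperature=0)
reply = llm.invoke("Explain what a database index does in one sentence.")
print(reply.content)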
┌─────────────────────────────────────────────────────────────┐
│ Agent System │
│ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Agent Executor (Core) │ │
│ │ • Manages agent lifecycle │ │
│ │ • Routes queries to appropriate agents │ │
│ │ • Handles agent communication │ │
│ └─────────────────┬──────────────────────────────────┘ │
│ │ │
│ ┌────────────┼────────────┬────────────┐ │
│ │ │ │ │ │
│ ┌────▼─────┐ ┌───▼──────┐ ┌──▼───────┐ ┌─▼─────────┐ │
│ │ SQL │ │ Analysis │ │ Security │ │ Database │ │
│ │ Agent │ │ Agent │ │ Agent │ │ Discovery │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └─┬─────────┘ │
│ │ │ │ │ │
│ └────────────┼────────────┴──────────┘ │
│ │ │
│ ┌────────────▼────────────┐ │
│ │ LLM Integration │ │
│ │ (Ollama - Local) │ │
│ └─────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Located in: src/sql_bigbrother/core/api/
Responsibilities:
- Agent Lifecycle Management: Creates, initializes, and manages agent instances
- Query Routing: Determines which agent should handle specific queries
- Context Management: Maintains conversation history and context
- Tool Integration: Provides access to database tools and utilities
- Error Handling: Manages failures and fallbacks between agents
Key Capabilities:
- execute_query() - Main entry point for processing queries
- route_to_agent() - Intelligent routing based on query type
- maintain_history() - Keeps track of conversation context
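The executor's job can be pictured with a small illustrative sketch; the keyword routing and handler stubs below are assumptions for the example, not the real implementation in core/api/:

# Toy agent executor: keep history, route a question, dispatch to a handler.
from typing import Callable, Dict, List

class AgentExecutor:
    def __init__(self) -> None:
        self.history: List[str] = []  # conversation context (maintain_history)
        self.routes: Dict[str, Callable[[str], str]] = {
            "sql": lambda q: f"[SQL agent] generate query for: {q}",
            "analysis": lambda q: f"[Analysis agent] interpret results for: {q}",
            "security": lambda q: f"[Security agent] validate: {q}",
        }

    def route_to_agent(self, question: str) -> str:
        # Naive keyword routing; a real implementation could ask the LLM to classify.
        lowered = question.lower()
        if any(word in lowered for word in ("drop", "delete", "unsafe")):
            return "security"
        if "why" in lowered or "trend" in lowered:
            return "analysis"
        return "sql"

    def execute_query(self, question: str) -> str:
        self.history.append(question)
        return self.routes[self.route_to_agent(question)](question)

print(AgentExecutor().execute_query("Get all users with their email addresses"))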
Located in: src/sql_bigbrother/pipelines/sql_processing/nodes.py
Responsibilities:
- Automatic Database Detection: Discovers PostgreSQL, MySQL, and SQLite installations
- System Command Execution: Runs OS-specific commands to identify database services
- Database File Scanning: Locates SQLite database files on the local filesystem
- Connection Validation: Tests database availability and accessibility
- Multi-OS Support: Works across macOS, Linux, and Windows platforms
Key Capabilities:
- check_os() - Detects operating system type
- discover_postgres() - Finds PostgreSQL installations and running instances
- discover_mysql() - Identifies MySQL/MariaDB services
- discover_sqlite() - Locates SQLite database files
- summarize_discovery() - Provides comprehensive discovery report
LangGraph Workflow:
Start → Check OS → Discover PostgreSQL → Discover MySQL
→ Discover SQLite → Summarize → End
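A condensed sketch of that linear workflow in LangGraph; the node bodies here are stand-ins (simple PATH checks) for the real system-command nodes:

# Linear discovery graph: check OS, probe each database type, summarize.
import platform
import shutil
from typing import List, TypedDict

from langgraph.graph import StateGraph, START, END

class DiscoveryState(TypedDict):
    os_type: str
    found: List[str]

def check_os(state: DiscoveryState) -> dict:
    return {"os_type": platform.system().lower()}

def discover_postgres(state: DiscoveryState) -> dict:
    return {"found": state["found"] + (["postgresql"] if shutil.which("psql") else [])}

def discover_mysql(state: DiscoveryState) -> dict:
    return {"found": state["found"] + (["mysql"] if shutil.which("mysql") else [])}

def discover_sqlite(state: DiscoveryState) -> dict:
    return {"found": state["found"] + (["sqlite"] if shutil.which("sqlite3") else [])}

def summarize(state: DiscoveryState) -> dict:
    print(f"OS: {state['os_type']}, databases found: {state['found']}")
    return {}

graph = StateGraph(DiscoveryState)
for name, fn in [("check_os", check_os), ("discover_postgres", discover_postgres),
                 ("discover_mysql", discover_mysql), ("discover_sqlite", discover_sqlite),
                 ("summarize", summarize)]:
    graph.add_node(name, fn)
graph.add_edge(START, "check_os")
graph.add_edge("check_os", "discover_postgres")
graph.add_edge("discover_postgres", "discover_mysql")
graph.add_edge("discover_mysql", "discover_sqlite")
graph.add_edge("discover_sqlite", "summarize")
graph.add_edge("summarize", END)

graph.compile().invoke({"os_type": "", "found": []})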
Located in: src/sql_bigbrother/core/api/prompts/
Responsibilities:
- Query Generation: Converts natural language to SQL queries
- Query Validation: Ensures SQL syntax is correct
- Query Optimization: Suggests improvements for performance
- Schema Understanding: Analyzes database schema to generate accurate queries
- Multi-Database Support: Handles different SQL dialects (PostgreSQL, MySQL, etc.)
Key Capabilities:
- generate_sql() - Creates SQL from natural language
- validate_query() - Checks SQL correctness
- explain_query() - Provides query explanations
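As a rough sketch (not the project's actual prompts or agent definitions), a CrewAI SQL-generation agent can be wired to a local Ollama model like this, assuming a recent CrewAI release that exposes the LLM helper:

# One agent, one task: turn a natural-language question into SQL.
from crewai import Agent, Crew, LLM, Task

llm = LLM(model="ollama/qwen2.5:7b", base_url="http://localhost:11434")

sql_agent = Agent(
    role="SQL Generator",
    goal="Translate natural-language questions into correct, efficient SQL",
    backstory="A database engineer who writes dialect-aware, well-explained SQL.",
    llm=llm,
)

task = Task(
    description=(
        "Schema: CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(255), email VARCHAR(255));\n"
        "Question: Get all users with their email addresses."
    ),
    expected_output="A single SQL query followed by a short explanation.",
    agent=sql_agent,
)

result = Crew(agents=[sql_agent], tasks=[task]).kickoff()
print(result)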
Located in: src/sql_bigbrother/core/api/prompts/
Responsibilities:
- Data Interpretation: Analyzes query results and provides insights
- Trend Detection: Identifies patterns and anomalies in data
- Visualization Suggestions: Recommends appropriate charts/graphs
- Report Generation: Creates summaries and reports from data
- Business Intelligence: Translates data into actionable insights
Key Capabilities:
- analyze_results() - Interprets query output
- generate_insights() - Provides business insights
- suggest_visualizations() - Recommends data viz approaches
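To make the flow concrete, a toy sketch of post-query analysis along the lines above; the summary statistics and the chart heuristic are illustrative assumptions, not the agent's actual logic:

# Summarize query rows and suggest a chart type.
from statistics import mean

rows = [{"month": "Jan", "revenue": 1200}, {"month": "Feb", "revenue": 1500},
        {"month": "Mar", "revenue": 900}]

def analyze_results(rows, value_key="revenue"):
    values = [r[value_key] for r in rows]
    return {
        "count": len(values),
        "mean": round(mean(values), 2),
        "max": max(values),
        "min": min(values),
        # Few categorical labels with one numeric value usually read best as a bar chart
        "suggested_chart": "bar" if len(rows) <= 20 else "line",
    }

print(analyze_results(rows))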
Located in: src/sql_bigbrother/core/api/services/
Responsibilities:
- SQL Injection Prevention: Detects and blocks malicious queries
- Access Control: Enforces user permissions and roles
- Audit Logging: Tracks all query executions and modifications
- Query Sanitization: Cleans and validates input queries
- Compliance Checking: Ensures queries meet security standards
Key Capabilities:
- validate_security() - Checks for security threats
- enforce_permissions() - Validates user access rights
- audit_query() - Logs query execution for compliance
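For illustration only, a naive sketch of the query-sanitization idea; a regex blocklist like this is an assumption for the example and is not sufficient protection on its own (a production check would also need parameterized access and permission enforcement):

# Reject queries containing obviously destructive keywords or stacked statements.
import re

BLOCKED = re.compile(r"\b(drop|truncate|alter|grant|revoke)\b|;.+;|--", re.IGNORECASE)

def validate_security(query: str) -> bool:
    """Return True if the query passes the naive blocklist check."""
    return not BLOCKED.search(query)

print(validate_security("SELECT id, name FROM users;"))     # True
print(validate_security("SELECT 1; DROP TABLE users; --"))  # False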
User Query
│
▼
Frontend (React)
│
▼
API Endpoint (/api/query)
│
▼
Agent Executor
│
├─► Security Agent (Validates query safety)
│ │
│ ▼
├─► SQL Agent (Generates SQL)
│ │
│ ▼
├─► Database Execution
│ │
│ ▼
└─► Analysis Agent (Interprets results)
│
▼
Response to User
- Backend: FastAPI (Python)
- Frontend: React + Vite
- AI/LLM: Ollama (Local), OpenAI GPT / Claude (optional)
- Database: PostgreSQL (primary), MySQL, SQLite (supported)
- Agent Framework: CrewAI with custom implementations
- Python 3.11+ with the uv package manager
- Node.js 18+ with npm
- 8GB+ RAM for basic models (16GB+ recommended)
-
Install System Dependencies:
# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install Node.js (if not already installed)
# macOS: brew install node
# Ubuntu: sudo apt install nodejs npm

# Install Ollama
# Visit https://ollama.ai for installation instructions
-
Install Project Dependencies:
# Install Python dependencies
uv sync

# Install Node.js dependencies for auth service
cd src/sql_bigbrother/core/auth && npm install

# Install Node.js dependencies for frontend
cd src/sql_bigbrother/core/frontend && npm install

# Return to project root
cd ../../../..
-
Install AI Models:
# Start Ollama service
ollama serve &

# Install models (choose based on your system)
ollama pull qwen2.5:7b   # 4.7GB - Fast, recommended for development
ollama pull qwen2.5:14b  # 9.0GB - Better quality
ollama pull gpt-oss:20b  # 13GB - Highest quality (optional)
-
Verify Installation:
# Test Ollama
curl http://localhost:11434/api/tags

# Test model
ollama run qwen2.5:7b "Hello, how are you?"

# Test Python environment
uv run python -c "import crewai; print('CrewAI installed successfully')"
# Install uv package manager
curl -LsSf https://astral.sh/uv/install.sh | sh
source ~/.bashrc  # or restart terminal

# macOS
brew install node
# Ubuntu/Debian
sudo apt update && sudo apt install nodejs npm
# Verify installation
node --version && npm --version

# Install Ollama - visit https://ollama.ai
# Or use curl (Linux/macOS):
curl -fsSL https://ollama.ai/install.sh | sh
# Verify installation
ollama --version

# Ensure you're in the right environment
uv sync --dev
# Install specific missing packages
uv add crewai langchain-ollama fastapi uvicorn
# For LiteLLM support (if needed)
uv add litellm

# Clear npm cache
npm cache clean --force
# Remove node_modules and reinstall
rm -rf node_modules package-lock.json
npm install
# For permission issues (Linux/macOS)
sudo npm install -g npm

Create a .env file in the project root (optional but recommended):
# Database Configuration (for MySQL integration)
DB_USER=root
DB_PASSWORD=
DB_NAME_SETUP=sql_bigbrother
DB_NAME_USE=sql_bigbrother
# Optional: External AI APIs (if you don't want to use Ollama)
OPENAI_API_KEY=your_openai_key
GROQ_API_KEY=your_groq_key
GROQ_API_BASE=your_groq_base_url
GROQ_MODEL_NAME=your_groq_model
# Server configuration (optional)
FASTAPI_HOST=0.0.0.0
FASTAPI_PORT=8000
AUTH_PORT=2405
FRONTEND_PORT=5176
# CrewAI Configuration (for local Ollama)
CREWAI_LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434

If you prefer Docker:
# Build and run with Docker Compose
docker-compose up --build
# Access services
# Frontend: http://localhost:5176
# API: http://localhost:8000
# Auth: http://localhost:2405

You can run your Kedro project with:
kedro run
Have a look at the files tests/test_run.py and tests/pipelines/data_science/test_pipeline.py for instructions on how to write your tests. Run the tests as follows:
pytest
You can configure the coverage threshold in your project's pyproject.toml file under the [tool.coverage.report] section.
To see and update the dependency requirements for your project use requirements.txt. You can install the project requirements with pip install -r requirements.txt.
Further information about project dependencies
Note: Using kedro jupyter or kedro ipython to run your notebook provides these variables in scope: catalog, context, pipelines and session.

Jupyter, JupyterLab, and IPython are already included in the project requirements by default, so once you have run pip install -r requirements.txt you will not need to take any extra steps before you use them.
To use Jupyter notebooks in your Kedro project, you need to install Jupyter:
pip install jupyter
After installing Jupyter, you can start a local notebook server:
kedro jupyter notebook
To use JupyterLab, you need to install it:
pip install jupyterlab
You can also start JupyterLab:
kedro jupyter lab
And if you want to run an IPython session:
kedro ipython
To automatically strip out all output cell contents before committing to git, you can use tools like nbstripout. For example, you can add a hook in .git/config with nbstripout --install. This will run nbstripout before anything is committed to git.
Note: Your output cells will be retained locally.
Root Cause: Model name format mismatch and CrewAI configuration issues.
Solutions:
-
Fix Model Name Format:
# ❌ Wrong format
model=qwen2_5_7b

# ✅ Correct format
model=qwen2.5:7b
-
Use Valid SQL Schema:
# ❌ Invalid schema
schema=string

# ✅ Valid schema
schema=CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(255));
-
Correct curl Command:
curl -X 'POST' \
  'http://localhost:8000/ask-chat' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -d 'question=Hello, write a SQL query to get all users&schema=CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(255));&model=qwen2.5:7b'
Root Cause: CrewAI defaulting to OpenAI instead of using Ollama.
Solution: Ensure proper environment variables are set when starting the server:
# Set environment variables for Ollama
export OPENAI_API_KEY="sk-dummy-key-for-ollama-usage"
export CREWAI_LLM_PROVIDER="ollama"
export OLLAMA_BASE_URL="http://localhost:11434"
# Start server with environment
uv run python run_server.py

Root Cause: CrewAI version compatibility issue with TaskOutput structure.
Solution: This is automatically fixed in the current codebase. If you encounter this:
- Update to the latest version of the code
- The fix changes task.output.raw_output to task.output.raw
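If your own code needs to tolerate both CrewAI versions, a small version-tolerant accessor is enough; this is a sketch that only assumes one of the two attribute names exists:

# Read a CrewAI task output regardless of whether the attribute is raw or raw_output.
def task_output_text(output) -> str:
    return getattr(output, "raw", None) or getattr(output, "raw_output", "")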
Root Cause: Database connection failure (MySQL not running).
Solutions:
-
Install and Start MySQL (if you need database execution):
# macOS with Homebrew
brew install mysql
brew services start mysql

# Ubuntu/Debian
sudo apt install mysql-server
sudo systemctl start mysql
-
Use SQLite (recommended for development):
- Modify database configuration to use SQLite instead
- No separate database server required
-
Skip Database Execution (for query generation only):
- The API will still generate SQL queries
- Database execution errors don't prevent query generation
-
"Ollama service not running"
# Check if Ollama is running
curl http://localhost:11434/api/tags

# Start Ollama service
ollama serve
-
"No models available"
# List installed models
ollama list

# Install required models
ollama pull qwen2.5:7b
ollama pull qwen2.5:14b  # Optional, better quality
-
"Port already in use" The
run_server.pyscript automatically detects and handles port conflicts:- Option 1: Automatically kill existing processes (recommended)
- Option 2: Use an alternative port (8001, 8002, etc.)
- Option 3: Exit and handle manually
Manual port cleanup:
# Kill processes on specific ports
lsof -ti:8000 | xargs kill -9   # FastAPI
lsof -ti:2405 | xargs kill -9   # Auth service
lsof -ti:5176 | xargs kill -9   # Frontend
lsof -ti:11434 | xargs kill -9  # Ollama
-
"Schema upload not working"
- Ensure Ollama service is running: ollama serve
- Check models are installed: ollama list
- Verify FastAPI server is running: curl http://localhost:8000/health
- Check server logs for detailed errors
-
"Frontend not accessible"
- Check the actual port in terminal output (may be 5176 instead of 5173)
- Ensure Node.js dependencies are installed: cd frontend && npm install
- Clear browser cache and try again
-
"CrewAI agents not responding"
- Verify Ollama models are working:
curl -X POST http://localhost:11434/api/generate -H "Content-Type: application/json" -d '{"model": "qwen2.5:7b", "prompt": "Hello", "stream": false}'
- Check environment variables are set correctly
- Restart services in correct order: Ollama → FastAPI → Frontend
-
"Database connection errors"
- MySQL not running: brew services start mysql (macOS) or sudo systemctl start mysql (Linux)
- Database doesn't exist: mysql -u root -e "CREATE DATABASE sql_bigbrother;"
- Connection denied: Check DB_USER and DB_PASSWORD in the .env file
- Port conflicts: Default MySQL port is 3306, ensure it's not blocked
-
Memory Requirements:
- qwen2.5:7b: ~8GB RAM minimum (recommended for development)
- qwen2.5:14b: ~16GB RAM minimum
- gpt-oss:20b: ~24GB RAM minimum
-
Model Selection:
- Fast responses: Use qwen2.5:7b
- Balanced: Use qwen2.5:14b
- Best quality: Use gpt-oss:20b or qwen3:14b
-
System Resources:
- Monitor CPU usage during model loading
- SSD recommended for faster model loading
- Close other resource-intensive applications
-
Check Service Status:
# Ollama
curl http://localhost:11434/api/tags

# FastAPI
curl http://localhost:8000/health

# Frontend (if running)
curl http://localhost:5176
-
Test Individual Components:
# Test Ollama directly
ollama run qwen2.5:7b "Hello"

# Test API endpoint
curl -X GET http://localhost:8000/health

# Test with simple query
curl -X POST http://localhost:8000/ask-chat \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "question=Get all users&schema=CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(255));&model=qwen2.5:7b"

# Test database connection
mysql -u root -e "USE sql_bigbrother; SHOW TABLES;"
-
Check Logs:
- FastAPI logs: Terminal running run_server.py
- Ollama logs: Terminal running ollama serve
- System logs: tail -f /var/log/system.log (macOS) or journalctl -f (Linux)
If you encounter issues not covered here:
-
Check Dependencies:
# Python environment
uv run python --version
uv run pip list | grep -E "(crewai|langchain|fastapi)"

# Node.js environment
node --version
npm --version
-
Enable Verbose Logging:
- Add verbose=True to Crew configurations
- Set DEBUG=True in FastAPI settings
- Check browser developer console for frontend issues
-
Reset Environment:
# Clean restart
pkill -f "ollama\|uvicorn\|node"

# Restart services
ollama serve &
sleep 5
uv run python run_server.py
macOS Installation:
# Install MySQL using Homebrew
brew install mysql
# Start MySQL service
brew services start mysql
# Create database for SQL BigBrother
mysql -u root -e "CREATE DATABASE IF NOT EXISTS sql_bigbrother;"
# Test connection
mysql -u root -e "USE sql_bigbrother; SELECT 'MySQL setup complete' as status;"Ubuntu/Linux Installation:
# Install MySQL
sudo apt update && sudo apt install mysql-server
# Start MySQL service
sudo systemctl start mysql
sudo systemctl enable mysql
# Secure installation (set root password)
sudo mysql_secure_installation
# Create database
sudo mysql -u root -p -e "CREATE DATABASE sql_bigbrother;"

Configuration:
After installation, update your .env file:
DB_USER=root
DB_PASSWORD= # Leave empty for default macOS setup
DB_NAME_SETUP=sql_bigbrother
DB_NAME_USE=sql_bigbrother

MySQL Service Management:
# Start/Stop MySQL (macOS)
brew services start mysql
brew services stop mysql
# Start/Stop MySQL (Linux)
sudo systemctl start mysql
sudo systemctl stop mysql
# Check status
brew services list | grep mysql # macOS
sudo systemctl status mysql          # Linux

SQLite requires no setup - it creates database files automatically. Update your configuration to use SQLite for easier development.
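As a sketch of what that looks like in practice, the built-in sqlite3 module can pull table definitions straight out of a database file, similar in spirit to what the /extract-schema endpoint does for SQLite; the file path below is illustrative:

# Extract CREATE TABLE statements from a SQLite database file.
import sqlite3

def extract_sqlite_schema(path: str) -> str:
    with sqlite3.connect(path) as conn:
        rows = conn.execute(
            "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
        ).fetchall()
    return ";\n".join(sql for (sql,) in rows) + ";" if rows else ""

print(extract_sqlite_schema("data/example.db"))  # path is illustrative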
-
Database Connection Setup Required
- Issue: Full functionality requires MySQL or SQLite configuration
- Impact: Without database, only SQL query generation works (no query execution or results)
- Solution: Install and configure MySQL (see installation section above)
- Alternative: Use the system for SQL generation only, execute queries manually in your preferred database tool
-
Memory Usage with Large Models
- Issue: Models like gpt-oss:20b require significant RAM (20GB+)
- Impact: System may become unresponsive on lower-end machines
- Workaround: Use smaller models (qwen2.5:7b requires only ~8GB RAM)
- Future Fix: Model optimization and memory management improvements
-
CrewAI Version Compatibility
- Issue: Some CrewAI versions don't properly handle custom LLM providers
- Impact: May default to OpenAI API instead of Ollama
- Solution: Current codebase uses CrewAI 1.6.1 with proper Ollama configuration
- Future Fix: Regular updates to track CrewAI improvements
After installation, verify everything is working:
# 1. Check all services are running
curl http://localhost:11434/api/tags # Ollama
curl http://localhost:8000/health # FastAPI
mysql -u root -e "SELECT 1;" # MySQL
# 2. Test complete workflow
curl -X 'POST' \
'http://localhost:8000/ask-chat' \
-H 'accept: application/json' \
-H 'Content-Type: application/x-www-form-urlencoded' \
-d 'question=Get all users with their details&schema=CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(255), email VARCHAR(255), created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);&model=qwen2.5:7b'
# Expected response should include generated SQL query and execution results

- Enhanced Database Support: SQLite integration for zero-config setup
- Query Optimization: Advanced SQL query analysis and optimization suggestions
- Model Management: Automatic model downloading and management
- Performance Monitoring: Real-time performance metrics and optimization tips
- Multi-Database Support: Support for PostgreSQL, SQL Server, Oracle
- Export Features: Export chat history and generated queries
- Query Execution History: Track and analyze query performance over time
This project is open for contributions. Common areas that need help:
- Database Integrations: Adding support for more database types
- Performance Optimization: Improving response times and memory usage
- UI/UX Improvements: Enhancing the frontend experience
- Testing: Adding comprehensive test coverage
- Documentation: Improving setup guides and troubleshooting
- v0.1.0: Initial release with OpenAI/Groq integration
- v0.2.0: Migration to local Ollama models for privacy and offline operation
- v0.2.1: CrewAI compatibility fixes and improved error handling
- Current: Enhanced troubleshooting and documentation
For issues not covered in this documentation:
- Check the Issues: Look at existing GitHub issues for similar problems
- Enable Debug Mode: Add verbose logging to get more detailed error information
- System Information: Include your OS, Python version, and model information when reporting issues
- Minimal Reproduction: Provide the minimal steps to reproduce the issue
- Sample Schemas: Use files in data/01_raw/sample_schemas/ for testing
- Example Queries: Check the frontend for recommended questions after schema upload
- API Documentation: Visit http://localhost:8000/docs when the server is running
Further information about building project documentation and packaging your project