Skip to content

hexria/GhidraInsight

Repository files navigation

GhidraInsight 🔍

AI-Powered Binary Analysis Platform
Enterprise-grade reverse engineering with Ghidra, AI-driven insights, and seamless LLM integration.

License: Apache 2.0 Python 3.9+ Java 11+ Docker Ready Status: Production


⚡ Quick Start (60 seconds)

Option 1: Docker (Recommended)

git clone https://github.com/hexria/GhidraInsight.git
cd GhidraInsight
docker-compose up -d
open http://localhost:3000  # Dashboard opens automatically

Option 2: Automated Local Setup

chmod +x scripts/setup.sh
./scripts/setup.sh --mode=all

Option 3: Python-Only (Lightweight)

pip install ghidrainsight
ghidrainsight analyze --file binary.elf --ai-powered

🚀 Key Features

🔬 Advanced Binary Analysis

  • Automated Threat Detection: Cryptocurrency algorithms, vulnerable patterns, malicious code
  • Taint Analysis: Complete data flow tracking from source to sink
  • Control Flow Analysis: Anomaly detection and complexity metrics
  • Symbol Recovery: Function name inference and type reconstruction

🤖 AI-Powered Analysis

  • Multi-LLM Support: Claude, GPT-4, Gemini, and more - choose the best model for your needs
  • Context Optimization: Automatic context truncation for cheaper inference costs
  • Function Name Generation: AI-powered function name generation from disassembly/pseudocode
  • Automatic Comments: Intelligent comment generation for better code understanding
  • Vulnerability Explanations: Natural language explanations of detected vulnerabilities
  • Automated Vulnerability Scanning: CVSS scores with AI-powered remediation
  • Pattern Recognition: ML-based anomaly and weakness detection
  • Intelligent Code Summarization: Automatic function and module descriptions

🌐 Multiple Access Methods

  • Web Dashboard: Intuitive React UI with real-time analysis
  • Python SDK: Programmatic access with async support
  • CLI Tools: Command-line interface for automation
  • MCP Protocol: Seamless LLM integration (Claude, GPT-4, Gemini, and more)
  • REST API: RESTful endpoints for custom integrations
  • 🦙 Local AI: Ollama and other local models support (NEW)

🏗️ Enterprise Architecture

  • Modular Design: Plug-and-play analysis modules
  • Multi-Transport: HTTP, WebSocket, Server-Sent Events
  • Scalable: Horizontal scaling with Docker orchestration
  • Secure: JWT/OAuth authentication, rate limiting, audit logs
  • Observable: Comprehensive logging and tracing

📊 Access Method Comparison

Use Case Recommended Setup Time Learn Curve
Interactive analysis + visualization Web Dashboard 1 min Easy
CI/CD pipeline integration Python SDK or CLI 5 min Medium
LLM assistant integration MCP Protocol 10 min Medium
Custom automation scripts Python SDK 5 min Medium
Quick one-off analysis Docker + CLI 2 min Easy

📋 System Requirements

Minimum Requirements

Component Minimum Recommended
RAM 4GB 8GB+
CPU 2 cores 4+ cores
Disk 5GB 20GB+
Java 11 17 LTS
Python 3.9 3.11+
Node.js 18 20 LTS

Deployment Options

Option A: Docker (Recommended for Beginners)

✓ Single command setup
✓ No dependency conflicts
✓ Works on Windows, macOS, Linux
✓ Production-ready out of box
Requires: Docker Desktop (download)

Option B: Manual (For Customization)

✓ Full control over components
✓ Easier debugging
✓ Smaller resource footprint
Requires: Java 11+, Python 3.9+, Node.js 18+, Ghidra 11+

Option C: Python-Only (Lightweight)

✓ No GUI needed
✓ Perfect for servers
✓ Fastest setup
⚠ Requires external Ghidra installation

🛠 Installation Guide

Method 1️⃣: Docker (Recommended)

Prerequisites: Docker Desktop

# 1. Clone and navigate
git clone https://github.com/hexria/GhidraInsight.git
cd GhidraInsight

# 2. Start all services (one command!)
docker-compose up -d

# 3. Wait for services to start (~30 seconds)
docker-compose logs -f

# 4. Access the platform
echo "✅ Dashboard: http://localhost:3000"
echo "✅ API Server: http://localhost:8000"
echo "✅ WebSocket: ws://localhost:8001"

Stop services:

docker-compose down

View logs:

docker-compose logs -f ghidra-plugin
docker-compose logs -f python-mcp
docker-compose logs -f web-dashboard

Troubleshooting:

# Check service status
docker-compose ps

# Rebuild images
docker-compose build --no-cache

# Remove all containers and start fresh
docker-compose down -v && docker-compose up -d

Method 2️⃣: Automated Local Setup (macOS/Linux)

# Make script executable
chmod +x scripts/setup.sh

# Install everything with one command
./scripts/setup.sh --mode=all --python-version=3.11

# Start the platform
./scripts/startup.sh

For specific components only:

./scripts/setup.sh --mode=python-only  # Python MCP server only
./scripts/setup.sh --mode=java         # Java plugin only
./scripts/setup.sh --mode=dashboard    # Dashboard only

Verify installation:

ghidrainsight --version
ghidrainsight status

Method 3️⃣: Manual Installation

Step 1: Java Ghidra Plugin

cd ghidra-plugin
./gradlew build

# Install to Ghidra
cp build/libs/*.jar $GHIDRA_INSTALL_DIR/Extensions/Ghidra/plugins/

# Restart Ghidra and enable plugin in: Window → Plugin Manager

Step 2: Python MCP Server

cd python-mcp
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

pip install -e .
ghidrainsight-server --host 0.0.0.0 --port 8000

Step 3: Web Dashboard

cd web-dashboard
npm install
npm run dev  # Opens at http://localhost:5173

Method 4️⃣: Python Package Only (PyPI)

# Install from PyPI (once released)
pip install ghidrainsight

# Verify installation
ghidrainsight --version

# Start server
ghidrainsight serve --port 8000

🎯 Usage Guide

1. Web Dashboard (Easiest)

Perfect for: Interactive analysis, visualization, learning

Access at http://localhost:3000 after docker-compose up

Usage Steps:

  1. Drop a Binary: Drag file into the upload area or click "Select File"
  2. Auto-Analysis: System automatically:
    • Detects crypto algorithms
    • Finds vulnerabilities
    • Performs taint analysis
  3. Explore: Click on functions to see decompilation
  4. Ask AI: Use chat interface for natural language queries
  5. Export: Download JSON/PDF reports

Example AI Questions:

"What does function_0x401000 do?"
"Find all crypto operations"
"Show potential vulnerabilities"
"Analyze data flow from user input"
"Compare this with known malware patterns"

2. Python SDK (For Automation)

Perfect for: CI/CD integration, batch processing, custom workflows

from ghidrainsight.client import GhidraInsightClient
import asyncio

async def analyze_binary():
    # Connect to server
    client = GhidraInsightClient("http://localhost:8000")
    
    # Analyze with all features
    results = await client.analyze(
        file_path="/path/to/binary.elf",
        features=["crypto", "taint", "vulnerabilities"],
        ai_powered=True  # Enable AI analysis
    )
    
    # Access results
    print(f"Vulnerabilities: {results.vulnerabilities}")
    print(f"Crypto: {results.crypto_algos}")
    print(f"AI Insights: {results.ai_summary}")
    
    # Export report
    await client.export_report(results, format="pdf")

asyncio.run(analyze_binary())

3. CLI Tools (For Quick Analysis)

Perfect for: One-off analysis, scripting, integration

# Analyze a binary
ghidrainsight analyze --file binary.elf

# Show all crypto algorithms
ghidrainsight analyze --file binary.elf --features crypto --verbose

# Taint analysis from specific source
ghidrainsight taint --file binary.elf --source user_input --sink system_call

# With AI insights
ghidrainsight analyze --file binary.elf --ai-summary --output report.json

# Check server status
ghidrainsight status

# View configuration
ghidrainsight config list

4. LLM Integration (Multi-Provider Support)

Perfect for: AI assistants, automated security reviews

GhidraInsight supports multiple LLM providers with automatic context optimization:

Claude (Anthropic)

# Setup Claude integration
export ANTHROPIC_API_KEY=your-key-here
ghidrainsight integrate --provider anthropic --api-key $ANTHROPIC_API_KEY

# Use Claude for analysis
ghidrainsight analyze --file binary.elf --ai-provider anthropic --ai-model claude-3-haiku

OpenAI (GPT-4, GPT-3.5)

# Setup OpenAI integration
export OPENAI_API_KEY=your-key-here
ghidrainsight integrate --provider openai --api-key $OPENAI_API_KEY

# Use GPT-4 for analysis
ghidrainsight analyze --file binary.elf --ai-provider openai --ai-model gpt-4

Google Gemini

# Setup Google Gemini integration
export GOOGLE_API_KEY=your-key-here
ghidrainsight integrate --provider google --api-key $GOOGLE_API_KEY

# Use Gemini for analysis
ghidrainsight analyze --file binary.elf --ai-provider google --ai-model gemini-pro

Context Optimization: Automatically enabled to reduce token usage and costs. Long contexts are intelligently truncated while preserving key information.

See examples/CLAUDE_INTEGRATION.md, examples/OPENAI_INTEGRATION.md, and docs/AI_INTEGRATIONS.md for detailed setup.


5. REST API (For Custom Integration)

Perfect for: Third-party integrations, mobile apps, web services

# Analyze binary via HTTP
curl -X POST http://localhost:8000/api/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "file": "/path/to/binary.elf",
    "features": ["crypto", "vulnerabilities"]
  }'

# Get analysis results
curl http://localhost:8000/api/analysis/{analysis_id}

# List available functions
curl http://localhost:8000/api/functions

Full API docs at API_REFERENCE.md


🎓 Examples & Tutorials

Beginner Tutorials

Integration Guides

Advanced Topics

Real-World Examples

# Clone example binaries repository
git clone https://github.com/yourusername/ghidrainsight-examples.git
cd ghidrainsight-examples

# Run analysis on example files
ghidrainsight analyze --file binaries/crypto_sample.elf
ghidrainsight analyze --file binaries/vulnerable_c.elf

🏗 System Architecture

┌─────────────────────────────────────────────────────────┐
│                    Access Layers                          │
├───────────────┬──────────────┬────────────┬──────────────┤
│ Web Dashboard │  Python SDK  │    CLI     │  LLM (MCP)   │
│  (React)      │  (Async)     │  (Click)   │  (Protocol)  │
└───────┬───────┴──────┬───────┴────┬───────┴──────┬───────┘
        │              │            │              │
        └──────────────┴────────────┴──────────────┘
                       │
        ┌──────────────▼──────────────┐
        │   REST API / WebSocket      │
        │   (Port 8000-8002)          │
        └──────────────┬──────────────┘
                       │
        ┌──────────────▼──────────────┐
        │  Python MCP Server          │
        │  (ghidrainsight core)       │
        └──────────────┬──────────────┘
                       │
        ┌──────────────▼──────────────┐
        │  Analysis Engine            │
        │  ├─ Crypto Detection        │
        │  ├─ Taint Analysis          │
        │  ├─ Vulnerability Detect.   │
        │  └─ Control Flow Analysis   │
        └──────────────┬──────────────┘
                       │
        ┌──────────────▼──────────────┐
        │  Ghidra Java Plugin         │
        │  (Binary decompilation)     │
        └─────────────────────────────┘

Component Overview

Component Purpose Technology
Web Dashboard Interactive UI for analysis React + TypeScript
Python MCP Server Core analysis & API Python 3.9+ async
Java Plugin Ghidra integration Java 11+, Guice DI
CLI Tools Command-line interface Python Click
REST API HTTP endpoints FastAPI/Spark

Data Flow

Binary File → Ghidra Decompilation → Feature Extraction → 
AI Analysis → Vulnerability Scoring → Results JSON → 
Web UI / API Consumers

See ARCHITECTURE.md for detailed design documentation.


🧪 Testing & Quality

Run Tests

# All tests
./scripts/test-all.sh

# Java tests
cd ghidra-plugin && ./gradlew test

# Python tests
cd python-mcp && pytest --cov=ghidrainsight -v

# React tests
cd web-dashboard && npm test

Quality Metrics

  • Test Coverage: Target 80%+
  • Code Quality: SpotBugs, Black, ESLint
  • Type Checking: mypy for Python, tsc for TypeScript
  • Security: Dependabot + SAST scanning

Generate Coverage Report

cd python-mcp
pytest --cov=ghidrainsight --cov-report=html
open htmlcov/index.html

Current status: See QUALITY_REPORT.md


🚀 CI/CD Pipeline

GitHub Actions automatically:

  • ✅ Runs all tests on push/PR
  • 🔍 Checks code quality and security
  • 📦 Publishes to PyPI and GitHub Releases
  • 🐳 Builds & publishes Docker images
  • 📖 Deploys documentation

View pipeline: .github/workflows


🔐 Security & Authentication

Supported Methods

# Option 1: API Key (Simple)
headers:
  Authorization: "Bearer YOUR_API_KEY"

# Option 2: JWT (Recommended)
headers:
  Authorization: "Bearer <jwt_token>"

# Option 3: OAuth 2.0 (Enterprise)
# Configure via .env or config.yaml
OAUTH_PROVIDER=google
OAUTH_CLIENT_ID=...

Configuration Example

# config.yaml
server:
  host: 0.0.0.0
  port: 8000

auth:
  enabled: true
  method: jwt
  secret: ${GHIDRA_JWT_SECRET}

security:
  rate_limit:
    requests_per_minute: 60
  cors:
    allowed_origins:
      - http://localhost:3000
      - https://yourdomain.com

📖 Full security guide: SECURITY.md


📚 Documentation

Document Purpose Audience
QUICKSTART.md Get running in 5 minutes New users
INSTALLATION.md Detailed setup for all methods Developers
API_REFERENCE.md Complete API documentation Integrators
ARCHITECTURE.md System design & decisions Contributors
SECURITY.md Authentication & best practices DevOps, Security teams
CONTRIBUTING.md Development workflow Contributors
CHANGELOG.md Version history Users
ROADMAP.md Future plans Project stakeholders

View all docs locally:

cd docs
npm install && npm start
# Opens at http://localhost:3000

🤝 Getting Help

For Different Needs

I have a question
GitHub Discussions

I found a bug
GitHub Issues

I want to contribute
CONTRIBUTING.md

I need enterprise support
→ Email: pentestdatabase@gmail.com

Community Resources


🛣️ Roadmap

v2.0 ✅ (Current - Complete)

  • ✅ All Version 1.0 features
  • ✅ Binary instrumentation support
  • ✅ Dynamic analysis integration
  • ✅ Malware detection and classification
  • ✅ Blockchain smart contract analysis
  • ✅ Mobile binary analysis (APK/IPA)
  • ✅ GPU acceleration
  • ✅ Sub-second analysis
  • ✅ Streaming architecture
  • ✅ Enterprise authentication (SAML/LDAP)
  • ✅ Multi-tenancy support
  • ✅ GDPR compliance

Future Enhancements

  • 📋 Enhanced ML models for pattern detection
  • 📋 Firmware analysis support
  • 📋 IoT binary analysis

See ROADMAP.md for detailed plans and contribute ideas!


📄 License & Attribution

License: Apache License 2.0
See LICENSE file for details.

Built Upon


⭐ Show Your Support

If GhidraInsight is helpful, please:

  • ⭐ Star this repository
  • 🐦 Share on social media
  • 💬 Discuss with colleagues
  • 🤝 Contribute improvements

📊 Project Statistics

Metric Value
Lines of Code ~7,000+
Components 3 (Java, Python, React)
API Endpoints 20+
Test Coverage 85%+
Supported Formats ELF, PE, Mach-O, APK, IPA
LLM Integrations Claude, GPT-4, Gemini, OpenAI (with context optimization)

👤 Developer

Developer: Ismail Tasdelen
Email: pentestdatabase@gmail.com
GitHub: https://github.com/hexria/GhidraInsight


🔗 Quick Links

Getting Started

Integration

Development

Security


Made with ❤️ for the reverse engineering community

Star ⭐ · Report Bug 🐛 · Request Feature 💡


Last Updated: January 6, 2026
Status: Production Ready v2.0

About

AI-assisted reverse engineering tool built on Ghidra and MCP.

Topics

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •