This system provides an agentic AI experience, as if you were leading a penetration test team!
VibeHackAI is an interactive penetration testing support system that leverages Claude Code's agent capabilities and MCP (Model Context Protocol). Four specialized agents (Planner, Reconnaissance, Enumeration, Exploitation) work in coordination with an Orchestrator to execute safe and efficient security assessments under human supervision.
- vs. autonomous penetration tools: VibeHackAI combines AI and human reasoning to prevent uncontrolled AI behavior. The human reviews the AI's plan, validates the logic, and provides course corrections before any action is taken.
- vs. PentestGPT-style tools: While PentestGPT requires humans to manually type and execute every command, VibeHackAI's AI handles command execution across all testing phases. Humans focus on strategic decisions rather than operational details.
The result: higher success rates through collaborative intelligence. Humans contribute domain expertise and judgment; AI contributes speed, consistency, and comprehensive analysis. Neither works alone; both work together.
Fully autonomous penetration testing tools face fundamental limitations:
| Problem | Impact |
|---|---|
| Scope violations | AI scans unrelated hosts without understanding authorization boundaries |
| False confidence | AI reports "confirmed" vulnerabilities that don't exist |
| Dangerous actions | AI executes destructive payloads without understanding consequences |
| Context loss | AI forgets previous findings and repeats failed approaches |
VibeHackAI addresses these issues by keeping humans in the decision loop. The AI handles analysis and suggestions; you make the final call on every significant action.
┌──────────────────────────────────────────────────────┐
│                   Human Interface                    │
│          (Approval, Interaction, Oversight)          │
└──────────────────────────┬───────────────────────────┘
                           │
┌──────────────────────────▼───────────────────────────┐
│                  Orchestrator Agent                  │     ┌──────────────────────────────────────────────┐
│               (Control Plane - Writer)               │     │               Shared Workspace               │
│     ┌─────────────┬─────────────┬─────────────┐      │     │ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│     │    State    │  Approval   │    Agent    │      │ ──▶ │ │State Store │ │  Evidence  │ │ Retrieval  │ │
│     │ Management  │    Gates    │   Routing   │      │     │ │(Normalized)│ │   Store    │ │   Cache    │ │
│     └─────────────┴─────────────┴─────────────┘      │     │ └────────────┘ └────────────┘ └────────────┘ │
└──────────────────────────┬───────────────────────────┘     └──────────────────────────────────────────────┘
                           │
       ┌───────────────────┼───────────────────┬───────────────────┐
       │                   │                   │                   │
┌──────▼─────────┐  ┌──────▼─────────┐  ┌──────▼─────────┐  ┌──────▼─────────┐
│ Reconnaissance │  │  Enumeration   │  │  Exploitation  │  │    Planner     │
│     Agent      │  │     Agent      │  │     Agent      │  │     Agent      │
└────────────────┘  └────────────────┘  └────────────────┘  └────────────────┘
| Agent | Role |
|---|---|
| Orchestrator | Control plane responsible for phase transitions, approval gates, and state management |
| Planner | CVE research, attack planning, and CVSS evaluation |
| Reconnaissance | Passive/active information gathering (OSINT, Nmap, Shodan, etc.) |
| Enumeration | Service enumeration and vulnerability candidate identification |
| Exploitation | Exploit execution based on approved plans |
- Scope Enforcement: All operations tagged with scope_tag to prevent out-of-scope access
- Approval Gates: Dangerous operations require human approval
- Evidence Management: All operation results stored in append-only Evidence Store
- Automatic Stop Conditions: Auto-halt on consecutive errors or DoS indicators
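
The scope enforcement and automatic stop conditions listed above can be illustrated with a minimal sketch. It is not VibeHackAI's actual implementation: the `ScopeGuard` class, its method names, and the threshold of three consecutive errors are assumptions made for illustration, using only Python's standard `ipaddress` module.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class ScopeGuard:
    """Hypothetical guard: permits only targets inside the approved CIDR ranges."""

    allowed_cidrs: list[str]
    max_consecutive_errors: int = 3  # assumed threshold, not taken from the project
    _errors: int = field(default=0, init=False)

    def in_scope(self, target_ip: str) -> bool:
        """Return True if target_ip falls inside any approved network."""
        addr = ipaddress.ip_address(target_ip)
        return any(
            addr in ipaddress.ip_network(cidr, strict=False)
            for cidr in self.allowed_cidrs
        )

    def record_result(self, ok: bool) -> bool:
        """Track consecutive failures; return True when the session should halt."""
        self._errors = 0 if ok else self._errors + 1
        return self._errors >= self.max_consecutive_errors


guard = ScopeGuard(["192.168.1.0/24"])
print(guard.in_scope("192.168.1.10"))  # inside the approved range
print(guard.in_scope("10.0.0.1"))      # outside the approved range
```

Checking every target against an explicit allow-list before a tool runs keeps the decision independent of whatever the model proposes, which is the property the scope tag is meant to guarantee.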
# Clone and setup
git clone https://github.com/cawa102/VibeHackAI.git
cd VibeHackAI
cp .mcp.json.example .mcp.json
pip install -e .
# Launch Claude Code and start
Please launch pentest-orchestrator.
Target: example.com
Scope: Web application assessment

| Requirement | Version | Link |
|---|---|---|
| Claude Code CLI | Latest | Installation Guide |
| Docker | Latest | docker.com |
| Python | 3.10+ | python.org |
| hexstrike-ai MCP Server | Required | Setup Guide |
Important: hexstrike-ai MCP Server must be set up before using VibeHackAI. Follow the instructions at github.com/0x4m4/hexstrike-ai
1. Setup hexstrike-ai MCP Server
First, set up the hexstrike-ai MCP server by following the instructions at:
https://github.com/0x4m4/hexstrike-ai
Make sure the server is running before proceeding.
2. Clone the Repository
git clone https://github.com/cawa102/VibeHackAI.git
cd VibeHackAI
3. MCP Configuration
Copy .mcp.json.example to .mcp.json and configure appropriately:
cp .mcp.json.example .mcp.json
Set the required environment variables:
- GITHUB_PERSONAL_ACCESS_TOKEN: Token for GitHub API
- hexstrike-ai server endpoint configuration (see hexstrike-ai docs)
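
Before launching, it can help to confirm that the configuration step above actually took effect. The following is a small pre-flight sketch, not part of VibeHackAI: the function name `check_mcp_setup` is hypothetical, and it only checks that `.mcp.json` exists and parses as JSON and that `GITHUB_PERSONAL_ACCESS_TOKEN` is set. It makes no assumption about the keys inside the file.

```python
import json
import os
from pathlib import Path


def check_mcp_setup(config_path: str = ".mcp.json", env=None) -> list[str]:
    """Return a list of human-readable problems; an empty list means the basics look fine."""
    env = os.environ if env is None else env
    problems = []

    path = Path(config_path)
    if not path.is_file():
        problems.append(f"{config_path} not found (copy .mcp.json.example first)")
    else:
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{config_path} is not valid JSON: {exc}")

    if not env.get("GITHUB_PERSONAL_ACCESS_TOKEN"):
        problems.append("GITHUB_PERSONAL_ACCESS_TOKEN is not set")

    return problems


if __name__ == "__main__":
    for problem in check_mcp_setup():
        print("WARNING:", problem)
```

The hexstrike-ai endpoint is deliberately not checked here, because its configuration format is defined by the hexstrike-ai documentation rather than by this README.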
4. Install Dependencies
pip install -e .
- Launch Claude Code
- Provide target information (IP/CIDR/Domain)
- Invoke the Orchestrator agent
Please launch pentest-orchestrator.
Target: example.com (192.168.1.0/24)
Scope: Web application assessment
                                ┌───────────────────────────────────────────────────┐
                                │                                                   │
┌───────────────┐               │  ┌───────────────┐           ┌───────────────┐    │
│     Human     │────Target─────┼─▶│ Orchestrator  │──────────▶│    Planner    │    │
└───────┬───────┘               │  └───────┬───────┘           └───────┬───────┘    │
        │                       │          │                           │            │
┌───────▼───────┐               │  ┌───────▼───────┐           ┌───────▼───────┐    │
│   Approval    │◀──PhaseBrief──┼──│  State Mgmt   │◀──Patch───│   TestPlan    │    │
└───────────────┘               │  └───────────────┘           └───────────────┘    │
                                │                                                   │
                                └──────────┬────────────────────────────────────────┘
                                           │
             ┌─────────────────────────────┼─────────────────────────────┐
             │                             │                             │
             ▼                             ▼                             ▼
┌─────────────────────────┐   ┌─────────────────────────┐   ┌─────────────────────────┐
│     RECONNAISSANCE      │   │       ENUMERATION       │   │      EXPLOITATION       │
│ ┌─────────────────────┐ │   │ ┌─────────────────────┐ │   │ ┌─────────────────────┐ │
│ │ • OSINT / Shodan    │ │   │ │ • Service Analysis  │ │   │ │ • PoC Execution     │ │
│ │ • Nmap Scanning     │ │   │ │ • Entry Points      │ │   │ │ • Metasploit        │ │
│ │ • DNS Enumeration   │ │   │ │ • Auth Boundaries   │ │   │ │ • Custom Payloads   │ │
│ └──────────┬──────────┘ │   │ └──────────┬──────────┘ │   │ └──────────┬──────────┘ │
│            │            │   │            │            │   │            │            │
│      ┌─────▼─────┐      │   │      ┌─────▼─────┐      │   │      ┌─────▼─────┐      │
│      │  Result?  │      │   │      │  Result?  │      │   │      │  Result?  │      │
│      └──┬─────┬──┘      │   │      └──┬─────┬──┘      │   │      └──┬─────┬──┘      │
│         │     │         │   │         │     │         │   │         │     │         │
│       Fail  Success     │   │       Fail  Success     │   │       Fail  Success     │
│         │     │         │   │         │     │         │   │         │     │         │
│     ┌───▼───┐ │         │   │     ┌───▼───┐ │         │   │     ┌───▼───┐ │         │
│     │ Retry │ │         │   │     │ Retry │ │         │   │     │ Retry │ │         │
│     └───────┘ │         │   │     └───────┘ │         │   │     └───────┘ │         │
└───────────────┬─────────┘   └───────────────┬─────────┘   └───────────────┬─────────┘
                │                             │                             │
                └─────────────────────────────┼─────────────────────────────┘
                                              │
                                              ▼
                           ┌─────────────────────────────────────┐
                           │       POST-EXPLOITATION LOOP        │
                           │  ┌───────────────────────────────┐  │
                           │  │ Planner evaluates:            │  │
                           │  │ • Privilege escalation?       │  │
                           │  │ • Lateral movement?           │  │
                           │  │ • Additional attack vectors?  │  │
                           │  └─────┬───────────────┬─────────┘  │
                           │        │               │            │
                           │   More Tests       Complete         │
                           │        │               │            │
                           │   ┌────▼────┐     ┌────▼─────┐      │
                           │   │   Ask   │     │  Report  │      │
                           │   │  Human  │     │ Generate │      │
                           │   └────┬────┘     └──────────┘      │
                           │        │                            │
                           │  Approved ──▶ Back to Exploitation  │
                           └─────────────────────────────────────┘
| Never Give Up | Human Controls Everything |
|---|---|
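
The control flow in the workflow above (retry within a phase, then a human-approved post-exploitation loop) can be sketched as a small driver. This is an illustrative model only, not the Orchestrator's real logic: it assumes the three phases run sequentially, and `run_phase`, `run_session`, the callback parameters, and the retry limit of three are all hypothetical.

```python
from typing import Callable

PHASES = ["reconnaissance", "enumeration", "exploitation"]


def run_phase(name: str, attempt_fn: Callable[[str, int], bool], max_retries: int = 3) -> bool:
    """Try a phase up to max_retries times; return True on the first success."""
    for attempt in range(1, max_retries + 1):
        if attempt_fn(name, attempt):
            return True
    return False


def run_session(
    attempt_fn: Callable[[str, int], bool],
    more_tests_fn: Callable[[], bool],
    approve_fn: Callable[[], bool],
    max_retries: int = 3,
) -> str:
    """Run all phases, then loop on exploitation while the planner wants more tests."""
    for phase in PHASES:
        if not run_phase(phase, attempt_fn, max_retries):
            return f"stopped: {phase} failed"

    # Post-exploitation loop: each extra round requires explicit human approval.
    while more_tests_fn():
        if not approve_fn():
            break
        if not run_phase("exploitation", attempt_fn, max_retries):
            break
    return "report"


# Usage: every phase fails once and then succeeds; the planner asks for one extra round.
pending = [True, False]
outcome = run_session(
    attempt_fn=lambda name, attempt: attempt >= 2,
    more_tests_fn=lambda: pending.pop(0),
    approve_fn=lambda: True,
)
print(outcome)
```

Passing the approval step in as a callback keeps the human decision outside the loop itself, so the same driver can be exercised in tests with a stub and in a real session with an interactive prompt.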
Each penetration testing session maintains an isolated workspace for state management, evidence collection, and reporting.
/workspace/sessions/<session_id>/
├── state/                        # Normalized state (Orchestrator write-only)
│   ├── scope.json                # Target scope definition
│   ├── target_profile.json       # Discovered target information
│   ├── candidates_vuln.json      # Vulnerability candidates
│   ├── candidates_exploit.json   # Exploit candidates
│   ├── execution_plans.json      # Approved execution plans
│   ├── findings.json             # Confirmed findings
│   └── state_version.json        # State version tracking
│
├── evidence/                     # Raw data (append-only, sha256 verified)
│   └── <evidence_id>/
│       ├── raw.<ext>             # Raw tool output
│       └── meta.json             # Metadata (timestamp, tool, params)
│
├── cache/                        # Query result cache
│   ├── cve/                      # CVE lookup cache
│   ├── snyk/                     # Snyk vulnerability cache
│   └── git/                      # Git repository cache
│
└── reports/                      # Final deliverables
    └── draft.md                  # Generated penetration test report
| Directory | Purpose | Write Policy |
|---|---|---|
| state/ | Tracks current session state, targets, and findings | Orchestrator only |
| evidence/ | Stores all raw tool outputs with integrity verification | Append-only |
| cache/ | Caches external API responses (CVE, Snyk) | Read/Write |
| reports/ | Contains final penetration test reports | Write on completion |
Note: All evidence is stored with SHA-256 hash verification to ensure integrity and reproducibility.
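
To make the append-only, hash-verified evidence layout concrete, here is a minimal sketch of how one evidence entry could be written and later checked. It follows the `raw.<ext>` and `meta.json` layout shown above, but the function names and the exact `meta.json` fields (`sha256`, `raw_file`, and so on) are assumptions rather than the project's schema, which is defined in the project's own documentation.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path


def store_evidence(evidence_root, raw: bytes, ext: str, tool: str, params: dict) -> Path:
    """Write raw tool output plus metadata into a fresh <evidence_id>/ directory."""
    evidence_id = uuid.uuid4().hex
    entry = Path(evidence_root) / evidence_id
    entry.mkdir(parents=True, exist_ok=False)  # append-only: never reuse an entry

    raw_name = f"raw.{ext}"
    (entry / raw_name).write_bytes(raw)

    meta = {
        "evidence_id": evidence_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "params": params,
        "raw_file": raw_name,
        "sha256": hashlib.sha256(raw).hexdigest(),
    }
    (entry / "meta.json").write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return entry


def verify_evidence(entry) -> bool:
    """Recompute the SHA-256 of the raw file and compare it with the recorded hash."""
    entry = Path(entry)
    meta = json.loads((entry / "meta.json").read_text(encoding="utf-8"))
    raw = (entry / meta["raw_file"]).read_bytes()
    return hashlib.sha256(raw).hexdigest() == meta["sha256"]
```

A typical call would be `store_evidence("evidence", output_bytes, "xml", "nmap", {"ports": "1-1024"})`, followed by `verify_evidence(entry)` at report time to confirm the raw output has not changed since it was captured.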
Core Documentation
| Document | Contents |
|---|---|
| CLAUDE.md | System Guidance (Main) |
| docs/001_shared_workspace.md | Shared Workspace Specification |
| docs/002_common_schema.md | Common Schema Definitions |
| docs/003_passer.md | Normalization Engine Specification |
| docs/004_patch_protocol.md | Patch Protocol Specification |
| docs/tool_manifest.yaml | Available Tools List |
Agent Specifications
| Agent | Specification |
|---|---|
| Orchestrator | .claude/agents/pentest-orchestrator.md |
| Reconnaissance | .claude/agents/reconnaissance-agent.md |
| Enumeration | .claude/agents/enumeration-agent.md |
| Planner | .claude/agents/planner-agent.md |
| Exploitation | .claude/agents/exploitation-agent.md |
- Web UI Dashboard
- Multi-target parallel scanning
- Custom plugin system
- Report template customization
- Integration with more MCP servers
Warning: Use this system only against authorized targets.
- Conduct all penetration tests with proper authorization
- Indiscriminate scanning, DoS attacks, and data exfiltration are prohibited
- This tool is for educational and authorized security testing only
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
- Report bugs
- Request features
- Submit PRs
MIT License - See LICENSE for details.
- Built on Model Context Protocol (MCP) by Anthropic
- Inspired by PentestGPT and Hexstrike
This project integrates with the following open-source MCP servers:
| Server | Repository | Description |
|---|---|---|
| GitHub MCP | github/github-mcp-server | GitHub's official MCP server |
| Filesystem MCP | @modelcontextprotocol/server-filesystem | Anthropic's official filesystem server |
| Hexstrike MCP | 0x4m4/hexstrike-ai | 150+ Tools Integration |
We thank all the developers and maintainers of these projects for their contributions to the security community!
If you find this project useful, please consider giving it a star.