Mixture-of-Experts VQA, streaming-ready, and MCP-native.
ViperMCP is a mixture-of-experts (MoE) visual question‑answering (VQA) server that exposes streamable MCP tools for:
- 🔎 Visual grounding
- 🧩 Compositional image QA
- 🌐 External knowledge‑dependent image QA
It’s built on the shoulders of 🐍 ViperGPT and delivered as a FastMCP HTTP server, so it works with all FastMCP client tooling.
- ⚡ MCP-native JSON‑RPC 2.0 endpoint (`/mcp/`) with streaming
- 🧠 MoE routing across classic and modern VLMs/LLMs
- 🧰 Two tools out of the box: `viper_query(text)` & `viper_task(crops/masks)`
- 🐳 One‑command Docker or pure‑Python install
- 🔐 Secure key handling via env var or secret mount
An OpenAI API key is required. Provide it via one of the following:
- `OPENAI_API_KEY` (environment variable)
- `OPENAI_API_KEY_PATH` (path to a file containing the key)
- `?apiKey=...` HTTP query parameter (for quick local testing)
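For reference, here's a minimal sketch of that resolution on the client side. The precedence shown is an assumption (not documented server behavior), and `resolve_openai_key` is a hypothetical helper:

```python
import os
from pathlib import Path

def resolve_openai_key() -> str | None:
    """Hypothetical helper mirroring the three accepted key sources.
    The precedence here is an assumption, not documented behavior."""
    if key := os.environ.get("OPENAI_API_KEY"):
        return key
    if key_path := os.environ.get("OPENAI_API_KEY_PATH"):
        return Path(key_path).read_text().strip()
    return None  # fall back to passing ?apiKey=... on the /mcp URL
```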
Use ngrok to expose your local server:
```bash
pip install ngrok
ngrok http 8000
```
Use the ngrok URL anywhere you see `http://0.0.0.0:8000` below.
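The `ngrok` package on PyPI also exposes a Python API, so you can open the tunnel programmatically instead of via the CLI. A sketch, assuming `NGROK_AUTHTOKEN` is set in your environment:

```python
import ngrok

# Forward local port 8000 through an ngrok tunnel and print the public URL.
listener = ngrok.forward(8000, authtoken_from_env=True)
print(listener.url())  # use this URL in place of http://0.0.0.0:8000
```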
Save your key to `api.key`, then run:
```bash
docker run -i --rm \
  --mount type=bind,source=/path/to/api.key,target=/run/secrets/openai_api.key,readonly \
  -e OPENAI_API_KEY_PATH=/run/secrets/openai_api.key \
  -p 8000:8000 \
  rsherby/vipermcp:latest
```
This starts a CUDA‑enabled container serving MCP at `http://0.0.0.0:8000/mcp/`.
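A quick smoke test against the running container, using only the standard library (the `/health` and `/device` endpoints are documented under the endpoints listing below):

```python
import urllib.request

# Expect: 200 OK
with urllib.request.urlopen("http://0.0.0.0:8000/health") as resp:
    print(resp.status, resp.read().decode())

# Expect something like: {"device": "cuda"}
with urllib.request.urlopen("http://0.0.0.0:8000/device") as resp:
    print(resp.read().decode())
```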
💡 Prefer building from source? Use the included `docker-compose.yaml`. By default it reads `api.key` from the project root. If your platform injects env vars, you can also set `OPENAI_API_KEY` directly.
```bash
git clone --recurse-submodules https://github.com/ryansherby/ViperMCP.git
cd ViperMCP
bash download-models.sh

# Store your key for local dev
echo YOUR_OPENAI_API_KEY > api.key

# (recommended) activate a virtualenv / conda env
pip install -r requirements.txt
pip install -e .

# run the server
python run_server.py
```
Your server should be live at `http://0.0.0.0:8000/mcp/`.
To use OpenAI‑backed models via query param:
`http://0.0.0.0:8000/mcp?apiKey=sk-proj-XXXXXXXXXXXXXXXXXXXX`
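To avoid hardcoding the key in the URL, you can build it from the environment. A small illustrative helper:

```python
import os
from urllib.parse import urlencode

# Keep the key out of source files by reading it from the environment.
params = urlencode({"apiKey": os.environ["OPENAI_API_KEY"]})
mcp_url = f"http://0.0.0.0:8000/mcp?{params}"
```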
Pass images as base64 (shown) or as URLs:
```python
import asyncio
import base64
import io

from fastmcp import Client
from PIL import Image

# Encode the image as a base64 PNG data URI
image = Image.open('./your_image.png')
img_byte_arr = io.BytesIO()
image.save(img_byte_arr, format='PNG')
img_b64_string = base64.b64encode(img_byte_arr.getvalue()).decode('utf-8')

async def main():
    client = Client("http://0.0.0.0:8000/mcp/")
    async with client:
        await client.ping()
        tools = await client.list_tools()  # optional

        # All tool arguments go in a single dict per call
        query = await client.call_tool(
            "viper_query",
            {"query": "how many muffins can each kid have for it to be fair?",
             "image": f"data:image/png;base64,{img_b64_string}"},
        )

        task = await client.call_tool(
            "viper_task",
            {"task": "return a mask of all the people in the image",
             "image": f"data:image/png;base64,{img_b64_string}"},
        )

asyncio.run(main())
```
The OpenAI MCP integration currently accepts image URLs (not raw base64). Send the URL as `type: "input_text"`.
```python
from openai import OpenAI

client = OpenAI()
server_url = "https://your-public-host"  # placeholder: e.g., your ngrok URL
img_url = "https://example.com/your_image.png"  # placeholder image URL

response = client.responses.create(
    model="gpt-4o",
    tools=[
        {
            "type": "mcp",
            "server_label": "ViperMCP",
            "server_url": f"{server_url}/mcp/",
            "require_approval": "never",
        },
    ],
    input=[
        {"role": "system", "content": "Forward any queries or tasks relating to an image directly to the ViperMCP server."},
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "based on this image, how many muffins can each kid have for it to be fair?"},
                {"type": "input_text", "text": img_url},
            ],
        },
    ],
)
```

HTTP endpoints:
```
GET  /health          => 'OK' (200)
GET  /device          => {"device": "cuda"|"mps"|"cpu"}
GET  /mcp?apiKey=...  => 'Query parameters set successfully.'
POST /mcp/            => MCP JSON-RPC 2.0 endpoint
```
MCP tools:
```python
viper_query(query, image) -> str
# Returns a text answer to your query.

viper_task(task, image) -> list[Image]
# Returns a list of images (e.g., masks) satisfying the task.
```
- 🐊 Grounding DINO
- ✂️ Segment Anything (SAM)
- 🤖 GPT‑4o‑mini (LLM)
- 👀 GPT‑4o‑mini (VLM)
- 🧠 GPT‑4.1
- 🔭 X‑VLM
- 🌊 MiDaS (depth)
- 🐝 BERT
🧭 The MoE router picks from these based on the tool & prompt.
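The names below are illustrative, not ViperMCP internals, but as a rough mental model the router maps each decomposed sub-step of a query to the expert best suited for it:

```python
# Toy router: map a decomposed sub-step to an expert (illustrative only).
EXPERTS = {
    "locate":  "Grounding DINO",
    "segment": "Segment Anything (SAM)",
    "depth":   "MiDaS",
    "match":   "X-VLM",
    "caption": "GPT-4o-mini (VLM)",
    "reason":  "GPT-4.1",
}

def route(step: str) -> str:
    """Fall back to the general-purpose LLM for anything unrecognized."""
    return EXPERTS.get(step, "GPT-4o-mini (LLM)")

print(route("segment"))  # -> Segment Anything (SAM)
```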
This package may generate and execute code on the host. We include basic injection guards, but you must harden for production. A recommended architecture separates concerns:
```
MCP Server (Query + Image)
  => Client Server (Generate Code Request)
  => Backend Server (Generates Code)
  => Client Server (Executes Wrapper Functions)
  => Backend Server (Executes Underlying Functions)
  => Client Server (Return Result)
  => MCP Server (Respond)
```
- 🧱 Isolate codegen & execution.
- 🔒 Lock down secrets & file access.
- 🧪 Add unit/integration tests around wrappers.
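One way to approximate that isolation on a single host is to run generated code in a short-lived, time-limited subprocess. This is a sketch under those assumptions, not ViperMCP's actual guard; production deployments should add container/OS-level sandboxing on top:

```python
import subprocess
import sys
import tempfile

def run_generated_code(code: str, timeout_s: int = 10) -> str:
    """Execute untrusted generated code in a throwaway subprocess.
    Illustrative only: combine with containers, seccomp, no network,
    and an allow-list of wrapper functions in production."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    proc = subprocess.run(
        [sys.executable, "-I", path],  # -I: isolated mode (ignores env/site)
        capture_output=True, text=True, timeout=timeout_s,
    )
    return proc.stdout if proc.returncode == 0 else proc.stderr
```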
Huge thanks to the ViperGPT team:
```bibtex
@article{surismenon2023vipergpt,
  title={ViperGPT: Visual Inference via Python Execution for Reasoning},
  author={Dídac Surís and Sachit Menon and Carl Vondrick},
  journal={arXiv preprint arXiv:2303.08128},
  year={2023}
}
```
PRs welcome! Please:
- ✅ Ensure all tests in `/tests` pass
- 🧪 Add coverage for new features
- 📦 Keep docs & examples up to date
```bash
# Run with Docker (mount key file)
docker run -i --rm \
  --mount type=bind,source=$(pwd)/api.key,target=/run/secrets/openai_api.key,readonly \
  -e OPENAI_API_KEY_PATH=/run/secrets/openai_api.key \
  -p 8000:8000 rsherby/vipermcp:latest

# From source (after setup)
python run_server.py

# Hit health
curl http://0.0.0.0:8000/health

# List device
curl http://0.0.0.0:8000/device

# Use query param key (local only)
curl "http://0.0.0.0:8000/mcp?apiKey=sk-proj-XXXX..."
```

Open an issue or start a discussion. We ❤️ feedback and ambitious ideas!