AgentFlow

AgentFlow is a Python-based framework for simulating and evaluating a multi-agent customer-support flow. It compares flawed (v0) and improved (v1) routing/agent prompts, logs conversations, computes quality metrics, and provides artifacts (analysis + visualization) to demonstrate improvements.

Key Features

🎯 Router + specialized agents (Search, Policy, Complaint, Booking, Closer) with v0/v1 prompt sets
🧪 Conversation simulation against predefined scenarios (5 conversations, 4–6 turns each)
📊 Metrics for routing accuracy, flow adherence, tool-call correctness, latency, specialization, and more
🔍 Failure analysis highlighting v0 issues and v1 fixes
🖼️ Visual flow diagram (HTML) and prompt analysis report

Repository Structure

main.py — runs simulations for v0 vs v1 prompts, logs turns, computes metrics, prints summaries.
Documentation/ANALYSIS.md — detailed failure patterns, improved prompts, and expected improvements.
Documentation/index.html — visual flow/architecture of the router + sub-agents.
requirements.txt — Python dependencies.
visualizer.py — optional graph visualizer (mermaid PNG) using chatbot.workflow (requires v4 and src.graph_compiler which are not included in this snapshot).
utils/cli.py — simple colored CLI print helpers.
readme.md — original minimal README (superseded by this document).

Prerequisites

Python 3.10+ recommended
pip (or uv/poetry) for dependency management

Installation

git clone https://github.com/ReWar1311/AgentFlow.git
cd AgentFlow
python -m venv .venv && source .venv/bin/activate   # or use your preferred venv
pip install -r requirements.txt

Configuration

The sample code in main.py currently uses a hardcoded API key and endpoint:

API_ENDPOINT = "https://research-interns.openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview"
API_KEY = "cc3b2035419042a381b6d95df5585085"

You should replace these with your own secure values (e.g., via environment variables) before running:

export OPENAI_API_KEY="your-key"
export OPENAI_API_ENDPOINT="https://your-endpoint/..."

Then update main.py to read from environment variables instead of the hardcoded values.
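A minimal sketch of that change, assuming the two variable names exported above. The helper name load_api_config is hypothetical and does not exist in main.py; it only shows one way to fail early when a value is missing.

```python
import os


def load_api_config(env=None):
    """Return (endpoint, key) read from environment variables.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    required = ("OPENAI_API_KEY", "OPENAI_API_ENDPOINT")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["OPENAI_API_ENDPOINT"], env["OPENAI_API_KEY"]
```

In main.py, the two hardcoded assignments could then be replaced with `API_ENDPOINT, API_KEY = load_api_config()`.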

Running the Evaluation

python main.py

What happens:

Simulates 5 conversation scenarios for v0 (flawed) and v1 (improved) prompts.
Prints routed agent, responses, tool calls, and per-turn latencies.
Computes and prints 10 metrics (7 required + 3 creative).
Keeps the log and metrics objects in memory and prints them to the console.
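As a rough illustration of how two of the metrics (routing accuracy and latency) could be derived from per-turn logs, here is a small sketch. The record fields expected_agent, routed_agent, and latency_s are assumptions made for this example and are not the actual log schema used by main.py.

```python
from statistics import mean


def routing_accuracy(turns):
    """Fraction of turns where the router picked the expected agent."""
    if not turns:
        return 0.0
    correct = sum(1 for t in turns if t["routed_agent"] == t["expected_agent"])
    return correct / len(turns)


def mean_latency(turns):
    """Average per-turn latency in seconds (0.0 for an empty log)."""
    return mean(t["latency_s"] for t in turns) if turns else 0.0
```

Running both functions over the v0 logs and the v1 logs separately would give a side-by-side comparison of the two prompt sets.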

Artifacts & Documentation

Prompt and failure analysis: see Documentation/ANALYSIS.md.
Flow visualization (static HTML): open Documentation/index.html in a browser.

Mermaid PNG visualization (optional):

python visualizer.py graph        # may require missing modules (v4, src.graph_compiler) to be present

Notes & Limitations

The visualizer references v4.chatbot and src.graph_compiler, which are not included in this snapshot; ensure those modules exist or adjust imports accordingly.
The hardcoded API key in main.py should be removed/secured before sharing or deploying.
No automated tests are included; add unit tests if you extend the routing or agent logic.
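For the first note above, one way to make the optional visualizer fail gracefully is to check for its dependencies before using them. This is a sketch only: the helper name missing_modules is hypothetical, and the module names are taken from this README rather than from the visualizer source.

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError, so this
            # also covers a missing parent package such as "v4".
            missing.append(name)
    return missing
```

visualizer.py could call `missing_modules(["v4.chatbot", "src.graph_compiler"])` at startup and print a short message instead of a traceback when the list is not empty.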

Roadmap Ideas

Externalize configuration (API keys/endpoints, model params) via .env or CLI flags.
Add test coverage for routing accuracy and tool-call discipline.
Parameterize scenarios and metrics export (JSON/CSV) for downstream analysis.
Bundle a CLI/Streamlit demo for interactive runs.
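The JSON/CSV export idea could start from something like the sketch below. It assumes the metrics are held as a nested dict of the form {prompt_version: {metric_name: value}}; that shape, the function name export_metrics, and the output file names are all assumptions for illustration, not part of the current code.

```python
import csv
import json
from pathlib import Path


def export_metrics(metrics, out_dir):
    """Write metrics to metrics.json and metrics.csv inside out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    json_path = out / "metrics.json"
    csv_path = out / "metrics.csv"

    # JSON keeps the nested structure as-is.
    json_path.write_text(json.dumps(metrics, indent=2), encoding="utf-8")

    # CSV flattens it to one row per (prompt version, metric) pair.
    with csv_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["prompt_version", "metric", "value"])
        for version, values in metrics.items():
            for name, value in values.items():
                writer.writerow([version, name, value])
    return json_path, csv_path
```

The long CSV layout (one metric per row) is chosen so that v0 and v1 rows can be pivoted or plotted downstream without knowing the metric names in advance.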

License

(Please add a LICENSE file to clarify usage.)
