This repository was archived by the owner on Nov 15, 2025. It is now read-only.
[Epic] Prompt Optimization Framework – bbeval opt #15
Open
Feature
0 of 4 issues completed
Description
Goal
Transform bbeval into a complete, production-grade prompt optimization platform that:
- Generates high-quality, structured prompts in Markdown, YAML, or SudoLang
- Enforces company guidelines (attached via `--guidelines`)
- Works across all agent types with <10 LLM calls
- Outputs Copilot-ready `.prompt.md` files
Scope: 3 Modes + 1 Engine
| Mode | Use Case | Input | Output | Optimizer |
|---|---|---|---|---|
| `direct-llm` | Non-agentic reviews (SQL, schema, docs) | Base `.prompt.md` + guidelines | Optimized `.prompt.md` | BootstrapRS |
| `dspy-agent` | ReAct, tool-calling agents | DSPy module + AgentGateway | YAML config + MCP routes | BootstrapRS |
| `external-wrapper` | Copilot, Codex, Claude Desktop | Mock output + refiner | Post-process `.py` + `.prompt.md` | COPRO / SIMBA |
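One way to picture the mode table in code is a small lookup from mode name to candidate optimizers. This is a minimal sketch, not bbeval's actual implementation: `ModeSpec`, `MODES`, and `optimizers_for` are hypothetical names, and reading "COPRO / SIMBA" as two alternative optimizers is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeSpec:
    """One row of the mode table (hypothetical structure)."""
    use_case: str
    optimizers: tuple  # optimizer names as listed in the table


# Hypothetical registry mirroring the table above; not part of bbeval today.
MODES = {
    "direct-llm": ModeSpec("Non-agentic reviews (SQL, schema, docs)", ("BootstrapRS",)),
    "dspy-agent": ModeSpec("ReAct, tool-calling agents", ("BootstrapRS",)),
    # Assumption: "COPRO / SIMBA" means either optimizer may be chosen.
    "external-wrapper": ModeSpec("Copilot, Codex, Claude Desktop", ("COPRO", "SIMBA")),
}


def optimizers_for(mode: str) -> tuple:
    """Return the optimizer names listed for a mode, or fail on an unknown mode."""
    try:
        return MODES[mode].optimizers
    except KeyError:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(MODES)}") from None
```

Keeping the table in one place like this would let the `--mode` flag be validated against the same data that selects the optimizer.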
Core Requirements
- `bbeval opt` subcommand: `bbeval opt tests.yaml base.prompt.md --mode direct-llm --guidelines constraints.yaml --format markdown --output optimized.prompt.md`
- Guideline Reinforcement Engine (Child Issue #101)
  - Enforces rules during optimization
  - Supports Markdown, YAML, SudoLang
  - Injects attachments (`{{guidelines}}` → PDF text)
- Universal Metric
  - `bbeval run --jsonl` as scoring oracle
  - Supports code execution, SQL linting, semantic match
- Output Format (Auto-Selected)

| Format | When | Example |
|---|---|---|
| Markdown (default) | Single workflow | `.prompt.md` with headers, tools |
| YAML | Multi-workflow | `workflow: [step: analyze, tools: [...] ]` |
| SudoLang | Loops, state, branching | `for each file, lint(file)` |
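The auto-selection rule for the output format could be sketched roughly as follows. This is only an illustration under stated assumptions: `select_format` and its parameters are hypothetical, and the precedence (control flow wins over workflow count) is not specified in this issue. An explicit `--format` flag would presumably override whatever this returns.

```python
def select_format(workflow_count: int, needs_control_flow: bool = False) -> str:
    """Pick an output format from the rules in the Output Format table.

    Assumed precedence (not defined in the issue):
    loops/state/branching -> SudoLang, else multi-workflow -> YAML,
    else the Markdown default.
    """
    if workflow_count < 1:
        raise ValueError("workflow_count must be at least 1")
    if needs_control_flow:
        return "sudolang"
    if workflow_count > 1:
        return "yaml"
    return "markdown"


# A single workflow without loops falls back to the default format.
print(select_format(1))
print(select_format(3))
print(select_format(1, needs_control_flow=True))
```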
Success Metrics
| Metric | Target |
|---|---|
| Manual tuning time | 90% reduction |
| Guideline compliance | 100% |
| LLM calls per run | ≤10 |
| Copilot-ready output | Drop-in .prompt.md |
| Format correctness | Zero invalid outputs |
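The "LLM calls per run ≤10" target could be enforced mechanically instead of only measured after the fact. Below is a minimal sketch of a call-counting wrapper. `BudgetedClient` and `CallBudgetExceeded` are hypothetical names, and `call_llm` stands in for whatever function actually sends a prompt to a model.

```python
class CallBudgetExceeded(RuntimeError):
    """Raised when an optimization run tries to exceed its LLM call budget."""


class BudgetedClient:
    """Wrap an LLM-calling function and refuse calls beyond max_calls."""

    def __init__(self, call_llm, max_calls: int = 10):
        self._call_llm = call_llm  # hypothetical: any callable taking a prompt string
        self.max_calls = max_calls
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        # Check before calling so the underlying model is never hit past the budget.
        if self.calls >= self.max_calls:
            raise CallBudgetExceeded(f"budget of {self.max_calls} LLM calls exhausted")
        self.calls += 1
        return self._call_llm(prompt)


# Usage with a stand-in model function (no real LLM is contacted here).
client = BudgetedClient(lambda prompt: prompt.upper(), max_calls=10)
for _ in range(10):
    client("score this prompt")
print(client.calls)  # 10; an eleventh call would raise CallBudgetExceeded
```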
Example: Optimized .prompt.md
---
description: 'Review SQL schema changes'
mode: 'agent'
tools: ['runInTerminal', 'getTerminalOutput', 'edit']
---
# SQL Schema Change Reviewer
You are a fintech DBA with 15 years in high-frequency trading systems.
{{guidelines}}
## Task
Review `${selection}` for:
- Index coverage on WHERE/JOIN columns
- No `SELECT *` in production views
- Partitioning on date columns
## Instructions
1. Run `EXPLAIN ANALYZE` on critical queries
2. Check for missing indexes
3. If issue found: emit fix with `edit`
4. Persist until all checks pass
## Output
```diff
- -- Missing index
+ CREATE INDEX idx_orders_date ON orders(order_date);
```
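The `{{guidelines}}` placeholder in the example prompt is meant to be replaced with the attached guideline text. A minimal sketch of that substitution is shown below, assuming the guideline text has already been extracted (for example from a PDF) into a plain string. `inject_guidelines` is a hypothetical helper, not an existing bbeval function.

```python
PLACEHOLDER = "{{guidelines}}"


def inject_guidelines(template: str, guidelines_text: str) -> str:
    """Replace the {{guidelines}} placeholder with already-extracted guideline text.

    Fails loudly if the placeholder is missing, so guidelines are never
    silently dropped from the optimized prompt.
    """
    if PLACEHOLDER not in template:
        raise ValueError(f"template has no {PLACEHOLDER} placeholder")
    return template.replace(PLACEHOLDER, guidelines_text.strip())


template = "# SQL Schema Change Reviewer\n{{guidelines}}\n## Task\n"
print(inject_guidelines(template, "- No SELECT * in production views\n"))
```

Raising on a missing placeholder is a design choice in this sketch: it supports the 100% guideline compliance target by making a prompt without guidelines an error, not a warning.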