[Epic] Prompt Optimization Framework – `bbeval opt` #15

## Goal

Transform `bbeval` into a **complete, production-grade prompt optimization platform** that:
- Generates **high-quality, structured prompts** in **Markdown**, **YAML**, or **SudoLang**
- **Enforces company guidelines** (attached via `--guidelines`)
- Works across **all agent types** with **≤10 LLM calls** per run
- Outputs **Copilot-ready `.prompt.md` files**

---

## Scope: 3 Modes + 1 Engine

| Mode | Use Case | Input | Output | Optimizer |
|------|--------|------|--------|----------|
| `direct-llm` | Non-agentic reviews (SQL, schema, docs) | Base `.prompt.md` + guidelines | Optimized `.prompt.md` | `BootstrapRS` |
| `dspy-agent` | ReAct, tool-calling agents | DSPy module + AgentGateway | YAML config + MCP routes | `BootstrapRS` |
| `external-wrapper` | Copilot, Codex, Claude Desktop | Mock output + refiner | Post-process `.py` + `.prompt.md` | `COPRO` / `SIMBA` |

---

## Core Requirements

1. **`bbeval opt` subcommand**
   ```bash
   bbeval opt tests.yaml base.prompt.md \
     --mode direct-llm \
     --guidelines constraints.yaml \
     --format markdown \
     --output optimized.prompt.md
   ```

2. **Guideline Reinforcement Engine** (Child Issue #101)
   - Enforces rules **during optimization**
   - Supports **Markdown**, **YAML**, **SudoLang**
   - Injects attachment content into the prompt (e.g. the `{{guidelines}}` placeholder is replaced with text extracted from an attached PDF)

3. **Universal Metric**
   - `bbeval run --jsonl` as scoring oracle
   - Supports code execution, SQL linting, semantic match

4. **Output Format (Auto-Selected)**

   | Format | When | Example |
   |-------|------|--------|
   | **Markdown** (default) | Single workflow | `.prompt.md` with headers, tools |
   | **YAML** | Multi-workflow | `workflow: [step: analyze, tools: [...] ]` |
   | **SudoLang** | Loops, state, branching | `for each file, lint(file)` |
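
The auto-selection rule in the table above could be sketched roughly as follows. This is only an illustration: `PromptSpec`, its fields, and `select_format` are hypothetical names, not part of `bbeval`, and the precedence (control flow wins over multi-workflow) is an assumption the table does not state.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical summary of what the optimized prompt needs to express."""
    workflow_count: int = 1          # number of distinct workflows in the prompt
    has_control_flow: bool = False   # loops, state, or branching


def select_format(spec: PromptSpec) -> str:
    """Pick an output format following the 'When' column of the table.

    Assumption: control flow takes precedence over workflow count.
    """
    if spec.has_control_flow:
        return "sudolang"
    if spec.workflow_count > 1:
        return "yaml"
    return "markdown"  # default: single workflow


if __name__ == "__main__":
    print(select_format(PromptSpec()))                       # single workflow
    print(select_format(PromptSpec(workflow_count=3)))       # multi-workflow
    print(select_format(PromptSpec(has_control_flow=True)))  # loops/branching
```

An explicit `--format` flag, as in the command shown earlier, would presumably override whatever such a heuristic selects.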

---

## Success Metrics

| Metric | Target |
|-------|--------|
| Manual tuning time | **90% reduction** |
| Guideline compliance | **100%** |
| LLM calls per run | **≤10** |
| Copilot-ready output | **Drop-in `.prompt.md`** |
| Format correctness | **Zero invalid outputs** |
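
One way the ≤10 LLM calls target might be enforced is a thin wrapper that counts calls and fails fast once the budget is spent. This is a minimal sketch under assumptions: `BudgetedLLM`, `CallBudgetExceeded`, and the `call_llm` callable are hypothetical and do not describe how `bbeval` or any optimizer actually tracks calls.

```python
from typing import Callable


class CallBudgetExceeded(RuntimeError):
    """Raised when an optimization run tries to exceed its LLM call budget."""


class BudgetedLLM:
    """Wraps any prompt -> completion callable and enforces a hard call limit."""

    def __init__(self, call_llm: Callable[[str], str], max_calls: int = 10):
        self._call_llm = call_llm  # hypothetical client function
        self.max_calls = max_calls
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        # Refuse the call before it is made, so the budget is never overshot.
        if self.calls >= self.max_calls:
            raise CallBudgetExceeded(
                f"LLM call budget of {self.max_calls} exhausted"
            )
        self.calls += 1
        return self._call_llm(prompt)


if __name__ == "__main__":
    # Stand-in for a real model client, used only for demonstration.
    llm = BudgetedLLM(lambda prompt: prompt.upper(), max_calls=10)
    for i in range(10):
        llm(f"candidate prompt {i}")
    print(llm.calls)
```

Raising instead of silently truncating makes a budget overrun visible in CI, which fits a target phrased as a hard limit.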

---

## Example: Optimized `.prompt.md`

```markdown
---
description: 'Review SQL schema changes'
mode: 'agent'
tools: ['runInTerminal', 'getTerminalOutput', 'edit']
---

# SQL Schema Change Reviewer

You are a fintech DBA with 15 years in high-frequency trading systems.

{{guidelines}}

## Task
Review `${selection}` for:
- Index coverage on WHERE/JOIN columns
- No `SELECT *` in production views
- Partitioning on date columns

## Instructions
1. Run `EXPLAIN ANALYZE` on critical queries
2. Check for missing indexes
3. If issue found: emit fix with `edit`
4. Persist until all checks pass

## Output

    ```diff
    - -- Missing index
    + CREATE INDEX idx_orders_date ON orders(order_date);
    ```
```
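
The `{{guidelines}}` placeholder in the example above could be filled by plain text substitution before the prompt is scored. The sketch below assumes the guideline text has already been extracted (for example from a PDF) into a string; `inject_guidelines` and `PLACEHOLDER` are hypothetical names, not an existing `bbeval` API.

```python
PLACEHOLDER = "{{guidelines}}"


def inject_guidelines(prompt_text: str, guidelines_text: str) -> str:
    """Replace the guidelines placeholder with the attached guideline text.

    Fails loudly if the placeholder is missing, so a prompt cannot silently
    ship without its guidelines.
    """
    if PLACEHOLDER not in prompt_text:
        raise ValueError(f"prompt has no {PLACEHOLDER} placeholder")
    return prompt_text.replace(PLACEHOLDER, guidelines_text.strip())


if __name__ == "__main__":
    prompt = "# SQL Schema Change Reviewer\n\n{{guidelines}}\n\n## Task\n"
    rules = "- No `SELECT *` in production views\n"
    print(inject_guidelines(prompt, rules))
```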
