Implement autonomous recipe authoring system for RecipeKit #33
Adds a fully autonomous system that generates, validates, and iteratively repairs RecipeKit scraping recipes from a single URL using AI-powered classification and evidence-based debugging.
System Architecture
Entry point:
node scripts/autoRecipe.js --url=https://example.com
Four-phase autonomous workflow:
Implementation
Core orchestrator (scripts/autoRecipe.js, 27KB):
AI prompt templates (scripts/prompts/, 12KB):
classify.md - Website topic classification with strict JSON
author-autocomplete.md - Search recipe generation
author-url.md - Detail page extraction
fixer.md - Evidence-based recipe repair
Documentation (25KB):
Integration Points
Ready for integration (placeholder implementations with TODO markers):
Working integrations:
bun run Engine/engine.js
RecipeKit Engine Constraints
System generates recipes using supported commands only:
load, api_request
store, store_attribute, store_text, store_array, store_url, json_store_text
regex, url_encode, replace
No interactive commands (click, fill, type) - the workaround uses direct search URLs with query parameters.
Statistics
Original prompt
Below is a clean, end-to-end rewrite of the full development plan, incorporating the exact Node.js Copilot SDK API you provided, plus a tight appendix with only the relevant agent-browser commands, Copilot SDK calls, and RecipeKit engine flags.
No hand-waving. No guessed APIs. This is implementable.
⸻
Autonomous Recipe Authoring for RecipeKit
(using agent-browser + Copilot SDK + Copilot CLI)
Objective
Build a fully autonomous system that, given a single URL, can:
1. Infer the main semantic topic of a website.
2. Decide where to store the recipe (folder + filename).
3. Automatically generate a valid RecipeKit recipe:
• First autocomplete_steps
• Then url_steps (detail)
4. Validate correctness by generating tests and running the RecipeKit engine.
5. Iteratively repair the recipe using:
• More web probing via agent-browser
• Copilot acting as author and fixer
6. Stop only when tests pass or a hard failure condition is reached.
Entrypoint:
node scripts/autoRecipe.js --url=https://some-website.com
No list type flags. No manual hints. Autonomy means autonomy.
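A minimal sketch of how the entrypoint could read its single flag. The helper name parseUrlArg and the http/https restriction are assumptions for illustration, not part of the plan above.

```javascript
// Hypothetical helper: extract and validate the --url flag from an argv array.
// In the real script the input would be process.argv.slice(2).
function parseUrlArg(argv) {
  const flag = argv.find((arg) => arg.startsWith("--url="));
  if (!flag) {
    throw new Error("Missing required flag: --url=<address>");
  }
  const raw = flag.slice("--url=".length);
  let parsed;
  try {
    parsed = new URL(raw); // throws on malformed input
  } catch {
    throw new Error(`Not a valid URL: ${raw}`);
  }
  // Assumption: only web pages are valid targets.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed;
}

const target = parseUrlArg(["--url=https://some-website.com"]);
console.log(target.hostname);
```

Failing fast here keeps every later phase free of URL sanity checks.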
⸻
System roles (non-negotiable)
agent-browser
The ground truth probe.
It navigates the website, reaches the correct UI state, and extracts minimal, structured evidence.
Copilot (via Copilot SDK + Copilot CLI)
The reasoning agent.
It:
• infers topic
• proposes folder
• writes RecipeKit steps
• fixes broken recipes based on test failures and new evidence
autoRecipe.js
The judge and orchestrator.
It:
• validates Copilot output
• enforces naming and schema rules
• writes files
• generates tests
• runs the RecipeKit engine
• controls the repair loop
Copilot proposes.
Your script disposes.
⸻
Phase 0: Repo structure (once)
Add the following structure:
recipes/
/
.json
tests/
generated/
/
.autocomplete.test.ts
.url.test.ts
scripts/
autoRecipe.js
prompts/
classify.md
author-autocomplete.md
author-url.md
fixer.md
Everything the agent generates must be committed or explicitly rejected.
⸻
Phase 1: Initial web probing and topic inference
Step 1.1 – Probe the website (agent-browser)
Given --url, the script:
1. Opens the page.
2. Takes a snapshot with refs.
3. Extracts:
• page title
• meta description
• main heading
• 1 representative “content card” if visible
• any JSON-LD structured data if present
This becomes the site fingerprint.
Do not dump full HTML.
Minimal, relevant evidence only.
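One way to keep the fingerprint minimal is to copy only the listed fields and clip each one. This is a sketch under assumptions: the raw input shape, the field names, and the 300-character limit are all hypothetical, and the actual agent-browser output format is not shown here.

```javascript
// Hypothetical: `raw` stands in for whatever the probing step returns.
// Only the fields named in Step 1.1 are copied; everything else is dropped.
function buildFingerprint(raw, maxLen = 300) {
  // Collapse whitespace and clip long strings so the prompt stays small.
  const clip = (value) =>
    typeof value === "string"
      ? value.trim().replace(/\s+/g, " ").slice(0, maxLen)
      : null;
  return {
    url: raw.url,
    title: clip(raw.title),
    metaDescription: clip(raw.metaDescription),
    mainHeading: clip(raw.mainHeading),
    contentCard: clip(raw.contentCard), // one representative card, if visible
    jsonLd: Array.isArray(raw.jsonLd) ? raw.jsonLd.slice(0, 1) : [],
  };
}

const fingerprint = buildFingerprint({
  url: "https://some-website.com",
  title: "  Some   Website ",
  html: "<html>...</html>", // ignored: full HTML is never forwarded
});
console.log(JSON.stringify(fingerprint, null, 2));
```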
⸻
Step 1.2 – Ask Copilot to classify and choose storage
Create a Copilot session using the real SDK:
const client = new CopilotClient();
await client.start();
const session = await client.createSession({
  model: "gpt-5",
  systemMessage: {
    // Template literal: the system prompt spans several lines.
    content: `
You are an autonomous agent that classifies websites and authors RecipeKit scraping recipes.
Always respond with STRICT JSON. No prose.
`
  }
});
Send the classification prompt:
await session.send({
prompt: `
Given the following website fingerprint, infer:
Website fingerprint:
<fingerprint_json_here>
Respond with:
{
"topic": "...",
"folder": "...",
"confidence": 0.0,
"rationale": "..."
}
`
});
Wait for assistant.message and session.idle.
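Once the assistant message arrives, the script still has to confirm that it really is strict JSON with the four requested keys. A minimal sketch follows. The function name parseClassification and the assumption that confidence lies in the range 0 to 1 are mine, and the code does not touch the SDK at all: it only inspects the message text.

```javascript
// Hypothetical validator for the classification reply described above.
function parseClassification(text) {
  let data;
  try {
    data = JSON.parse(text);
  } catch {
    return { ok: false, error: "not valid JSON" };
  }
  if (data === null || typeof data !== "object" || Array.isArray(data)) {
    return { ok: false, error: "expected a JSON object" };
  }
  for (const key of ["topic", "folder", "rationale"]) {
    if (typeof data[key] !== "string" || data[key].trim() === "") {
      return { ok: false, error: `"${key}" must be a non-empty string` };
    }
  }
  // Assumption: confidence is a probability-like score in [0, 1].
  if (typeof data.confidence !== "number" || data.confidence < 0 || data.confidence > 1) {
    return { ok: false, error: '"confidence" must be a number between 0 and 1' };
  }
  return { ok: true, value: data };
}

const reply = '{"topic":"film reviews","folder":"movies","confidence":0.9,"rationale":"..."}';
console.log(parseClassification(reply).ok);
```

Returning an error string rather than throwing makes it easy to feed the reason back into a follow-up fix request.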
⸻
Step 1.3 – Validate and canonicalize folder (script-side)
Copilot does not get the final say.
Your script enforces:
• lowercase
• [a-z0-9-] only
• max 32 chars
• canonical mappings:
• film, cinema → movies
• novel, reading → books
• cooking, food → recipes
• shop, ecommerce → products
If invalid:
• send a fix request back to Copilot
• loop until valid or fail hard
Final storage path:
recipes//.json
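The folder rules above can be sketched as a single function. The name canonicalizeFolder is hypothetical, and returning null to signal "ask Copilot for a fix" is one possible convention, not something the plan prescribes.

```javascript
// Canonical mappings taken from the list in Step 1.3.
const CANONICAL_FOLDERS = new Map([
  ["film", "movies"], ["cinema", "movies"],
  ["novel", "books"], ["reading", "books"],
  ["cooking", "recipes"], ["food", "recipes"],
  ["shop", "products"], ["ecommerce", "products"],
]);

// Returns the canonical folder name, or null when the proposal is invalid
// and a fix request should be sent back to Copilot.
function canonicalizeFolder(proposed) {
  if (typeof proposed !== "string") return null;
  const lowered = proposed.trim().toLowerCase();
  const folder = CANONICAL_FOLDERS.get(lowered) ?? lowered;
  // Lowercase letters, digits, and hyphens only; 1 to 32 characters.
  return /^[a-z0-9-]{1,32}$/.test(folder) ? folder : null;
}

console.log(canonicalizeFolder("Cinema"));      // movies
console.log(canonicalizeFolder("video games")); // null
```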
⸻
Phase 2: Autocomplete recipe generation (closed loop)
Step 2.1 – Generate autocomplete_steps (Copilot author mode)
Provide Copilot with:
• site fingerprint
• snapshot evidence
• hints about search UI (if any)
• explicit instruction: use only supported RecipeKit commands
Copilot must respond with STRICT JSON:
{
"recipe": {
"title": "...",
"description": "...",
"engine_version": "1",
"url_available": ["..."],
"autocomplete_steps": [ ... ]
},
"testPlan": {
"queries": ["query1", "query2"]
}
}
No markdown. No explanation.
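Before anything is written to disk, the script can check the author reply against the supported command list. This is a sketch under one loud assumption: that each step is an object carrying its command name in a "command" field. The function name validateAuthorResponse is also hypothetical.

```javascript
// Supported commands as listed under "RecipeKit Engine Constraints".
const SUPPORTED_COMMANDS = new Set([
  "load", "api_request",
  "store", "store_attribute", "store_text", "store_array", "store_url", "json_store_text",
  "regex", "url_encode", "replace",
]);

// Returns a list of problems; an empty list means the reply is acceptable.
function validateAuthorResponse(text) {
  let data;
  try {
    data = JSON.parse(text);
  } catch {
    return ["response is not valid JSON"];
  }
  const errors = [];
  const steps = data?.recipe?.autocomplete_steps;
  if (!Array.isArray(steps) || steps.length === 0) {
    errors.push("recipe.autocomplete_steps must be a non-empty array");
  } else {
    steps.forEach((step, index) => {
      // Assumption: the command name lives in step.command.
      if (!SUPPORTED_COMMANDS.has(step?.command)) {
        errors.push(`step ${index}: unsupported command "${step?.command}"`);
      }
    });
  }
  const queries = data?.testPlan?.queries;
  if (!Array.isArray(queries) || queries.length === 0) {
    errors.push("testPlan.queries must be a non-empty array");
  }
  return errors;
}
```

Rejecting interactive commands such as click at this point means the engine never sees a step it cannot run.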
⸻
Step 2.2 – Write recipe and generate tests
Script writes:
recipes//.json
Then generates a test that runs:
bun run ./Engine/engine.js \
  --recipe recipes//.json \
  --type autocomplete \
  --input ""
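When the generated test invokes the engine, passing the arguments as an array avoids shell quoting problems with user-like queries. A minimal sketch: buildEngineArgs is a hypothetical helper, and the recipe path in the example is a made-up placeholder.

```javascript
// Hypothetical helper: build the argument vector for the engine command above.
// The result is meant for something like
//   child_process.execFileSync("bun", args, { encoding: "utf8" })
// which passes each element verbatim, with no shell interpolation.
function buildEngineArgs(recipePath, query) {
  return [
    "run", "./Engine/engine.js",
    "--recipe", recipePath,
    "--type", "autocomplete",
    "--input", query,
  ];
}

// Example with a placeholder path and a query containing quotes and spaces.
const args = buildEngineArgs("recipes/example/example.json", 'the "matrix" 1999');
console.log(args.join(" | "));
```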
Assertions:
• stdout parses as JSON
• result is an array
• length ≥ 3
• each item has:
• title
• URL
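The assertions above can be sketched as a plain function that returns a list of failures, which is convenient both inside a test and as input to the repair loop. The key names "title" and "url" are an assumption about the engine's output, so they are left configurable.

```javascript
// Hypothetical checker for engine stdout; returns [] when all assertions hold.
// Assumption: result items expose their fields under the keys in requiredKeys.
function checkAutocompleteOutput(stdout, requiredKeys = ["title", "url"]) {
  let results;
  try {
    results = JSON.parse(stdout);
  } catch {
    return ["stdout is not valid JSON"];
  }
  if (!Array.isArray(results)) {
    return ["result is not an array"];
  }
  const failures = [];
  if (results.length < 3) {
    failures.push(`expected at least 3 results, got ${results.length}`);
  }
  results.forEach((item, index) => {
    for (const key of requiredKeys) {
      if (typeof item?.[key] !== "string" || item[key] === "") {
        failures.push(`item ${index}: missing "${key}"`);
      }
    }
  });
  return failures;
}
```

Collecting every failure instead of stopping at the first gives the fixer prompt more evidence per iteration.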
⸻
Step 2.3 – Run test and classify failures
If test fails, classify:
• selector missing
• JS-rendered content
• wrong URL pattern
• search flow incomplete
• bot wall / consent gate
This classification is done by the script, not Copilot.
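A script-side classifier could be as simple as an ordered set of heuristics over the evidence collected for the failed run. Everything in this sketch is an assumption made for illustration: the evidence field names, the status codes, and the text patterns are hypothetical and would need tuning against real sites.

```javascript
// Hypothetical evidence shape:
//   httpStatus            - status of the page load
//   pageText              - visible text of the page
//   selectorInRawHtml     - recipe selector matched the fetched HTML
//   selectorInRenderedDom - recipe selector matched the browser-rendered DOM
function classifyFailure(evidence) {
  const text = (evidence.pageText || "").toLowerCase();
  // Blocking pages are checked first because they invalidate all other signals.
  if ([403, 429].includes(evidence.httpStatus) ||
      /captcha|verify you are human|cookie consent/.test(text)) {
    return "bot wall / consent gate";
  }
  if (evidence.httpStatus === 404) {
    return "wrong URL pattern";
  }
  if (!evidence.selectorInRawHtml && evidence.selectorInRenderedDom) {
    return "JS-rendered content";
  }
  if (!evidence.selectorInRawHtml && !evidence.selectorInRenderedDom) {
    return "selector missing";
  }
  // The selector matched but the test still failed: assume the search
  // flow itself did not reach a results state.
  return "search flow incomplete";
}

console.log(classifyFailure({ httpStatus: 200, selectorInRawHtml: false, selectorInRenderedDom: true }));
```

Keeping the rules deterministic means the same failure always produces the same label, so the repair loop can count repeats and stop when it is not making progress.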
⸻
Step 2.4 – Repair loop (autocomplete)
If more info is needed:
1. agent-brow...