Autonomous agentic QA system that integrates with CI to analyze PRs, generate test suites, and execute them with computer-use agents.
- PR → Tests automatically: submitting a PR triggers a pipeline that uses LLMs to analyze the changes and synthesize suites (e.g., personas such as senior users, or categories such as authentication) with tests that humans often miss (e.g., edge cases, race conditions)
- Multi-suite execution: Up to 4 suites run concurrently on pre-configured VMs that computer-use agents can navigate via Cua
- Evidence and traceability: Video recordings and condensed step logs for each test to facilitate bug reproduction and fixes
- API-driven: FastAPI service to run a single suite or a full result; simple to integrate with any CI provider
- CI calls the pipeline for the PR
- Pipeline analyzes PR diff + codebase summary to generate scenarios
- Scenarios are saved to Supabase as:
  - `results` (one per PR)
  - `suites` (categories, up to 4)
  - `tests` (2–3 high-value tests per suite where meaningful)
- Pipeline triggers the Agent Runner API (`POST /run-result`) to execute all suites
- Agents interact with a browser VM, record videos, and stream steps back
- The API updates `tests`, then aggregates into `results.overall_result` and `run_status`
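The CI-side hand-off to the Agent Runner can be sketched in Python. This is a minimal sketch: only the `/run-result` path and the `{"result_id": ...}` payload come from this README; sending the request and polling `results.run_status` afterwards is left to the caller.

```python
import json
import urllib.request


def build_run_result_request(endpoint: str, result_id: int) -> urllib.request.Request:
    """Build the POST /run-result call the pipeline sends after writing
    scenarios to Supabase. `endpoint` is the QAI_ENDPOINT base URL."""
    return urllib.request.Request(
        f"{endpoint}/run-result",
        data=json.dumps({"result_id": result_id}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Dispatch it with `urllib.request.urlopen(build_run_result_request(endpoint, result_id))` from any CI job that can reach the API.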
```
backend/
  agents/     # FastAPI service, agent runner, DB integration
  cicd/       # CI pipeline scripts
  tests/      # Python tests
frontend/     # Next.js dashboard
API.md        # Detailed API and DB schema
README.md     # This file
```
- `backend/agents/main.py`: FastAPI service with endpoints:
  - `GET /health`
  - `POST /run-suite` (single suite by `suite_id`)
  - `POST /run-result` (run all suites for a `result_id`)
  - `POST /run-agents` (submit explicit `test_specs`)
- `backend/agents/runner.py`: orchestrates agent runs, recording, and DB updates
  - `run_suites_for_result(result_id)`: assigns containers and runs suites concurrently
  - `run_single_agent(spec)`: executes all tests for a suite
- `backend/agents/database.py`: Supabase helpers for `results`, `suites`, `tests`
- `backend/cicd/qai-pipeline.js`: CI entrypoint
  - Generates scenarios with OpenAI (a rich `summary` for each test)
  - Writes to Supabase (`results`/`suites`/`tests`)
  - Calls `POST /run-result` and verifies the final state
  - Mirrors files to `backend/artifacts/agent/<runId>`
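The concurrent fan-out in `run_suites_for_result` can be sketched with `asyncio`. The container names and the round-robin assignment are assumptions; the real implementation drives browser VMs via the CUA provider rather than this placeholder coroutine.

```python
import asyncio

# Stand-ins for CUA_CONTAINER_1..CUA_CONTAINER_4 (names are assumed)
CONTAINERS = ["cua-1", "cua-2", "cua-3", "cua-4"]


async def run_single_agent(spec: dict, container: str) -> dict:
    """Placeholder for the real agent run: executes all tests for one suite."""
    await asyncio.sleep(0)  # stand-in for browser-VM work
    return {"suite_id": spec["suite_id"], "container": container, "run_status": "done"}


async def run_suites_for_result(suites: list[dict]) -> list[dict]:
    """Assign each suite a container round-robin and run up to 4 concurrently."""
    sem = asyncio.Semaphore(len(CONTAINERS))

    async def run(i: int, spec: dict) -> dict:
        async with sem:
            return await run_single_agent(spec, CONTAINERS[i % len(CONTAINERS)])

    return await asyncio.gather(*(run(i, s) for i, s in enumerate(suites)))
```

`asyncio.gather` preserves input order, so results line up with the suites that produced them even when runs finish out of order.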
- Python 3.11+
- Node.js 18+
- Supabase project
- OpenAI API key
- Agent compute provider (CUA) credentials
- AWS S3 bucket
Create a `.env` in the repo root (consumed by both the backend and cicd):
Backend:
- `SUPABASE_URL` / `SUPABASE_KEY`
- `CUA_API_KEY` (agent provider API key)
- `CUA_MODEL` (e.g., `anthropic/claude-3-5-sonnet-20241022`)
- `CUA_CONTAINER_1`..`CUA_CONTAINER_4` (names/IDs of up to 4 agent containers)
- `AWS_REGION`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `S3_BUCKET_NAME` (optional, for recording uploads)
Pipeline (CI):
- `OPENAI_API_KEY`
- `GITHUB_TOKEN`
- `GITHUB_REPOSITORY`, `GITHUB_EVENT_PATH` (GitHub Actions)
- `QAI_ENDPOINT` (e.g., `http://localhost:8000` for the API)
- `DEPLOYMENT_URL` (target app base URL; used in summaries)
- `AGENT_TIMEOUT` (ms; default `600000`)
See API.md for canonical DDL. Current tables include:
- `results`: `id`, `pr_link`, `pr_name`, `overall_result` (jsonb), `run_status`
- `suites`: `id`, `result_id`, `name`
- `tests`: `id`, `suite_id`, `name`, `summary`, `test_success`, `run_status`, `steps` (jsonb), `s3_link`
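How per-test rows roll up into `results.overall_result` is not fully specified here; one plausible aggregation looks like the following. The `test_success` and `run_status` columns come from the schema, while the keys of the returned jsonb payload and the `"completed"` status value are assumptions.

```python
def aggregate_result(tests: list[dict]) -> dict:
    """Fold per-test rows into a results-level summary (jsonb shape assumed)."""
    finished = [t for t in tests if t.get("run_status") == "completed"]
    passed = sum(1 for t in finished if t.get("test_success"))
    return {
        "total": len(tests),
        "completed": len(finished),
        "passed": passed,
        "failed": len(finished) - passed,
        "all_passed": bool(finished) and passed == len(finished),
    }
```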
- Install dependencies
```bash
# Python backend
cd backend/agents
python -m venv .venv && . .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt

# Node cicd
cd ../cicd
npm ci
```
- Start the API
```bash
# From repo root
python -m uvicorn backend.agents.main:app --host 0.0.0.0 --port 8000
```
- Run the pipeline end-to-end
```bash
# From backend/cicd
node qai-pipeline.js full
# or
node qai-pipeline.js analyze
node qai-pipeline.js test
```
If suites/tests have already been created in Supabase (e.g., via the pipeline), you can run them by `result_id`:
```bash
curl -X POST "$QAI_ENDPOINT/run-result" \
  -H 'Content-Type: application/json' \
  -d '{"result_id": 123}'
```
```bash
cd frontend
npm ci
npm run dev
# open http://localhost:3000
```