Autonomous agentic QA system that integrates with CI to analyze PRs, generate test suites, and execute them with computer-use agents.
- PR → Tests automatically: submitting a PR triggers a pipeline that uses LLMs to analyze the changes and synthesize suites (e.g., personas such as senior users, or categories such as authentication) with tests that humans often miss (e.g., edge cases, race conditions)
- Multi-suite execution: Up to 4 suites run concurrently on pre-configured VMs that computer-use agents can navigate via Cua
- Evidence and traceability: Video recordings and condensed step logs for each test to facilitate bug reproduction and fixes
- API-driven: FastAPI service to run a single suite or a full result; simple to integrate with any CI provider
- CI calls the pipeline for the PR
- Pipeline analyzes PR diff + codebase summary to generate scenarios
- Scenarios are saved to Supabase as:
  - `results` (one per PR)
  - `suites` (categories, up to 4)
  - `tests` (2–3 high-value tests per suite where meaningful)
- Pipeline triggers the Agent Runner API (`POST /run-result`) to execute all suites
- Agents interact with a browser VM, record videos, and stream steps back
- The API updates `tests`, then aggregates into `results.overall_result` and `run_status`
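The CI-side hand-off to the Agent Runner can be sketched in Python. This is a minimal sketch: only the `/run-result` path and the `{"result_id": ...}` payload come from this README; sending the request and polling `results.run_status` afterwards is left to the caller.

```python
import json
import urllib.request


def build_run_result_request(endpoint: str, result_id: int) -> urllib.request.Request:
    """Build the POST /run-result call the pipeline sends after writing
    scenarios to Supabase. `endpoint` is the QAI_ENDPOINT base URL."""
    return urllib.request.Request(
        f"{endpoint}/run-result",
        data=json.dumps({"result_id": result_id}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Dispatch it with `urllib.request.urlopen(build_run_result_request(endpoint, result_id))` from any CI job that can reach the API.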
```
backend/
  agents/     # FastAPI service, agent runner, DB integration
  cicd/       # CI pipeline scripts
  tests/      # Python tests
frontend/     # Next.js dashboard
API.md        # Detailed API and DB schema
README.md     # This file
```
- `backend/agents/main.py`: FastAPI service with endpoints:
  - `GET /health`
  - `POST /run-suite` (single suite by `suite_id`)
  - `POST /run-result` (run all suites for a `result_id`)
  - `POST /run-agents` (submit explicit `test_specs`)
- `backend/agents/runner.py`: orchestrates agent runs, recording, and DB updates
  - `run_suites_for_result(result_id)`: assigns containers and runs suites concurrently
  - `run_single_agent(spec)`: executes all tests for a suite
- `backend/agents/database.py`: Supabase helpers for `results`, `suites`, `tests`
- `backend/cicd/qai-pipeline.js`: CI entrypoint
  - Generates scenarios with OpenAI (a rich `summary` for each test)
  - Writes to Supabase (`results`/`suites`/`tests`)
  - Calls `POST /run-result` and verifies the final state
  - Mirrors files to `backend/artifacts/agent/<runId>`
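The concurrent fan-out in `run_suites_for_result` can be sketched with `asyncio`. The container names and the round-robin assignment are assumptions; the real implementation drives browser VMs via the CUA provider rather than this placeholder coroutine.

```python
import asyncio

# Stand-ins for CUA_CONTAINER_1..CUA_CONTAINER_4 (names are assumed)
CONTAINERS = ["cua-1", "cua-2", "cua-3", "cua-4"]


async def run_single_agent(spec: dict, container: str) -> dict:
    """Placeholder for the real agent run: executes all tests for one suite."""
    await asyncio.sleep(0)  # stand-in for browser-VM work
    return {"suite_id": spec["suite_id"], "container": container, "run_status": "done"}


async def run_suites_for_result(suites: list[dict]) -> list[dict]:
    """Assign each suite a container round-robin and run up to 4 concurrently."""
    sem = asyncio.Semaphore(len(CONTAINERS))

    async def run(i: int, spec: dict) -> dict:
        async with sem:
            return await run_single_agent(spec, CONTAINERS[i % len(CONTAINERS)])

    return await asyncio.gather(*(run(i, s) for i, s in enumerate(suites)))
```

`asyncio.gather` preserves input order, so results line up with the suites that produced them even when runs finish out of order.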
- Python 3.11+
- Node.js 18+
- Supabase project
- OpenAI API key
- Agent compute provider (CUA) credentials
- AWS S3 bucket
Create a `.env` in the repo root (consumed by both the backend and cicd):
Backend:
- `SUPABASE_URL` / `SUPABASE_KEY`
- `CUA_API_KEY` (agent provider API key)
- `CUA_MODEL` (e.g., `anthropic/claude-3-5-sonnet-20241022`)
- `CUA_CONTAINER_1`..`CUA_CONTAINER_4` (names/IDs of up to 4 agent containers)
- `AWS_REGION`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `S3_BUCKET_NAME` (optional, for recording uploads)
Pipeline (CI):
- `OPENAI_API_KEY`
- `GITHUB_TOKEN`
- `GITHUB_REPOSITORY`, `GITHUB_EVENT_PATH` (GitHub Actions)
- `QAI_ENDPOINT` (e.g., `http://localhost:8000` for the API)
- `DEPLOYMENT_URL` (target app base URL; used in summaries)
- `AGENT_TIMEOUT` (ms; default `600000`)
See API.md for canonical DDL. Current tables include:
- `results`: `id`, `pr_link`, `pr_name`, `overall_result` (jsonb), `run_status`
- `suites`: `id`, `result_id`, `name`
- `tests`: `id`, `suite_id`, `name`, `summary`, `test_success`, `run_status`, `steps` (jsonb), `s3_link`
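How per-test rows roll up into `results.overall_result` is not fully specified here; one plausible aggregation looks like the following. The `test_success` and `run_status` columns come from the schema, while the keys of the returned jsonb payload and the `"completed"` status value are assumptions.

```python
def aggregate_result(tests: list[dict]) -> dict:
    """Fold per-test rows into a results-level summary (jsonb shape assumed)."""
    finished = [t for t in tests if t.get("run_status") == "completed"]
    passed = sum(1 for t in finished if t.get("test_success"))
    return {
        "total": len(tests),
        "completed": len(finished),
        "passed": passed,
        "failed": len(finished) - passed,
        "all_passed": bool(finished) and passed == len(finished),
    }
```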
- Install dependencies
```bash
# Python backend
cd backend/agents
python -m venv .venv && . .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt

# Node cicd
cd ../cicd
npm ci
```
- Start the API
```bash
# From repo root
python -m uvicorn backend.agents.main:app --host 0.0.0.0 --port 8000
```
- Run the pipeline end-to-end
```bash
# From backend/cicd
node qai-pipeline.js full
# or
node qai-pipeline.js analyze
node qai-pipeline.js test
```
If suites/tests have already been created in Supabase (e.g., via the pipeline), you can run them by `result_id`:
```bash
curl -X POST "$QAI_ENDPOINT/run-result" \
  -H 'Content-Type: application/json' \
  -d '{"result_id": 123}'
```
```bash
cd frontend
npm ci
npm run dev
# open http://localhost:3000
```