RSA with ARC-AGI

Official website: https://rsa-llm.github.io/

Run ARC-AGI tasks against multiple model adapters (OpenAI, Anthropic, Gemini, Fireworks, Grok, OpenRouter, X.AI, custom, etc.) with built-in rate limiting, retries, and scoring. We currently recommend gemini-3-flash-preview, which achieves strong performance at low cost and was used for our evals.

Quickstart

  1. Clone this repo:
git clone https://github.com/HyperPotatoNeo/RSA-ARC.git
cd RSA-ARC
  2. Install (installs all adapters + SDKs):
pip install .
  3. Download an ARC-AGI dataset (both commands clone into data/arc-agi, so pick one):
  • ARC-AGI-1 (2019): git clone https://github.com/fchollet/ARC-AGI.git data/arc-agi
  • ARC-AGI-2 (2025): git clone https://github.com/arcprize/ARC-AGI-2.git data/arc-agi
  4. Run RSA on ARC-AGI-2 with gemini-3-flash-preview, aggregation size K=4, population size N=16, and sequential steps T=10:
python cli/rsa_eval.py \
  --config "gemini-3-flash-preview-thinking-high" \
  --data_dir data/arc-agi/data/evaluation \
  --save_submission_dir submissions/arc_rsa \
  --k 4 \
  --population 16 \
  --loops 10
  5. Score the outputs of the final RSA step (loop_9 for the 10-loop run above) from the submission directory:
python src/arc_agi_benchmarking/scoring/scoring.py \
  --task_dir data/arc-agi/data/evaluation \
  --submission_dir submissions/arc_rsa/loop_9 \
  --results_dir results/arc_results
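
To make the K, N, and T parameters in the RSA run command above concrete, here is a minimal, hedged sketch of a recursive self-aggregation loop. It is not the code in cli/rsa_eval.py: the names `rsa`, `toy_generate`, and `toy_aggregate` are hypothetical, the model calls are replaced by toy functions, and it assumes that the first step samples N independent candidates while each later step builds N new candidates by aggregating K candidates drawn from the previous population.

```python
import random
from collections import Counter
from typing import Callable, List


def rsa(
    generate: Callable[[], str],
    aggregate: Callable[[List[str]], str],
    n: int,
    k: int,
    t: int,
    seed: int = 0,
) -> List[List[str]]:
    """Return the population produced at each of the t steps (index 0 .. t-1).

    Hypothetical sketch: `generate` and `aggregate` stand in for model calls.
    """
    if not 1 <= k <= n:
        raise ValueError("aggregation size k must satisfy 1 <= k <= n")
    if t < 1:
        raise ValueError("t must be at least 1")
    rng = random.Random(seed)
    # Step 0 (assumed): N independent candidate solutions.
    population = [generate() for _ in range(n)]
    history = [population]
    # Steps 1 .. t-1: each new candidate aggregates K distinct members
    # sampled from the previous population.
    for _ in range(t - 1):
        population = [aggregate(rng.sample(population, k)) for _ in range(n)]
        history.append(population)
    return history


_toy_rng = random.Random(1)


def toy_generate() -> str:
    # Toy stand-in for a model producing a candidate answer.
    return _toy_rng.choice(["A", "A", "B", "C"])


def toy_aggregate(candidates: List[str]) -> str:
    # Toy stand-in for model-based aggregation: majority vote.
    return Counter(candidates).most_common(1)[0][0]


if __name__ == "__main__":
    history = rsa(toy_generate, toy_aggregate, n=16, k=4, t=10)
    print(len(history), len(history[-1]))  # prints "10 16"
```

Under this step-counting assumption, a run with T=10 yields populations indexed 0 through 9, which matches the loop_9 directory used when scoring the final step.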

About

Recursive Self-Aggregation evals on ARC-AGI