
Background
ARC-AGI began in 2019 as a grid-based reasoning benchmark ("v1") intended to test whether AI can infer new rules from just a few examples rather than rely on pattern memorization. By the close of ARC Prize 2024, open-source solvers had plateaued around 53% accuracy on the private evaluation set, while a high-compute run of OpenAI's o3-preview model scored roughly 75–88%, underscoring v1's saturation.
In response, the ARC Prize Foundation unveiled the harder, human-validated "ARC-AGI-2" (v2) on March 24, 2025, and launched the 2025 Kaggle contest with a compute cap of about US$0.42 per task. The headline rule remains unchanged: the first fully open-source system to reach ≥85% on the private v2 evaluation set wins the $1 million Grand Prize. This market tracks only the year in which ARC publicly awards that Grand Prize.
Resolution Criteria
The market resolves to the calendar year in which the ARC Prize Foundation publicly announces and awards any portion of the Grand Prize to one or more teams.
Primary rule: A submission must achieve ≥85% accuracy on ARC-AGI-2 (or its officially designated successor) during a formal annual competition period.
Future changes: If ARC introduces a new test or alters the accuracy threshold, the criterion remains "the first year they actually pay out, or commit in a binding announcement to pay out, the prize labelled as the ARC Grand Prize."
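For concreteness, the primary rule's threshold check amounts to simple arithmetic over the private evaluation set. The sketch below is illustrative only; the function and constant names are hypothetical and do not reflect the Foundation's actual grading code.

```python
# Hypothetical sketch of the primary-rule threshold check.
# GRAND_PRIZE_THRESHOLD and meets_grand_prize_threshold are illustrative
# names, not the ARC Prize Foundation's actual grader.
GRAND_PRIZE_THRESHOLD = 0.85  # >= 85% on the private ARC-AGI-2 eval set


def meets_grand_prize_threshold(solved: int, total: int) -> bool:
    """Return True if solved/total meets or exceeds the 85% bar."""
    if total <= 0:
        raise ValueError("total must be positive")
    return solved / total >= GRAND_PRIZE_THRESHOLD
```

For example, solving 102 of 120 private tasks is exactly 85.0% and would meet the bar, while 101 of 120 (about 84.2%) would not.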
Fine Print
If the resolution criteria are not met by January 1, 2050, the market resolves to "Not Applicable."