Background
What is USAMO? The USA Mathematical Olympiad is a two-day proof contest consisting of six problems worth 7 points each (42 points total). It is widely regarded as the hardest high-school math exam in the United States.
Why it matters: Unlike short-answer math benchmarks (e.g. GSM8K), USAMO requires multi-page proofs, which demand creativity, rigor, and long-horizon reasoning similar to the International Mathematical Olympiad.
Benchmarking platform: The MathArena project publishes uncontaminated, post-release leaderboards for new math competitions, grading each AI solution multiple times and reporting pass@1-style accuracies.
State of Play:
DeepSeek-R1-0528: 30.1%
Gemini 2.5 Pro: 24.4%
Human: ≳ 90%
Why this milestone matters
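Proof-level mastery: 95% of 42 points is 39.9, so the threshold means scoring at least 40 points. That requires essentially complete proofs for all six problems (five perfect solutions alone give only 35 points, about 83%), matching elite human contestants.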
Economic & scientific spill-overs: Breakthroughs in formal proof and symbolic reasoning could accelerate research automation in STEM.
Resolution Criteria
This market resolves to the calendar year in which ALL of the following occur:
Score ≥ 95% (at least 40 of 42 points) on any official USAMO held after the model’s public release.
Verification — the result is confirmed by either
a peer-reviewed paper on arXiv, or
an official MathArena leaderboard entry or an equivalently rigorous public board.
Autonomy — unlimited compute/tools are fine; no hidden human guidance.
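The score threshold in the criteria above can be turned into points with a little arithmetic. The snippet below is a minimal sketch, assuming integer per-problem scores from 0 to 7 as described in the Background. The function names are hypothetical and not part of MathArena or any official grading tool, and the way a leaderboard aggregates repeated gradings is not modeled here.

```python
PROBLEMS = 6
MAX_PER_PROBLEM = 7
MAX_TOTAL = PROBLEMS * MAX_PER_PROBLEM  # 42 points


def total_points(problem_scores):
    """Sum the six per-problem scores after checking each is an integer in 0..7."""
    if len(problem_scores) != PROBLEMS:
        raise ValueError(f"expected {PROBLEMS} scores, got {len(problem_scores)}")
    for score in problem_scores:
        if not isinstance(score, int) or not 0 <= score <= MAX_PER_PROBLEM:
            raise ValueError(f"invalid problem score: {score!r}")
    return sum(problem_scores)


def min_points_needed(threshold_percent=95):
    """Smallest integer total that reaches the threshold (integer ceiling division)."""
    return -(-threshold_percent * MAX_TOTAL // 100)


def meets_threshold(problem_scores, threshold_percent=95):
    """Compare total/42 >= threshold/100 using integers to avoid float rounding."""
    return total_points(problem_scores) * 100 >= threshold_percent * MAX_TOTAL


if __name__ == "__main__":
    print(min_points_needed())                    # minimum total for 95%
    print(meets_threshold([7, 7, 7, 7, 7, 5]))    # 40 points
    print(meets_threshold([7, 7, 7, 7, 7, 0]))    # 35 points, five perfect solutions
```

Under these assumptions a run qualifies only if it loses at most 2 of the 42 points, so five perfect solutions and one blank problem fall short.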
Fine print
If no qualifying run is verified by Jan 1, 2033, the market resolves to “Not Applicable.”