Will the first AI to get IMO gold use formal methods?

Plus

Ṁ3172

2030

62%

chance

ALL

This question resolves when an AI first gets IMO Gold. It resolves YES if this AI uses formal methods in some way, as opposed to an informal-to-informal approach.

For example, an AI as described in the paper "HyperTree Proof Search for Neural Theorem Proving" would satisfy this condition, whereas Minerva would not.

This question is managed and resolved by Manifold.

#AI

#Technical AI Timelines

#Math

#IMO Grand Challenge

#Formal Methods

Get

1,000

and

3.00

10 Comments

24 Holders

65 Trades

Sort by:

predictedNO

Can you specify what you mean with "AI first gets IMO Gold"? According to whom? If the answer is "the AI that completes the IMO Grand Challenge" then this market should be roughly 100%, because that challenge requires a formal proof.

predictedYES

@FlorisvanDoorn I'm going for the definition which is being used by the big 2025 market (which is admittedly a little underspecified itself). What's clearly specified about that market is that there must be a computer system built before an IMO which gets scores gold on that IMO. As for who scores the test and whether the questions are processed into a formal format before being fed to the machine is unclear - I will try to match whatever criteria that market uses. At the very least if the system is entirely NLP based and outputs an obviously correct proof for each of the questions in English, then this could resolve NO.

predictedYES

@BoltonBailey Actually, looking at what Paul Christiano has said in the comment for that market, it seems like he would accept either formal or informal inputs, so I will likely accept the same here.

predictedNO

@BoltonBailey Yes, his answer alleviates my concern here. Thanks for the answer.

I think this is somewhat likely just because miniF2F is encoded as lean

predictedYES

A relevant market:

Related - this market should be considered a subset of this other one:

predictedYES

@BoltonBailey *superset

Dang, and apparently an advisee of Gowers thinks so as well. Why do all the knowledgeable people think differently from me on this?

Daniel Selsam apparently thinks this is less likely than informal.

Related questions

Related questions