This will be evaluated according to the AI Safety Levels (ASL) v1.0 standard defined by Anthropic here. See this market for the criteria used to determine whether a system is ASL-3 for the purposes of this market.
Unlike in that market, the relevant date is that of the first public report containing credible evidence that a model is ASL-3. This may be later than the date the model was trained, and earlier than the date a consensus forms that the evidence should count for ASL-3.
Feel free to add new answer choices. Valid choices must be in the format YYYY QQ.
Update 2025-05-25 (PST) (AI summary of creator comment):
- The six-month countdown period from the referenced market will be used.
- If Anthropic makes a provisional ASL-3 assessment, the market will, by default, resolve to an answer of Q2 (for the relevant year).
- This default resolution to Q2 is contingent on Anthropic not retracting the provisional assessment within that six-month period.