GPT 5.2 METR time horizon

Ṁ1kṀ13k

resolved Feb 5

Answer	Probability
>= 4h	100%, 99.0%
< 2h	0.1%
2h00 - 2h15	0.1%
2h15 - 2h30	0.1%
2h30 - 2h45	0.1%
2h45 - 3h00	0.1%
3h00 - 3h15	0.1%
3h15 - 3h30	0.1%
3h30 - 3h45	0.1%
3h45 - 4h	0.2%

This market will resolve to the highest 50% time horizon, as reported by METR, for the first GPT 5.2 model to appear on METR's graph.

50% time horizon is a measure of AI autonomy based on the length of tasks that AI can do: roughly, it is the time that humans take to complete tasks that an AI system can successfully do 50% of the time. See METR's "Measuring AI Ability to Complete Long Tasks" for the technical definition. Claude 3.7 Sonnet, released in February 2025, was the leading model with a 50% horizon of 59 minutes.
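As a rough illustration of the definition above, here is a minimal sketch that reads a time horizon off a fitted logistic curve. It assumes success probability is modeled as a logistic function of log2(human task time in minutes), which is roughly how METR's paper describes the fit. The function name `time_horizon` and the coefficients are hypothetical: they are chosen only so that the 50% horizon comes out at the 59 minutes quoted above, and they are not METR's fitted values.

```python
import math


def time_horizon(intercept: float, slope: float, success_rate: float = 0.5) -> float:
    """Task length (minutes) at which the fitted curve predicts `success_rate`.

    Assumed model (an illustration, not METR's exact code):
        P(success) = sigmoid(intercept + slope * log2(task_minutes))
    """
    if not 0.0 < success_rate < 1.0:
        raise ValueError("success_rate must be strictly between 0 and 1")
    if slope >= 0:
        raise ValueError("slope must be negative: longer tasks should be harder")
    # Invert the sigmoid, then solve for log2(task_minutes).
    logit = math.log(success_rate / (1.0 - success_rate))
    return 2.0 ** ((logit - intercept) / slope)


# Hypothetical coefficients, picked so the 50% horizon is 59 minutes.
intercept = math.log2(59.0)
slope = -1.0

h50 = time_horizon(intercept, slope, 0.5)
h80 = time_horizon(intercept, slope, 0.8)
print(f"50% horizon: {h50:.1f} min, 80% horizon: {h80:.1f} min")
```

Under this model the 80% horizon is always shorter than the 50% horizon, since a higher required success rate is only reached on shorter tasks.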

Left bounds inclusive, right bounds exclusive.
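To make the bound convention concrete, here is a small sketch that maps a reported horizon in minutes to one of this market's answers. The function name `resolve_bucket` and the sample values are hypothetical; only the answer labels and the inclusive-left, exclusive-right rule come from the market itself.

```python
# Answer labels for the 15-minute buckets between 2h and 4h, in order.
BUCKETS = [
    "2h00 - 2h15", "2h15 - 2h30", "2h30 - 2h45", "2h45 - 3h00",
    "3h00 - 3h15", "3h15 - 3h30", "3h30 - 3h45", "3h45 - 4h",
]


def resolve_bucket(horizon_minutes: float) -> str:
    """Return the answer a given 50% time horizon (in minutes) falls into."""
    if horizon_minutes < 120:
        return "< 2h"
    if horizon_minutes >= 240:
        return ">= 4h"
    # Floor division makes each bucket's left bound inclusive and its
    # right bound exclusive: exactly 135 minutes lands in "2h15 - 2h30".
    return BUCKETS[int((horizon_minutes - 120) // 15)]


# Hypothetical horizons, not actual METR results.
for minutes in (119.9, 120, 135, 239.9, 240):
    print(minutes, "->", resolve_bucket(minutes))
```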

Time horizon could vary based on the set of tasks used to measure it, so this market will be based on the time horizon for the most comprehensive set of tasks reported by METR (as of 2025, largely software and engineering tasks). This will be ambiguous if METR stops publishing time horizons across all of their autonomy tasks and only publishes separate results for different subsets; I might N/A in that scenario.

🏅 Top traders

#	Trader	Total profit
1		Ṁ607
2		Ṁ395
3		Ṁ114
4		Ṁ114
5		Ṁ99


@creator extend this?

The members of the AI Futures Project have posted an update, and they now appear to rely on METR's 80% time horizon graph for their predictions rather than the 50% time horizon graph. This implies that a 50% time horizon is not enough. While I think markets for 50% time horizons are useful, I now think more attention needs to be paid to 80% time horizons.

@MaxLennartson Source: https://www.aifuturesmodel.com/#section-timehorizonandtheautomatedcodermilestone

Can we please get this market extended?

https://x.com/EpochAIResearch/status/1999585226989928650?s=20

Pretty hard to predict, given that this could be Codex or not, at any one of 4 reasoning strengths.

I'd bet over 4 hours if using xhigh Codex, but who knows if that's what is going to be tested.

https://x.com/YafahEdelman/status/2002221434270331288

@jim any idea when we're going to get METR results for these newer models?

Maybe we need a market on that too haha.

@MRME METR seems to prioritise testing OpenAI models, so hopefully not too long. That being said, it has a backlog of Gemini 3 and Claude Opus 4.5. So, IDK.
