MANIFOLD
Best AI time horizon by August 2026, per METR?
Oct 31
  • <6 hours: 0.9%
  • 6 to 8 hours: 0.8%
  • 8 to 12 hours: 0.8%
  • 12 to 16 hours: 7%
  • 16 to 24 hours: 29%
  • >=24 hours: 62%

This market will resolve to the highest METR 50% time horizon for any AI model released by August 31, 2026, closing after a two-month buffer period. Left bounds inclusive, right bounds exclusive.
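
As a rough illustration of the bound convention, the sketch below maps a reported time horizon (in hours) to one of this market's buckets. The names `EDGES`, `LABELS`, and `bucket_for` are made up for this example and are not part of Manifold or METR tooling; the only rule taken from the description is that left bounds are inclusive and right bounds are exclusive.

```python
import bisect

# Bucket boundaries in hours, taken from the answer list of this market.
EDGES = [6, 8, 12, 16, 24]
LABELS = [
    "<6 hours",
    "6 to 8 hours",
    "8 to 12 hours",
    "12 to 16 hours",
    "16 to 24 hours",
    ">=24 hours",
]


def bucket_for(hours: float) -> str:
    """Return the bucket label for a 50% time horizon given in hours.

    bisect_right counts how many edges are <= hours, so a value exactly
    on an edge falls into the bucket that starts at that edge
    (left-inclusive, right-exclusive).
    """
    return LABELS[bisect.bisect_right(EDGES, hours)]


if __name__ == "__main__":
    for h in (2.28, 6, 15.99, 16, 24):
        print(f"{h} hours -> {bucket_for(h)}")
```

Under this reading, a headline result of exactly 16 hours would land in "16 to 24 hours", not "12 to 16 hours".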

The 50% time horizon is, roughly speaking, the time that skilled humans take to complete tasks that an AI system can successfully do 50% of the time. See METR's "Measuring AI Ability to Complete Long Tasks" for more. As of August 2025, the longest 50% time horizon is GPT-5's 2 h 17 min (about 2.28 hours).

For reference, the buckets in this market correspond to doubling times of roughly:

  • 6 hours: 9 months

  • 8 hours: 7 months

  • 12 hours: 5.3 months

  • 16 hours: 4.5 months

  • 24 hours: 3.7 months

given that GPT-5 has a time horizon of 2.28 hours and was released in early August 2025.
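
The doubling times above can be roughly reproduced with a short calculation. This is a sketch under stated assumptions: it assumes steady exponential growth from GPT-5's 2.28 hours, and it assumes a window of about 12.8 months (early August 2025 to August 31, 2026), which is an estimate not given in the market description. The function names are hypothetical.

```python
import math

BASELINE_HOURS = 2.28  # GPT-5's 50% time horizon, from the market description
MONTHS_ELAPSED = 12.8  # ASSUMPTION: early August 2025 to August 31, 2026


def implied_doubling_time(target_hours: float,
                          baseline_hours: float = BASELINE_HOURS,
                          months: float = MONTHS_ELAPSED) -> float:
    """Doubling time (months) needed to grow from baseline to target."""
    doublings = math.log2(target_hours / baseline_hours)
    return months / doublings


def projected_horizon(doubling_months: float,
                      baseline_hours: float = BASELINE_HOURS,
                      months: float = MONTHS_ELAPSED) -> float:
    """Time horizon (hours) reached after `months` at a fixed doubling time."""
    return baseline_hours * 2 ** (months / doubling_months)


if __name__ == "__main__":
    for target in (6, 8, 12, 16, 24):
        print(f"{target} hours: {implied_doubling_time(target):.1f} months")
```

The inverse function can be used the other way around: under the same assumptions, `projected_horizon(5)` gives a time horizon of roughly 13 hours for a 5-month doubling time.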

Time horizon estimates will vary based on the set of tasks used, so this market will be based on the "headline" result reported by METR. METR currently uses a composite of the HCAST, RE-Bench, and SWAA benchmarks. There is a good chance that they extend this set with harder/longer tasks at some point. If METR no longer publishes a headline result, and their future evals are based on substantially different benchmarks so that it is difficult to compare to their mid-2025 estimates, then this market may become ambiguous and resolve N/A.

bought Ṁ50 YES

@Bayesian free manas here

bought Ṁ50 YES

@jim oh wtf

@Bayesian oops I made some silly bets pre-Opus 4.5/Claude Code takeoff

pre-the straight line continuing to be straight

What did he mean by this /s

Tbf a lot of the change is V1.1 being easier

@Bayesian I mean, I think the best time was ~3.5 hours when I made these bets in November, to be fair

@Bayesian oh really, do you think 1.1 is meaningfully easier? Seems like it's only worth like 10-20-ish minutes?

The long-time-horizon problems that were added have a bigger and bigger influence on the measured time horizon over time, so the measured growth rate is significantly faster, which on my read means it's meaningfully easier

The members of the AI Futures Project have given an update, and they now appear to be relying on METR's 80% time horizon graph for their predictions rather than the 50% time horizon graph. This implies that a 50% time horizon is not enough. While I think markets for 50% time horizons are useful, I now think that more attention needs to be paid to 80% time horizons. I am planning to create markets for 80% time horizons as soon as possible unless someone beats me to it.

@MaxLennartson please don't spam the same thing on a dozen markets but if you do please give a source / link to what you are talking about, i don't know what you are referring to

@Bayesian So sorry! I meant to give a source but I completely forgot. Source: https://www.aifuturesmodel.com/#section-timehorizonandtheautomatedcodermilestone. Let me know if you can access the link. It should work. As for putting my comment on multiple markets, I wanted to make sure everyone could learn about the information I was seeing, but I realize that I should have made sure to give the source. I just posted the source to the other markets so that people can judge for themselves. Again, I am so sorry!

I am just curious what the time horizon length would be if the doubling time is 5 months?

filled a Ṁ50 NO at 19% order

Relevant to various time horizon markets:

https://www.lesswrong.com/posts/4oCh3x6EPHomEbcDJ/nikola-s-shortform?commentId=jfv8LdK3bHCavWSDS

I very roughly polled METR staff (using Fatebook) what the 50% time horizon will be by EOY 2026, conditional on METR reporting something analogous to today's time horizon metric.

I got the following results: 29% average probability that it will surpass 32 hours. 68% average probability that it will surpass 16 hours.

The first question got 10 respondents and the second question got 12. Around half of the respondents were technical researchers. I expect the sample to be close to representative, but maybe a bit more short-timelines than the rest of METR staff.

The average probability that the question doesn't resolve AMBIGUOUS is somewhere around 60%.

bought Ṁ5 YES

The average probability that the question doesn't resolve AMBIGUOUS is somewhere around 60%.

40% likely to be AMBIGUOUS which seems pretty bad
