METR tracks the maximum "task length" (measured by how long the task takes humans) of tasks that frontier AI systems can complete with 50% success. This task length is viewed as an important measure of difficulty for AI systems, since as of mid-to-late 2025 AI systems seem to struggle with autonomy over long time scales. So far, METR has projected (approximately?) exponential growth in task length.
I (Cole Wyeth) expect that this exponential will become a sigmoid in the near term, while Daniel Kokotajlo (an author of AI 2027) seems to expect exponential or super-exponential growth.
We placed a related 250 USD bet here: https://www.lesswrong.com/posts/cxuzALcmucCndYv4a/daniel-kokotajlo-s-shortform?commentId=gvRoTZuKZKGhbiEWA
I will attempt to interpret any reasonable ambiguities in the above terms in Kokotajlo's favor - as follows:
Kokotajlo wins if METR's frontier 50% task length is greater than or equal to 8 hours for any model which finishes training by March 31st 2027, whether or not that model is publicly available at that time. Unless it looks close, I'll resolve the market (and the bet) 2 or 3 months after that date - open to changing this if Kokotajlo thinks it is too quick.
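For concreteness, here is a minimal sketch of the win condition above as I read it. The names (ModelResult, kokotajlo_wins) and the example entries are hypothetical and purely illustrative; in practice the resolution depends on METR's published measurements and on my judgment in ambiguous cases, not on this code.

```python
from dataclasses import dataclass
from datetime import date

# Terms taken from the bet: 8-hour threshold, training finished by March 31st 2027.
THRESHOLD_HOURS = 8.0
TRAINING_DEADLINE = date(2027, 3, 31)


@dataclass
class ModelResult:
    """Hypothetical record of one model's METR measurement."""
    name: str
    training_finished: date
    task_length_50_hours: float  # METR 50%-success task length, in hours


def kokotajlo_wins(results):
    """True if any model that finished training by the deadline meets the threshold.

    Public availability is deliberately not checked, matching the bet terms.
    """
    return any(
        r.training_finished <= TRAINING_DEADLINE
        and r.task_length_50_hours >= THRESHOLD_HOURS
        for r in results
    )


# Made-up entries, not real measurements.
example = [
    ModelResult("hypothetical-model-a", date(2026, 6, 1), 5.5),
    ModelResult("hypothetical-model-b", date(2027, 5, 1), 12.0),  # trained too late
]
print(kokotajlo_wins(example))
```

Note that a model trained after the deadline does not count even if it clears 8 hours, while a model at exactly 8 hours that finished training on March 31st 2027 does.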
Even if the bet resolves early, I am not bound to pay Kokotajlo until March 2027 (otherwise the bet is not truly 50:50), though I may pay early for convenience at my discretion. However, if Kokotajlo wins I will immediately resolve this market.