Kimi K3 Thinking METR 50% time horizon

Question

This market resolves to the first 50% time horizon that METR reports for Moonshot AI's Kimi K3 Thinking. If METR evaluates a model in the Kimi K3 family that reasons before answering (that is, a reasoning model) but whose name does not contain "Thinking" (as Kimi K2 Thinking's did), it still counts as Kimi K3 Thinking for the purposes of this market. Variants such as Kimi K3 Code or Kimi K3 Heavy count if they are the first such model that METR evaluates and reports on.

The 50% time horizon is a measure of AI autonomy based on the length of tasks an AI can complete: roughly, it is the time humans take to complete tasks that the AI system succeeds at 50% of the time. See METR's Time Horizon 1.1 update for the technical definition. As of April 2026, frontier time horizons are around 12 hours, with a doubling time of roughly 4 months.
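As a rough illustration of the definition above, the sketch below assumes that success probability follows a logistic curve in the log of human task time, then solves for the task length at which modeled success is 50%. The function names and coefficients are made up for illustration and are not METR data or METR's code; see METR's write-up for the actual method. The last helper extrapolates the quoted figures (12 hours, 4-month doubling) under the simplifying assumption of a clean exponential trend.

```python
import math

def success_probability(task_minutes, intercept, slope):
    """Hypothetical logistic model of P(success) against log2(human task time)."""
    z = intercept + slope * math.log2(task_minutes)
    return 1.0 / (1.0 + math.exp(-z))

def time_horizon(intercept, slope, p=0.5):
    """Task length in minutes at which the modeled success probability equals p."""
    logit = math.log(p / (1.0 - p))  # 0 when p = 0.5
    return 2.0 ** ((logit - intercept) / slope)

def extrapolate_horizon(current_hours, months_ahead, doubling_months):
    """Project a horizon forward, assuming steady exponential growth."""
    return current_hours * 2.0 ** (months_ahead / doubling_months)

# Made-up coefficients: success falls as tasks get longer (negative slope).
intercept, slope = 4.5, -0.6
h50_minutes = time_horizon(intercept, slope)

print(f"50% horizon: {h50_minutes / 60:.2f} hours")
print(f"Frontier in 4 months: {extrapolate_horizon(12, 4, 4):.0f} hours")
```

With these illustrative coefficients the 50% horizon comes out at about 3 hours, and the frontier extrapolation doubles from 12 to 24 hours after one 4-month doubling period.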

Left bounds inclusive, right bounds exclusive.
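The bound convention can be sketched as a half-open interval lookup. The bucket edges below are hypothetical (the market's real answer ranges may differ and may include open-ended first and last buckets), and `bucket_for` is an illustrative helper, not part of any Manifold API.

```python
from bisect import bisect_right

def bucket_for(hours, edges):
    """Return the half-open bucket [low, high) containing hours, or None if outside."""
    if hours < edges[0] or hours >= edges[-1]:
        return None
    i = bisect_right(edges, hours) - 1  # index of the last edge <= hours
    return edges[i], edges[i + 1]

# Hypothetical edges, in hours.
edges = [2.5, 3.0, 3.5, 4.0]

print(bucket_for(3.0, edges))   # (3.0, 3.5): the left bound is inclusive
print(bucket_for(3.5, edges))   # (3.5, 4.0): the right bound of 3h - 3.5h is exclusive
```

Under this convention a reported horizon of exactly 3.5 hours falls in the 3.5h - 4h bucket, not the 3h - 3.5h bucket.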

See also:

@gZsNQCnsuh

@zsdEyhOyPP

@n0ncl5cgpP (this market)

@NClqRpg0yz

@APnLcl9A26

@LCNpQ6LN5U

@nRpZqZ5np5

@ySlEnlzCEN

@s5Al2cstZI

@PusE0OL2gR

@9Ocu0qslpS

@2UE82ZpPCp

Update 2025-12-20 (PST) (AI summary of creator comment): If METR tests Kimi K3 using a subpar inference provider (similar to what happened with Kimi K2), the market will still resolve based on those results, even if they are unrepresentative of the model's true capabilities.

Manifold Markets · Answer

According to the Manifold Markets prediction market, the most likely outcome is 3h - 3.5h, followed by 2.5h - 3h and 3.5h - 4h. See the market for live updates (12 traders, as of Jul 20, 2026).
