This market will resolve to the highest 50% time horizon, as reported by METR, for any Gemini 3 model released within a month of the first Gemini 3 announcement.
The 50% time horizon is a measure of AI autonomy based on the length of tasks that an AI can do: roughly, it is the time that humans take to complete tasks that an AI system can successfully do 50% of the time. See METR's "Measuring AI Ability to Complete Long Tasks" for the technical definition. Claude 3.7 Sonnet, released in February 2025, was the leading model with a 50% horizon of 59 minutes.
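
To make the rough definition above concrete, here is a minimal sketch. It assumes success probability is modeled as a logistic function of the log of human task length, which is how I understand METR's approach, but the function names and coefficients below are hypothetical and purely illustrative. They are not METR's code or fitted values.

```python
import math

def success_probability(minutes, intercept, slope):
    """Modeled P(success) as a logistic function of log2(human task minutes)."""
    z = intercept + slope * math.log2(minutes)
    return 1.0 / (1.0 + math.exp(-z))

def fifty_percent_horizon(intercept, slope):
    """Task length in minutes at which the modeled success rate is exactly 50%.

    The logistic equals 0.5 when intercept + slope * log2(t) == 0,
    so t = 2 ** (-intercept / slope).
    """
    if slope >= 0:
        raise ValueError("slope must be negative: success should fall as tasks get longer")
    return 2.0 ** (-intercept / slope)

# Hypothetical coefficients, chosen only so the arithmetic is easy to follow.
intercept, slope = 3.0, -0.5
horizon = fifty_percent_horizon(intercept, slope)  # 2 ** 6 = 64 minutes
print(horizon, success_probability(horizon, intercept, slope))
```

With these made-up coefficients the horizon comes out to 64 minutes, and plugging that length back into the model gives a 50% success rate.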

Left bounds inclusive, right bounds exclusive.
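
As a sketch of what the bounds convention means in practice: a reported value that lands exactly on a boundary goes to the bucket that starts there, not the one that ends there. The bucket edges below are hypothetical placeholders, not this market's actual answer options.

```python
from bisect import bisect_right

def bucket_label(hours, edges):
    """Return the half-open bucket [lo, hi) that contains `hours`.

    `edges` must be sorted ascending. bisect_right makes the left bound
    inclusive and the right bound exclusive.
    """
    i = bisect_right(edges, hours)
    if i == 0:
        return f"<{edges[0]}h"
    if i == len(edges):
        return f">={edges[-1]}h"
    return f"{edges[i - 1]}h-{edges[i]}h"

# Hypothetical edges in hours, for illustration only.
EDGES = [1.5, 2.0, 3.0]

print(bucket_label(1.49, EDGES))  # below the first edge
print(bucket_label(2.0, EDGES))   # exactly on an edge: goes to the bucket starting at 2.0
```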
See also:
/jim/claude-45-opuss-metr50-horizon (jim's version)
/Bayesian/claude-opus-45s-metr50-time-horizon (my version)
/Bayesian/gemini-3s-50-time-horizon-per-metr (this market)
/Bayesian/grok-420s-metr-50-time-horizon
/Bayesian/claude-sonnet-46s-metr-50-time-hori
/Bayesian/grok-5s-50-time-horizon-per-metr
/Bayesian/r2s-50-time-horizon-per-metr
/Bayesian/kimi-k3-thinkings-metr-50-time-hori
Update 2025-12-21 (PST) (AI summary of creator comment): If METR does not evaluate any Gemini 3 model released within a month of the first Gemini 3 announcement, this market will resolve N/A.
Update 2025-12-22 (PST) (AI summary of creator comment): If METR evaluates a general access (GA) version of Gemini 3 instead of the preview version released within a month of the first announcement, this market will resolve N/A.
@traders just in case you are not aware, if METR ends up not evaluating the current version of gemini 3, this market will unfortunately resolve N/A. I think it would be worthwhile to create a version that resolves to whatever version does get evaluated though (edit: https://manifold.markets/Bayesian/gemini-3-pro-metr-50-time-horizon)
@Bayesian I think they will evaluate this model. They implied that they would do so soon. However, it sounds like they are doing other models first. Given that Gemini came out in November and that METR is waiting for general access, they probably won’t evaluate the model until sometime in January. I predict that METR will evaluate GPT 5.2 first followed by Grok 4.1 and then Gemini 3.
@MaxLennartson Gemini 3 Pro is in preview, not GA. If they wait for GA to evaluate, and Google deprecates inference for this preview version, they might be evaluating the GA version rather than the current version (preview).
@Bayesian I personally think this market will resolve N/A because I think METR will wait for general access like they did for Gemini 2.5.
@jim I think the “within a month” thing means any model of Gemini’s released within a month of the first announcement, not METR’s analysis
@bens yes, but it's not guaranteed that any Gemini models which meet this condition will be evaluated by METR.
but if they don't then obviously it would resolve to <1.5h jk it would resolve N/A
@MaxLennartson The currently available model is gemini 3 pro preview. General access is when they remove all modifiers and actually call the model gemini 3 pro in the api and such
@MaxLennartson They’re calling it that to customers to keep it simple, but to the devs they’re calling it gemini 3 pro preview
@Bayesian Do you think that METR will evaluate the ai models that have been released recently including Gemini 3?

