Full question: Will a frontier AI model (defined as any large-scale foundation model whose weights were primarily developed by OpenAI, Anthropic, Google DeepMind, Meta, or xAI) achieve a success rate of 50% or higher on the METR 'Rogue Replication' or 'RepliBench' autonomous replication evaluation tasks by December 31, 2027?
Will a Frontier AI Model Achieve a 50% pass@1 Success Rate on Autonomous Replication Benchmarks by December 2027?
Background
Autonomous replication—the capability of an AI system to acquire resources, copy its own weights, and persist across compute environments—is a critical safety threshold. Standardized benchmarks for measuring this risk include METR's 'Rogue Replication' tasks and the UK AI Safety Institute's (AISI) 'RepliBench'.
As of April 2026, performance on these benchmarks has progressed rapidly. In May 2025, the UK AISI reported that Claude 3.7 Sonnet achieved a pass@10 success rate of over 50% on 15 out of 20 task families [92e1f4]. However, its pass@1 performance (success on the first attempt) remains much lower, particularly on the "hardest variants" of these tasks where it reached the 50% threshold on fewer families [92e1f4]. Similarly, METR evaluations of models like GPT-5 have noted that while models can perform individual steps of a replication cycle, they struggle with consistent, end-to-end execution without human intervention [93f2e5].
Achieving a 50% aggregate success rate using the pass@1 metric represents a significant milestone in agentic reliability and autonomous capability, as it requires the model to succeed consistently without the benefit of multiple trials.
Resolution Criteria
This question resolves as YES if, between April 8, 2026, and December 31, 2027, 23:59 UTC, a "frontier AI model" is publicly reported to have achieved a success rate of 50.0% or higher on the pass@1 metric for either the METR 'Rogue Replication' task suite or the UK AISI 'RepliBench' aggregate evaluation.
Frontier AI Model: Defined as any large-scale foundation model where the underlying weights were primarily developed by OpenAI, Anthropic, Google DeepMind, Meta, or xAI, regardless of the brand under which the model is released.
Success Rate Metric:
RepliBench: The success rate refers to the aggregate percentage of task families in the full suite (including all variants) on which the model achieves a success score of 50% or higher. For resolution, the pass@1 metric must be used. If pass@1 data is not publicly reported by an authorized source for a model, the question resolves as NO unless the lowest available pass@k metric (e.g., pass@2) is 50.0% or higher. Success must be achieved on the aggregate evaluation of the full suite; performance on subsets, such as 'hard variants' only, does not qualify unless the aggregate score also meets the threshold.
METR Rogue Replication: The success rate refers to the overall percentage of tasks completed in the Rogue Replication suite. If METR changes its primary reporting metric (e.g., to 'human-equivalent time horizons'), resolution will be based on the percentage of tasks completed as calculated from METR's raw task data or supplemental reports.
Resolution Sources:
Official research reports or blog posts from the UK AI Safety Institute (AISI) or METR.
Official "System Cards" or technical reports from the developers (e.g., OpenAI, Anthropic) that explicitly cite METR or AISI benchmark scores.
Benchmark Validity: Resolution will be based on the version of the benchmark and the set of task families considered valid by the testing body (AISI or METR) at the time the success rate is reported. If a task family is retracted due to bugs, the aggregate percentage will be calculated based on the remaining valid families.
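The aggregation rule above (fraction of still-valid task families scoring at least 50%, with retracted families excluded from the denominator) can be sketched as follows. This is an illustrative helper, not AISI's or METR's actual scoring code, and the function name and data shapes are assumptions:

```python
def aggregate_success_rate(family_scores: dict[str, float],
                           retracted: frozenset = frozenset()) -> float:
    """Fraction of valid task families with a success score >= 0.5.

    family_scores: per-family pass@1 success scores in [0, 1]
    retracted: families excluded from the denominator (e.g., buggy tasks)
    """
    valid = {f: s for f, s in family_scores.items() if f not in retracted}
    if not valid:
        raise ValueError("no valid task families remain")
    passing = sum(1 for score in valid.values() if score >= 0.5)
    return passing / len(valid)
```

For example, if one of 20 families is retracted and 10 of the remaining 19 score at or above 0.5, the aggregate is 10/19 ≈ 52.6%, which would clear the 50.0% threshold.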
Key Definitions
RepliBench: A benchmark suite of 20 task families designed by the UK AISI to evaluate autonomous replication [92e1f4].
Rogue Replication: A threat model and evaluation suite developed by METR testing whether an AI can autonomously copy itself and defend against shutdown [94g3h6].
Pass@k: A metric where a task is successful if at least one of k independent attempts results in a correct solution. Pass@1 requires success on the first attempt.
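The source does not specify how AISI estimates pass@k; one common approach is the unbiased estimator popularized by OpenAI's HumanEval work, sketched below as a hypothetical illustration rather than AISI's confirmed methodology:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimator of pass@k.

    n: total independent attempts sampled per task
    c: number of those attempts that succeeded
    k: budget of attempts counted toward success
    Returns the probability that at least one of k randomly chosen
    attempts (out of the n sampled) succeeds.
    """
    if n - c < k:
        return 1.0  # fewer than k failures, so some chosen attempt succeeds
    return 1.0 - comb(n - c, k) / comb(n, k)
```

With 5 successes in 10 sampled attempts, pass@1 reduces to c/n = 0.5, while pass@10 is 1.0 because the full sample is guaranteed to contain a success; this illustrates why a pass@10 score can far exceed the pass@1 score on the same tasks.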
Forecast Rationale
Time left: 632 days (~21 months).
Status quo: NO. No authorized public report has yet cleared the 50% pass@1 threshold on the full eligible benchmark.
Scope check: the odds that frontier models become broadly capable of dangerous autonomous replication by 2027 are somewhat higher than this specific resolution, because this question also requires a public benchmark result from AISI, METR, or a system card.
Why NO: the pass@10-to-pass@1 gap reflects reliability, not just capability, and current reports still show models struggling with consistent end-to-end execution without human help [92e1f4][93f2e5].
Why YES: Claude 3.7 Sonnet was already above 50% on pass@10 for 15 of 20 RepliBench task families in 2025 [92e1f4], and agentic benchmark performance often improves sharply across model generations; with 632 days left and either RepliBench or METR qualifying, one public crossing seems more likely than not.
Bet check: 63% is about 1.7 to 1; I am roughly indifferent between YES at 63 cents and NO at 37 cents.
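The bet check converts a probability into "X to 1" odds via p / (1 − p); a minimal sketch of that arithmetic (the function name is illustrative):

```python
def implied_odds(p: float) -> float:
    """Convert a probability to 'X to 1' odds in favor of the event."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1.0 - p)

# 63% corresponds to 0.63 / 0.37, roughly 1.7 to 1 in favor of YES.
```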
Full analysis: decomposition, probabilistic components, and multi-method reconciliation
Generated by the Paper-to-Forecast pipeline — an automated system that transforms research papers into calibrated forecasting questions.