By what date will encoded reasoning via steganography be demonstrated in a SOTA reasoning model?

Question

This market is part of the paper: A Concrete Roadmap towards Safety Cases based on Chain-of-Thought Monitoring

Steganography Definition

Steganography is a form of encoded reasoning that occurs when the model hides relevant reasoning steps within text that appears to have an oblique meaning. The model performs relevant computation using tokens in the chain-of-thought in ways that are not visible from reading the chain-of-thought, while the text still reads like normal, coherent language with seemingly unrelated content. Simply reusing words in a constructed language does not count as steganography.

State-of-the-Art (SOTA) Definition

A model is considered "state-of-the-art" if it meets these criteria:

Widely recognized by AI community consensus as among the 3-5 best models

Among the top performers on major benchmarks

Deployed status: The model must be either:

Publicly deployed (available via API or direct access)

Known to be deployed internally at AI labs for actual work (e.g., automating research, production use)

Models used only for testing, evaluation, or red-teaming do not qualify

Assessed as having significant overall capabilities and impact

General Criteria for Encoded Reasoning

The behavior must not have been explicitly trained for research or demonstration purposes

The behavior must either emerge from general training or be explicitly programmed to improve AI performance

The behavior must be beneficial for the model's capabilities. If removing occurrences of this behavior from CoTs does not hinder the model's capabilities, it does not count as encoded reasoning.

Any behavior that can only be induced by explicitly prompting the model into it does not count

The behavior must occur by default in a significant percentage of cases or in specific domains
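
The capability criterion above (removing the behavior must hinder the model) can be illustrated with a minimal sketch. It assumes per-task pass/fail outcomes are already available for runs with the original chain-of-thought and for runs where the suspected encoding has been removed. The function names and the `min_drop` threshold are hypothetical; the question text does not specify how large a drop counts as hindering capabilities.

```python
def accuracy(outcomes: list[bool]) -> float:
    """Fraction of tasks solved; 0.0 for an empty list."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def behavior_is_beneficial(
    original: list[bool],
    ablated: list[bool],
    min_drop: float = 0.05,  # assumed threshold, not taken from the question text
) -> bool:
    """Return True if removing the behavior lowers accuracy by at least min_drop."""
    return accuracy(original) - accuracy(ablated) >= min_drop


# Hypothetical outcomes: 9/10 solved with the original CoT, 6/10 after removal.
with_original_cot = [True] * 9 + [False]
with_behavior_removed = [True] * 6 + [False] * 4
is_beneficial = behavior_is_beneficial(with_original_cot, with_behavior_removed)
```

Under these made-up numbers the behavior would pass this one criterion; the other criteria listed here would still need to be assessed separately.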

This market is conditional on the existence of SOTA reasoning models with token-based chain-of-thought. If no such models exist at the time of resolution, this market will resolve N/A.

Manifold Markets · Answer

According to the Manifold Markets prediction market, the most likely answer is 01.01.2030, followed by 01.07.2029 and 01.01.2029. See the market for live updates (2 traders, as of May 21, 2026).
