MANIFOLD
Will AI Research Be Mostly Autonomous By June 1 2027?
24% chance

I have a $1000 bet with Geby Jaff, founder of Archivara, on this market. I am on the NO side, Geby is on the YES side. I will trade this market, but I will resolve according to the votes of the judges. The loser of the bet agrees to pay the winner $1000.

Resolution criteria: A panel of 3 pre-agreed judges reviews the top 20 AI research papers published in December 2026. For each paper, judges assess: "Was the research primarily conducted by giving an AI agent a single high-level research goal (e.g. 'Can narrow finetuning produce broadly misaligned LLMs?'), after which the agent autonomously executed the full pipeline: literature review, hypothesis formation, experiment design, code implementation, running experiments, debugging, analyzing results, and writing up findings, with the human only stepping in to approve, reject, or minimally redirect at natural checkpoints?"

The key test: could the human's total hands-on contribution be condensed into fewer than 10 short natural-language instructions across the entire project? If the human needed to do dozens of back-and-forth iterations, debug alongside the agent, or make substantive intellectual decisions at each step, that's a NO, even if the agent did 80% of the typing.

What YES looks like: "Investigate scaling laws for mixture-of-experts on code tasks" → human walks away → agent comes back hours later with experiments run, results plotted, and a draft paper. Human reviews, says "also try it on math," agent goes again.

What NO looks like: Human uses Claude Code extensively but is constantly steering, rewriting prompts every few minutes, catching errors, redesigning experiments after seeing results, and making real scientific judgment calls throughout.

Resolves YES if judges rule 10+ of 20 papers were produced this way.
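
As a rough sketch of the resolution arithmetic above: the 20-paper count and the 10-paper threshold come from the criteria, but the market text does not say how the three judges' votes are combined for a single paper. The strict-majority rule below, and the function names, are assumptions for illustration only.

```python
PAPERS_REVIEWED = 20      # from the criteria: "top 20 AI research papers"
YES_PAPER_THRESHOLD = 10  # from the criteria: "10+ of 20 papers"


def paper_is_autonomous(judge_votes: list[bool]) -> bool:
    """Return True if this paper counts as autonomously produced.

    ASSUMPTION: a strict majority of judges must vote YES. The market
    description does not specify how per-paper votes are aggregated.
    """
    return sum(judge_votes) * 2 > len(judge_votes)


def resolve_market(votes_per_paper: list[list[bool]]) -> str:
    """Resolve YES if at least 10 of the 20 papers count as autonomous."""
    if len(votes_per_paper) != PAPERS_REVIEWED:
        raise ValueError(f"expected votes for {PAPERS_REVIEWED} papers")
    autonomous = sum(paper_is_autonomous(v) for v in votes_per_paper)
    return "YES" if autonomous >= YES_PAPER_THRESHOLD else "NO"
```

Under this assumed rule, 10 papers with two of three judges voting YES would be enough for a YES resolution, while 9 would not.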


Judges:

@SemioticRivalry
TBD
TBD


15% is way overvalued lol

How can the judges verify how heavily an AI was used in the research process?
How are you determining "top 20 papers"?
Will the authors of the papers be self reporting how much AI was used, or will the judges be inferring from looking at the papers?

sold Ṁ29 YES

Gah, I accidentally bought yes and thought I sold off - hello, gold league!