AI Safety Research Futarchy: Detection game
Outcomes (with current market probabilities):

  1. (55%) A LessWrong post is produced within 6 months and gains 50 upvotes or more within a month of posting.

  2. (43%) If a LessWrong post is produced, it gains 150 upvotes or more within a month of posting.

  3. (50%) A paper is produced and uploaded to arXiv within 9 months.

  4. (50%) If a paper is produced, it is accepted to a top ML conference (ICLR, ICML, or NeurIPS) within 6 months of being uploaded to arXiv.

  5. (25%) If a paper is produced, it receives 10 citations or more within one year of being uploaded to arXiv.

If chosen, how successful will the research project "Detection game" be?

Project summary: Running a ‘detection game’ to investigate how we can best prompt trusted monitors to detect research sabotage.

Detailed project overview.

Clarifications:

  • Unless otherwise stated, timeframes are measured from when the research begins, i.e. the start of the MARS program, 1st December 2025.

  • Updates to posts and papers count as the same entity as the original for purposes of outcome resolution. For example, if a paper is produced and uploaded to arXiv within 9 months but is edited after this before being accepted at a conference, outcome (4) still resolves YES.

  • Some outcomes are conditional on others: outcome (2) will resolve N/A if (1) resolves NO, and outcomes (4)-(6) will resolve N/A if (3) resolves NO.

  • All outcomes are conditioned on the project being selected and will resolve N/A if it is not (see main post below)

  • Provisionally, the market will close and decisions will be made on Monday the 12th of October.
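The conditionality rules above can be sketched as follows. This is a hypothetical illustration, not Manifold's actual resolution code, and the function and parameter names are invented for the sketch; it covers the five outcomes listed above.

```python
# Sketch of the conditional resolution rules for this market (hypothetical
# names; not Manifold's implementation). Each outcome resolves YES, NO, or
# N/A per the clarifications: everything is N/A if the project is not
# selected; (2) is N/A if (1) is NO; (4)-(5) are N/A if (3) is NO.

def resolve(selected, o1, o2, o3, o4, o5):
    """Map outcome number -> 'YES', 'NO', or 'N/A'.

    o1..o5 are booleans: whether each outcome's criterion was met.
    """
    if not selected:
        # All outcomes are conditioned on the project being chosen.
        return {i: "N/A" for i in range(1, 6)}
    res = {1: "YES" if o1 else "NO",
           3: "YES" if o3 else "NO"}
    # (2) is conditional on (1); (4) and (5) are conditional on (3).
    res[2] = "N/A" if not o1 else ("YES" if o2 else "NO")
    for i, met in ((4, o4), (5, o5)):
        res[i] = "N/A" if not o3 else ("YES" if met else "NO")
    return res
```

For example, if the project is selected, the post gets 50+ upvotes but not 150+, and no paper is produced, then (1) resolves YES, (2) NO, (3) NO, and (4)-(5) N/A.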

For more details on AI Safety Research Futarchy, see here.
