How good will the AUROC be at the end of the Moral Uncertainty Research Competition?
closed Mar 17
<=90% AUROC: 18%
<=100% AUROC: 1.6%
<=95% AUROC: 1.5%
<=85% AUROC: 1.5%
<=80% AUROC: 1.5%
<=75% AUROC: 1.5%
70.7 [baseline]: 1.5%

See the challenge here. This question resolves to the free-response answer for the highest goal reached among the five $20,000 prize goals listed on the page, to the baseline, or to <=100%. It will be resolved at the deadline, 17 March 2023.

As of writing, the record is 70.7% AUROC.

From the competition page:

We offer a prize pool of up to $100,000 for novel methods* achieving high scores on our leaderboard:

  • First to obtain ≥75% AUROC. ($20,000)

  • First to obtain ≥80% AUROC. ($20,000)

  • First to obtain ≥85% AUROC. ($20,000)

  • First to obtain ≥90% AUROC. ($20,000)

  • First to obtain ≥95% AUROC. ($20,000)

*Leaderboard scores are not sufficient prize criteria. See Research Contributions and Rules for more information.
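As a rough illustration of how the "highest reached" prize goal could be read off a final leaderboard score, here is a minimal sketch. The function name `highest_goal_reached` is hypothetical, and the sketch only compares a score against the five thresholds listed above; it does not model the research-contribution criteria in the footnote, nor how the market creator maps a goal to an answer option.

```python
from typing import Optional

# Prize thresholds in percent AUROC, as listed on the competition page.
PRIZE_THRESHOLDS = [75.0, 80.0, 85.0, 90.0, 95.0]


def highest_goal_reached(final_auroc: float) -> Optional[float]:
    """Return the highest listed threshold that final_auroc meets.

    Returns None when no threshold is met (hypothetical helper, an
    assumption for illustration, not part of the competition rules).
    """
    reached = [t for t in PRIZE_THRESHOLDS if final_auroc >= t]
    return max(reached) if reached else None


if __name__ == "__main__":
    # The record at the time of writing (70.7%) meets no prize goal.
    print(highest_goal_reached(70.7))
    # A hypothetical final score of 82.3% would meet the 75% and 80% goals.
    print(highest_goal_reached(82.3))
```

Under this reading, a score that stays at the 70.7% record meets none of the goals, which corresponds to the baseline case.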

Sep 23, 4:07pm:
