MANIFOLD
What score will Anthropic's next Opus model achieve on GPQA?
resolved May 22
< 80: 16%
81 - 83%: 16%
83 - 85%: 16%
85 - 87%: 16%
87 - 89%: 18%
> 89%: 16%

https://www.theinformation.com/articles/anthropics-upcoming-models-will-think-think

The Information is reporting that Anthropic will release Claude Opus with reasoning in the next few weeks. This market resolves when that model is released (regardless of its version number) and benchmarked on GPQA. I will count only pass@1 results, not the "parallel test-time compute" numbers that were additionally reported for 3.7 Sonnet.

If the model's score falls exactly on the boundary between two ranges, the higher of the two ranges will resolve YES.
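
The boundary rule can be illustrated with a short sketch. This is only an illustration, not anything the market itself runs: the function name `resolve_bucket` is hypothetical, and because the listed answers leave scores from 80 up to (but not including) 81 uncovered, the sketch returns `None` for that gap rather than guessing how it would resolve.

```python
from bisect import bisect_right

# Answer labels copied from the market. EDGES holds the boundaries shared
# by two adjacent ranges (83, 85, 87, 89).
LABELS = ["81 - 83%", "83 - 85%", "85 - 87%", "87 - 89%", "> 89%"]
EDGES = [83.0, 85.0, 87.0, 89.0]


def resolve_bucket(score):
    """Return the answer label for a pass@1 GPQA score given in percent.

    Hypothetical helper: a score exactly on a shared edge goes to the
    higher range, as the resolution criteria state.
    """
    if score < 80.0:
        return "< 80"
    if score < 81.0:
        # Assumption: the listed answers do not cover [80, 81), so this
        # sketch leaves that case unresolved.
        return None
    # bisect_right counts the edges that are <= score, so a score equal
    # to an edge is placed in the range above that edge.
    return LABELS[bisect_right(EDGES, score)]
```

For example, under these assumptions `resolve_bucket(83.0)` gives "83 - 85%", while `resolve_bucket(82.9)` gives "81 - 83%".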
