
@SirSalty N/A please.
(Mira thinks this is underdefined, and Ronny on Twitter, who this market was based on, says he thinks it was borderline, so I think this makes the most sense.)
@IsaacKing Doesn't Claude 2 fit this description? Performs better than GPT-3 on pretty much every task, and has a long enough context length?
I think this question is ill-posed and you shouldn't compare losses between models. It is technically well-defined (i.e. somebody with access to OpenAI's dataset could score the prediction accuracy of any model), but in practice people use standardized benchmarks, as the recent GPT-4 release did. And the actual loss numbers you see reported in charts won't be comparable.
I think you should N/A this market as not meaningful, or specify some benchmarks instead.
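To make the comparability point concrete, here is a rough Python sketch. It assumes one particular reason reported losses diverge, namely that models with different tokenizers split the same text into different numbers of tokens; that reason is my assumption and is not spelled out above. All numbers and function names are made up for illustration and don't come from any real model or chart.

```python
import math

def per_token_loss(total_nats: float, n_tokens: int) -> float:
    """Average cross-entropy per token, the number usually plotted in loss charts."""
    return total_nats / n_tokens

def bits_per_byte(total_nats: float, text: str) -> float:
    """Same total loss, normalized by UTF-8 bytes instead of tokens."""
    n_bytes = len(text.encode("utf-8"))
    return total_nats / (math.log(2) * n_bytes)

text = "The quick brown fox jumps over the lazy dog."

# Hypothetical: both models assign the same total log-loss (30 nats) to the
# text, but model A's tokenizer yields 10 tokens and model B's yields 15.
total_nats = 30.0
loss_a = per_token_loss(total_nats, n_tokens=10)
loss_b = per_token_loss(total_nats, n_tokens=15)

# Per-token losses differ even though the models predict the text equally well.
print(loss_a, loss_b)

# A tokenizer-independent normalization gives one value for both models.
print(bits_per_byte(total_nats, text))
```

Even with a normalization like this, you'd still need both models scored on the same text, which is the part only someone with access to the dataset can do.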