Grok 3.5 ‘leaked’ benchmark scores end up real?
Jun 3
9% chance

This was shared across Twitter.

Will it be confirmed real or completely made up? If they announce benchmark results that are each no more than 1% below the corresponding leaked result, this market resolves YES.

It also resolves YES if the benchmark results were obtained with pass@1024 or a similar setup.
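As a rough illustration of the threshold rule above, here is a minimal Python sketch. It assumes "1%" means one percentage point on each benchmark's score, and that a leaked benchmark with no announced counterpart does not count as confirmed; both are readings of the description, not rulings from the market creator. The function name, benchmark names, and scores are hypothetical placeholders, since the leaked figures are not reproduced here.

```python
def resolves_yes(leaked, announced, tolerance=1.0):
    """Return True if every leaked score is matched by an announced score
    that is better than (leaked - tolerance).

    Assumption: scores are percentages and `tolerance` is in percentage points.
    Assumption: a benchmark missing from `announced` fails the check.
    """
    for name, leaked_score in leaked.items():
        if name not in announced:
            return False
        if not announced[name] > leaked_score - tolerance:
            return False
    return True


# Hypothetical numbers, for illustration only.
leaked = {"bench_a": 90.0, "bench_b": 80.0}
print(resolves_yes(leaked, {"bench_a": 89.5, "bench_b": 85.0}))
print(resolves_yes(leaked, {"bench_a": 88.0, "bench_b": 85.0}))
```

Note that the check is one-sided: announced scores that beat the leak by any margin still satisfy it, and the sampling setup (pass@1, pass@1024, and so on) plays no role.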


Do the results count as real if they come from juicing the model with extremely high inference-time compute (like o3 preview at >$1000 per query)?

@Damin Yeah!

The original resolution criteria were:

Will it be confirmed real or completely made up? If they announce benchmark results within 1% of each of these except one which can be within 2%, even if they announce the results were for pass@64 or any other not ‘apples to apples’ comparison like that, the market resolves yes.

But when I updated it, I accidentally removed the part that would have answered your question.
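For comparison, the original criteria quoted above could be sketched like this. It assumes "within" means an absolute gap in percentage points in either direction, and that a benchmark missing from the announcement fails the check; the function name and all numbers are hypothetical.

```python
def resolves_yes_original(leaked, announced):
    """Original rule: every announced score within 1 point of the leaked one,
    except at most one benchmark, which may be within 2 points.

    Assumption: gaps are absolute differences in percentage points.
    Assumption: a benchmark missing from `announced` fails the check.
    """
    gaps = []
    for name, leaked_score in leaked.items():
        if name not in announced:
            return False
        gaps.append(abs(announced[name] - leaked_score))
    over_one_point = sum(1 for gap in gaps if gap > 1.0)
    return over_one_point <= 1 and all(gap <= 2.0 for gap in gaps)


# Hypothetical numbers, for illustration only.
leaked = {"bench_a": 90.0, "bench_b": 80.0, "bench_c": 70.0}
print(resolves_yes_original(leaked, {"bench_a": 90.5, "bench_b": 78.5, "bench_c": 70.0}))
```

As in the quoted text, the sampling setup (pass@64 or similar) is deliberately ignored.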
