Will o1 (not preview) achieve a better score on LiveBench coding than Claude 3.5 Sonnet 10/22?
Resolved NO on Dec 14

Per LiveBench.ai, Claude 3.5 Sonnet scores 67.13 on coding, while o1-preview scores only 50.85.

Resolves when o1 is added to the LiveBench leaderboard.

  • Update 2024-11-12 (PST): Market will resolve based on API results from LiveBench, not manual additions to the leaderboard. (AI summary of creator comment)


I think I can resolve this NO because o1's coding score was manually added to LiveBench. Or should I wait for the API results?
