Will an open-source LLM under 10B parameters surpass Claude 3.5 Haiku by EOY 2025?
82% chance

Resolves YES if a language model is released before January 1st, 2026 that:

1. Has freely accessible weights, meaning the general public can download them and run the model locally, regardless of any additional restrictions.
2. Is explicitly described as having fewer than 10 billion parameters. (If the actual parameter count is under 10B but the stated size was rounded up to 10B or more, this does not count as being under 10B.)
3. Achieves an Arena Score on http://lmarena.ai/leaderboard greater than the score of Claude 3.5 Haiku 2024-10-22, with both scores measured at the same point in time.

In the event that the way Arena Scores are calculated changes significantly, or that specific Haiku model is no longer ranked before EOY 2025 (I would be very surprised if this happens), I will try to find a suitable replacement criterion that traders agree is fair. If no such criterion can be found, this market will resolve N/A.
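As a rough illustration of how the three criteria combine, here is a minimal Python sketch of the resolution check. All model names and scores below are hypothetical placeholders, not real leaderboard data, and no official lmarena API is assumed; in practice both Arena Scores would simply be read off the leaderboard at the same point in time.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    open_weights: bool              # criterion 1: weights are publicly downloadable
    stated_params_billions: float   # criterion 2: parameter count as explicitly described
    arena_score: float              # criterion 3: Arena Score at the snapshot time


def resolves_yes(candidate: Model, haiku_score: float) -> bool:
    """Check the three resolution criteria against one leaderboard snapshot."""
    return (
        candidate.open_weights
        and candidate.stated_params_billions < 10.0
        and candidate.arena_score > haiku_score
    )


# Hypothetical snapshot; both scores are placeholder values taken together.
haiku_2024_10_22_score = 1237.0
candidate = Model(
    name="some-9b-open-model",
    open_weights=True,
    stated_params_billions=9.0,
    arena_score=1250.0,
)

print(resolves_yes(candidate, haiku_2024_10_22_score))  # True under these made-up numbers
```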

Bayesian bought Ṁ1,000 YES

I might be missing something. How could this not happen?

it's mid, it's from months ago, there are over 10 months left in the year, haiku is probably under 100B and under 30B active params, and it's not a reasoning model

@Bayesian hmm yeah this puzzles me, do you think lmsys is not accurately assessing quality here?

@MingCat lmsys is not assessing quality very well, at least. and llms at constant size are getting much better over time

@Bayesian Yeah, I'm hopeful too. Looks like Reka Core is 67B parameters. So the question is basically just how quickly we'll see small open-source models catch up. We've seen AI development show some pretty weird progress overhangs where the tech theoretically should exist, but no one's properly capitalized on it. (It took a while for DeepSeek to come along, for instance.)
