left-bracket inclusive
According to credible sources after market creation (July 31 2025)
so what was it
oh, it was a different version of GPT-5
@Bayesian What's the deal for this market? It's doubtful we're going to be seeing any new info.
We have https://openrouter.ai/announcements/gpt-5-is-now-live which says "Earlier iterations of this model were available as Horizon Alpha and Horizon Beta" and https://x.com/OpenRouterAI/status/1953522983084999001 which says "GPT-5 replaces the Horizon Alpha and Beta stealth models on OpenRouter. They were early checkpoints in the GPT-5 family of models."
But looking at the various benchmarks it seems clear that the Horizon models were not GPT-5-nano:
GPT-5-nano benchmarks according to rankedagi.com:
SWE-Bench Verified: 54.7%
Aider Polyglot: 48.4%
GPQA Diamond: 71.2%
AIME 2025 I&II: 85.2%
Svelte Bench: 16.7%
It's hard to find benchmarks for the Horizon models but it seems they do much better. Note that they did well on Svelte Bench specifically: https://x.com/khromov/status/1952995621830369504
GPT-5-mini gets 21.1% on Svelte Bench while GPT-5 gets 78.9%. Note that the published GPT-5 benchmarks mix GPT-5-Thinking and non-thinking results, while the Horizon models were non-CoT, which likely explains the ways in which Horizon models outperformed GPT-5. But Horizon models also underperformed GPT-4o in many areas, so I'm sure the answer is more complex.
But it seems like there is ~0 chance that the Horizon models were GPT-5-nano.
Thoughts?
@Bayesian they have very different outputs at similar parameters and have very different scores, so no
See related market: https://manifold.markets/satchlj/what-model-is-horizon-alpha-on-open