Will the next OpenAI model be closer to Fable-class than to Opus-class?

Ṁ100Ṁ2.2k

resolved Jul 10

Resolved

YES


Presumably GPT-5.6, or maybe GPT-6, depending on how they brand it. 5.6 seems to be the rumour right now.

Resolves based on my personal eyeballing of mostly benchmark results / vibes.

Resolves YES if it's distinctly Fable-class. Resolves NO if it's distinctly Opus-class.

Maybe resolves to 50% if it seems genuinely in the middle of the two.

N/A if there is no new release by close.





@traders I expect I will wait until the full benchmark suite for 5.6 Sol is released. But at the moment this is looking very likely to resolve yes.

@eapache How do you conclude that? When I look at METR, GPT-5.6 Sol sits at Opus level with 11.3 hours @50%, substantially below Mythos Preview.

(That's with the default methodology of counting reward hacking attempts as failures. METR notes that GPT-5.6 Sol's "cheating rate was higher than any public model we have evaluated". If other benchmarks show better values in the headline numbers, I'd wonder how much of that is cheating.)

@David6LScg The (very limited) benchmarks published on https://openai.com/index/previewing-gpt-5-6-sol/ sit around Fable/Mythos level rather than Opus level. I hadn’t seen the METR results, but they do add some uncertainty.

@eapache See the METR report here: https://metr.org/blog/2026-06-26-gpt-5-6-sol/

50% success time horizons:

GPT-5.6: 11.3 hours

Opus 4.6 (Feb 2026): 12.0 hours

Mythos Preview (Feb 2026): 17.4 hours

Note that Opus 4.8 and Mythos 5/Fable 5 have not been measured yet. The figures above are for older Claude versions from February, and even there Opus scores above GPT-5.6 Sol, while Mythos Preview is in the >16 hour range, which METR notes is saturated for their testing suite.

If OpenAI has not controlled its benchmark scores for GPT-5.6's egregious reward hacking, those scores might reflect capability plus propensity to cheat rather than capability alone.
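
As a rough illustration of the "closer to which class" comparison on the METR figures quoted above, here is a minimal Python sketch. It uses only the three 50% time horizons from this thread, which cover older Claude versions and a single benchmark, so it is not how the market is resolved (that is the creator's eyeballing). The helper name `nearest_reference` and the optional log-scale comparison are assumptions made for illustration.

```python
import math

# 50% success time horizons in hours, as quoted in the thread.
gpt_5_6_sol = 11.3
references = {
    "Opus 4.6": 12.0,
    "Mythos Preview": 17.4,
}

def nearest_reference(score, refs, log_scale=False):
    """Return the name of the reference whose score is nearest to `score`.

    With log_scale=True the distance is measured between logarithms,
    i.e. by ratio rather than by absolute difference in hours.
    """
    transform = math.log if log_scale else float
    return min(refs, key=lambda name: abs(transform(score) - transform(refs[name])))

# Compare on both scales; hypothetical helper, not a resolution rule.
print(nearest_reference(gpt_5_6_sol, references))
print(nearest_reference(gpt_5_6_sol, references, log_scale=True))
```

On this one metric the comparison is not close on either scale, which is the point of the objection above. Other benchmarks may point the other way.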