Will OpenAI's next major LLM (after GPT-4) solve more than 2 of the first 5 new Project Euler problems?

Question

Background: Project Euler is a series of challenging mathematical and computer programming problems designed to be solved with computer algorithms. Solving a problem usually takes more than mathematical insight; it often requires designing and implementing an efficient algorithm. The problems therefore serve as a benchmark of LLM problem-solving in fields that demand a high level of mathematical and algorithmic understanding.

Question: Will the next major release of an OpenAI LLM solve more than 2 of the first 5 new Project Euler problems released after the model’s official public debut?

Resolution Criteria: For this question, the "next major release of an OpenAI LLM" is defined as the next model from OpenAI that satisfies at least one of the following criteria:

It is consistently called "GPT-4.5" or "GPT-5" by OpenAI staff members.

It is estimated to have been trained using more than 10^26 FLOP according to a credible source.

It is considered to be the successor to GPT-4 according to more than 70% of my Twitter followers, as revealed by a Twitter poll (if one is taken).

This question will resolve to "YES" if this LLM successfully solves more than 2 of the first 5 Project Euler problems released after its launch, according to the first single public document or comment that describes an attempt to get the LLM to solve each of these problems under the following rules. For each problem, the LLM is allowed up to three attempts to provide a correct solution, with a total time limit of 3 hours of computational "thinking" across all attempts. Attempts in which a network error results in a partial response will not be counted. Resolution will rely on public documentation or a credible report detailing the model's performance on these specific problems.
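The counting rule above can be sketched in a few lines of Python. This is only an illustrative reading of the criteria, not an official resolution procedure: the names `Attempt`, `problem_solved`, and `resolves_yes` are hypothetical, and the sketch assumes that the 3-hour limit applies per problem and that attempts interrupted by a network error consume neither an attempt nor any thinking time.

```python
from dataclasses import dataclass
from typing import List

MAX_ATTEMPTS = 3          # up to three attempts per problem
MAX_THINKING_HOURS = 3.0  # total across all counted attempts (assumed per problem)
PROBLEMS_CONSIDERED = 5   # the first 5 problems released after launch


@dataclass
class Attempt:
    """Hypothetical record of one attempt at one problem."""
    correct: bool
    thinking_hours: float
    network_error: bool = False  # partial response caused by a network error


def problem_solved(attempts: List[Attempt]) -> bool:
    """Return True if a correct answer arrives within the attempt and time limits."""
    used = 0
    hours = 0.0
    for attempt in attempts:
        if attempt.network_error:
            continue  # assumption: such attempts are ignored entirely
        if used >= MAX_ATTEMPTS:
            break
        used += 1
        hours += attempt.thinking_hours
        if hours > MAX_THINKING_HOURS:
            return False  # time budget exhausted before a correct answer
        if attempt.correct:
            return True
    return False


def resolves_yes(per_problem_attempts: List[List[Attempt]]) -> bool:
    """YES requires more than 2 solved among the first 5 new problems."""
    considered = per_problem_attempts[:PROBLEMS_CONSIDERED]
    solved = sum(problem_solved(attempts) for attempts in considered)
    return solved > 2
```

Under this reading, three solved problems out of five is the minimum for a "YES" resolution; two or fewer resolves "NO".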

Manifold Markets · Answer

Likely — Manifold Markets prediction market estimates a 63% chance (20 traders, as of Dec 12, 2025).
