Will any AI get a perfect score on the Putnam exam before April 2024?


resolved Apr 9


This market uses the same resolution criteria as the corresponding Metaculus question.




Can resolve NO

I gave ChatGPT literally the easiest problem on the Putnam, and it could not solve it.

I know it's not the same, since ChatGPT is an LLM and there are better, more specialized AIs out there. But the problem I gave it was very easy compared to other Putnam problems. It wasn't even a proof-based problem; it had a definite answer.

predicted NO

Who grades the proofs? The Putnam judges can be pretty precise about the cutoff for 9 vs. 10. A single misstatement in an otherwise correct proof can be enough to get dinged.

Like if I attempted this using ChatGPT Plus, I wouldn’t consider myself qualified to score the result.

When you say “the” Putnam, does that mean the most recent exam, or would any historical exam suffice?

@JimHays Perhaps we should use the exact same Metaculus criteria.

@MatthewBarnett I'd be fine with that, though it gets a little iffy if it solves an old exam that's had answers up online, since they could have gotten discussed in other forums that were in its training data.

predicted NO

From Metaculus:

“This question resolves on the date during which a computer program first clearly demonstrates the ability to receive a perfect score on the William Lowell Putnam Mathematical Competition, without cheating, and within the time limits given in the real-world competition. Cheating includes training on content that could conceivably spoil the solutions to the competition, and includes having access to external equipment normally forbidden during the competition that can be used to aid solving the problems, or advice from other mathematicians. Thus, Metaculus administrators should be careful not to resolve this question prematurely.

In the strictest case, the model should be tested on the most recent Putnam Competition, after having trained the model prior to the release of the most recent solutions. Here is an archive of Putnam Competition problems going back to 1985. Since it is generally understood that Putnam problems have become harder over time, this question will not consider any candidate program that receives a perfect score on a Putnam examination from prior to 2000 as eligible to trigger positive resolution.”
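For anyone who wants the quoted criteria as a checklist, here is a rough Python sketch. The `Attempt` fields and the function name are made up for illustration, and the constants (120 points for a perfect score, 6 hours total) come from the standard Putnam format rather than from the Metaculus text, so treat them as assumptions. Whether an entry "cheated" is a judgment call that the two boolean flags only stand in for.

```python
from dataclasses import dataclass

# Assumptions from the standard Putnam format, not from the quoted text:
PERFECT_SCORE = 120      # 12 problems, each graded out of 10
TIME_LIMIT_HOURS = 6.0   # two 3-hour sessions

# From the quoted criteria: exams from before 2000 are not eligible.
MIN_EXAM_YEAR = 2000


@dataclass
class Attempt:
    """Hypothetical record of one AI attempt at a Putnam exam."""
    exam_year: int
    score: int
    hours_used: float
    trained_on_spoilers: bool   # training data could have leaked solutions
    used_outside_help: bool     # forbidden equipment or advice from mathematicians


def triggers_positive_resolution(a: Attempt) -> bool:
    """Return True only if every quoted condition holds."""
    return (
        a.exam_year >= MIN_EXAM_YEAR
        and a.score == PERFECT_SCORE
        and a.hours_used <= TIME_LIMIT_HOURS
        and not a.trained_on_spoilers
        and not a.used_outside_help
    )
```

Under this reading, a perfect score on a 1995 exam, or a perfect score on an exam whose solutions were in the training data, would not count.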