Will Open AI release a model that can reliably compute a 20 digits multiplication correctly in 2025?
10
100Ṁ1350
resolved Dec 18
Resolved
YES
  • The model must correctly compute the product of two randomly chosen 20-digit numbers with at least 90% accuracy, meaning it may produce incorrect results in at most 10 out of 100 independent trials.

  • The model must perform this computation without executing code, scripts, or relying on external computational tools.

  • A YES resolution will occur immediately upon verification that a model meets these criteria. If no such model is verified by the end of the year, the resolution will be NO.

  • Update 2025-12-17 (PST) (AI summary of creator comment): - The creator is conducting a test with 10 pairs of random numbers to determine resolution.

    • If 9/10 are correct, the market will resolve YES, otherwise it will resolve NO.

Market context
Get
Ṁ1,000
to start trading!

🏅 Top traders

#NameTotal profit
1Ṁ126
2Ṁ62
3Ṁ51
4Ṁ10
Sort by:

I've resolved as a Yes after trying it on 11 different random 20 digit pairs, getting a correct answer on each. I've used ChatGPT 5.2 "Extended Thinking".

I'm happy to share the links, but it seems so reliable that I imagine anyone can replicate it. I've attached a screenshot that confirms that coding tool was off

@creator thoughts on this? https://chatgpt.com/share/693b0ec2-b9d0-8000-8d38-e221cdea2b60

Your market terms aren't perfectly clear to me, sorry.

@MRME looks like it’s using Python, so it’s against the terms. You can turn off code execution in setting if you want to test it without Python.

bought Ṁ300 YES

@gpt4 please go ahead and test it. Looks like it is working to me - https://chatgpt.com/share/69404e43-fe88-8000-99dc-5ceecb1bf06b

@spiderduckpig I've turned off "coding" in ChatGPT and have managed to replicate it - i.e. I got the right answer on your first example.

I'm going to generate 20 random numbers and multiply 10 pairs. If we get 9/10 correct I'll resolve the market as a Yes, otherwise as no.

@gpt4 That's surprising, I had used GPT 5.2 Auto and it had not worked out. Maybe I had to set it to Thinking, I assume that this sort of problem can easily be solved as long as enough CoT is afforded to the model, and maybe you need Thinking for it. I also believe that some models have access to an internal calculator in addition to a code interpreter, as a way to save on computation, though I am not sure if 5.2 has one.

@spiderduckpig guess I gave this one up too easily!

@MRME Yeah, fwiw I did use 5.2 for each of those chats (you can ask them to verify), but they were 5.2 Auto, I assumed that just allocates enough CoT, I guess not

@spiderduckpig I don’t like being wrong but at least I learned something. Thanks for market @gpt4

@spiderduckpig yes, manually setting Thinking, then choosing extended thinking (which is not available on the mobile interface) generally makes the model significantly smarter.

© Manifold Markets, Inc.TermsPrivacy