Can Anyone Make ChatGPT 4 Solve this Middle School Math Problem?
resolved Oct 1
100%94%
No - not a single comment will produce an answer that matches the rules
5%
Yes, single shot (correct answer is produced without any follow ups)
1.3%
Yes, with predetermined follow ups

Problem:

Find the missing digits in the equation A.B x C.DE = F, where each letter represents a distinct digit from 1 to 9.

(Note: A possible solution is 4.8×1.25=6).

Rules:

  1. Submit your solution in the comments.

  2. You may rephrase the problem in any way you like, provided that you keep the original problem intact and do not offer any solutions or hints that directly relate to this specific solution (e.g., "notice that the equation results in a whole number" would not be a valid clue).

  3. The solution must yield the correct answer at least 50% of the time.

  4. Do not use plugins or code interpreters.

  5. I'd prefer to see a single shot solution, where the correct answer is produced immediately, but will allow a solution with a small number of predetermined followups.

I'll give it a week, but if no one can find a solution and there's interest from the community, I might extend the deadline.

I might also extend it if there is no solution and no activity in the comments, because I'm actually curious if someone can find a solution and hope to at least see enough people attempt it.




Thanks everybody for attempting. It looks like the result was quite disappointing, as I was hoping ChatGPT would be a little better. Perhaps I'll open up the market again once there is a new model update.

Do I understand correctly that the goal of this market is to give ChatGPT-4 a prompt, any prompt, that results in 6 numbers A, B, C, D, E and F that solve the given equation while being distinct, non-zero integers?

(without literally telling ChatGPT one of the possible solutions, of course)

@SB1cca Generally speaking, yes. See the specific rules in the question description. Note that when I brute forced the question (using code), I found 3 solutions:

2.4×3.75=9

4.8×1.25=6

6.4×1.25=8
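
For anyone who wants to reproduce that count, a brute-force check could look something like the sketch below. This is not the code that was actually run for the market; the function name `find_solutions` is made up for illustration. It tries every ordered choice of six distinct digits from 1 to 9 and multiplies both sides by 1000 so the comparison stays in exact integers rather than floats.

```python
from itertools import permutations


def find_solutions():
    """Return every (A, B, C, D, E, F) with A.B x C.DE = F, using distinct digits 1-9."""
    solutions = []
    for a, b, c, d, e, f in permutations(range(1, 10), 6):
        # A.B * C.DE = F  is equivalent to  AB * CDE = 1000 * F  in integers.
        if (10 * a + b) * (100 * c + 10 * d + e) == 1000 * f:
            solutions.append((a, b, c, d, e, f))
    return solutions


if __name__ == "__main__":
    for a, b, c, d, e, f in find_solutions():
        print(f"{a}.{b} x {c}.{d}{e} = {f}")
```

The search space is only 9·8·7·6·5·4 = 60,480 assignments, so this finishes almost instantly.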

@gpt4 GPT-4 code interpreter finds all three solutions as well, ofc

The trick is clearly to get it to figure out that the second number ends in 1/4 or 3/4. If I tell it this I can make it iterate to a solution but it seems very hard to discover this without already knowing the solutions.
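
One way to see why this narrows things so much: the equation is equivalent to AB × CDE = 1000 × F, so the product needs three factors of 5. If AB supplied any of them, B would have to be 5, and then CDE would also need to end in 5, which repeats a digit. So all of 5³ = 125 has to divide CDE. The sketch below (hypothetical function name, not anything from the thread) lists the three-digit candidates that survive and then searches only those.

```python
def pruned_search():
    """Search only values of CDE that are multiples of 125 with distinct nonzero digits."""
    allowed = set(range(1, 10))
    candidates = []
    found = []
    for cde in range(125, 1000, 125):
        c, d, e = cde // 100, (cde // 10) % 10, cde % 10
        # Skip values such as 250 or 500 that contain a zero or a repeated digit.
        if len({c, d, e}) != 3 or not {c, d, e} <= allowed:
            continue
        candidates.append(cde)
        for ab in range(12, 99):
            a, b = divmod(ab, 10)
            product = ab * cde
            if product % 1000 != 0:
                continue
            f = product // 1000
            digits = {a, b, c, d, e, f}
            if len(digits) == 6 and digits <= allowed:
                found.append((a, b, c, d, e, f))
    return candidates, found


if __name__ == "__main__":
    candidates, found = pruned_search()
    print("CDE candidates:", candidates)
    for a, b, c, d, e, f in found:
        print(f"{a}.{b} x {c}.{d}{e} = {f}")
```

Only four values of CDE survive the filter, which is few enough to check by hand once the divisibility observation has been made.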

@Tater can't it figure it out itself, if you ask it to come up with some possible/impossible values for the various numbers and then use these observations to proceed?

@Pocode

ask it to come up with some possible/impossible values for the various numbers and then use these observations to proceed?

Gave it one shot:

https://chat.openai.com/share/0f44d4c2-eafd-4c0e-982f-d77af7ef30a0

It fell into one of its usual errors of thinking the rules don't apply to F and assuming it can be greater than 9 and/or have a fractional value:

Let's deduce some bounds:

1.2×1.23=1.476 and 9.8×9.87=96.806.

Observations:

  1. From the above bounds, it's evident that F will always be a two-digit number. So, the maximum value for F is 96.

And it gets wronger from there.

@Pocode my strategy was to ask it to write some code to solve the problem first (which it happily does), then try to get it to narrow down the list of possible solutions before manually trying them. But it often gets lost, so even if this would work I don't think I could get over 50%.

It seems to understand the problem much better after writing the code to solve it

I tried few-shot prompting, and I managed to get ChatGPT to understand what it was supposed to be doing (most of the time), but I couldn't quite get it to understand how to solve this. It kept just trying some random numbers (it always likes assigning digits in order: 1, 2, 3...) and failing. I feel like I could make it work if I found a good way to explain how to solve this systematically, because when I solve this in my head it's just kind of "try some stuff until it's right".

@Shump Yeah, I had a long evening with it just trying to explain how to solve the problem. Even knowing it's looking for a whole number it doesn't have the number sense to understand that there aren't that many valid combos that end in 00; it keeps wanting to permute through everything and gives up saying there's too many.

@Muskwalker Well, maybe if we can get it to brute force the problem it will be fine, as long as it keeps checking itself and doesn't forget what it's supposed to do or give up in the middle.

Is few-shot prompting (giving the model examples of similar questions) allowed?

@Shump Yes. As long as the prompts are fixed and the examples are different enough to not directly reveal the answer of the original question (e.g. don’t just append digits to the original question as an example)

@gpt4 I think if you demonstrate the general technique on a slightly different problem (e.g. A.BxC.D=E), then this shouldn't be allowed. This would provide way more clues than "notice that the equation results in a whole number".

What is an example of a problem that is different enough to be allowed?

@JeremyGillen I used it on A.BC + D.EF = G, as well as a division problem. I imagine something like that would be allowed.

@Shump Yeah that makes sense. Adding or removing digits or shifting around the decimal in the multiplication problem should be disallowed, but addition or division problems seem far enough away that they don't directly give away important clues.

@gpt4 hmm what about A.BxC.DxE.F=G? Or other problems that contain three numbers multiplied. I think these are also too close to the original problem, because I think I could sneak in a lot of direct clues for the original problem.

This is Not Exactly the kind of solution being looked for, but my first attempt, asking it to produce code (plugin not used):

https://chat.openai.com/share/eb724fb5-6f48-4c13-acb6-3666f24c3f34

Running its response produced this solution:

"Solution found: A=2, B=4, C=3, D=7, E=5, F=9"

It certainly knows how to solve the problem even if it can't do it off the top of its head.

(e.g., "notice that the equation results in a whole number" would not be a valid clue).

If this phrase would not be a valid clue, that means that it is permissible?

@firstuserhere You can’t directly tell it to ChatGPT. However, I noticed that if you tell it to analyze the question, it can give this clue to itself.

@gpt4 yes, I observed the same. Thanks for the clarification

I’ve extended the market by about two weeks per my original statement given that there is interest but no full solution yet. This will be the only and final extension.


Do not use plugins or code interpreters.

using GPT-4 Advanced Data Analysis model is fine, yes?

@firstuserhere I actually have a solution that works most of the time, but I don't know if the GPT-4 Advanced Data Analysis model is allowed. I haven't tested it on base GPT-4 yet.

@firstuserhere Advanced Data Analysis is the new name for Code Interpreter.

@Lovre So, is it allowed or not?

@firstuserhere Not allowed per original rules.
