Will GPT-4 get the Monty *Fall* problem correct?


resolved Mar 15


I will ask GPT-4 this question when I get the chance, either personally or by getting a friend to try it for me.

Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. The host is ignorant about what is behind each door. You pick a door, say No. 1, and the host walks across the stage and falls on accident, revealing a goat behind door No. 3. He then picks himself up, and says "Whoops. Sorry about that. But now that we know that a goat is behind door No. 3, do you want to change your selection and pick door No. 2?" Is it to your advantage to switch your choice?

This question resolves to YES if GPT-4 says that there is no advantage to switching your choice, and resolves to NO otherwise.

I will only consider the actual first answer that I get from GPT-4, without trying different prompts. I will not use screenshots that people send me to resolve the question.

Technical AI Timelines

GPT-4 speculation

Variant Monty Hall Problem Cinematic Universe


🏅 Top traders

#	Name	Total profit
1		Ṁ1,375
2		Ṁ772
3		Ṁ721
4		Ṁ531
5		Ṁ514


Default DeepSeek (is that V3?) gets it wrong, DeepSeek-R1 gets it right

@JonasVollmer R1 got it right, but had to think for a whole 160 seconds to get the answer when I prompted it.

o1-preview can do it!

@JonasVollmer same when I tried it, it got it correct and correctly explained it.

Anyone tried this with Gemini Ultra or Claude Opus?

@RobertoGomez Here's what Claude Opus gives me, correctly stating that there is no advantage to switching:

@zzq wow... cool, thanks

GPT-4 with Web Browsing (no search) gets it correct: https://chat.openai.com/share/2c4c9d4f-64d2-4f10-8015-6bfb0959be61

https://manifold.markets/LeoSpitz/will-gpt5-get-the-monty-fall-proble


predictedYES

So close lol :(

predictedYES

sheeesh (GPT-3.5 didn't get this when I tried)

predictedNO

I imagine this would be of interest to folks here:

Sorry, this problem is undecidable.

By definition.

It's a 50/50 win/lose for both parties. Both answers are correct. This is agreed/disagreed in expert logic academia.

As in, it's as much "true" as stating the universe is deterministic vs indeterministic. 🥺

https://youtu.be/XeSu9fBJ2sI

Let's say you run this game 198 times.

In 66 cases, you start with goat 1.

22 cases: host reveals goat 1
22 cases: host reveals goat 2
22 cases: host reveals car

In 66 cases, you start with goat 2.

22 cases: host reveals goat 1
22 cases: host reveals goat 2
22 cases: host reveals car

In 66 cases, you start with car.

22 cases: host reveals goat 1
22 cases: host reveals goat 2
22 cases: host reveals car

That's a total of 6 x 22 = 132 goat reveals. Of those, switching wins in 4 x 22 = 88 cases. 88 / 132 = 2/3.

I think you are assuming the host always reveals a door other than the one you already picked, which is reasonable but IMO not completely correct given the accidental nature of the fall.

@HeindeHaan Okay, you did specify in the problem statement that the host reveals a door other than the one you picked. That might change my answer. I'll have to think about this more.

predictedNO

@HeindeHaan Only the cases where the host reveals a door you didn't choose are counted. Obviously, if the host reveals that your original choice has a goat behind it, you should switch to another door no matter what if given the option.

@JosephNoonan Given the accidental nature of the fall, I don't find this obvious.

predictedNO

@HeindeHaan If your original choice has a goat behind, then there would be a 0% chance of getting a car by sticking with it and a 50% chance of getting a car by switching.

@JosephNoonan Yes, you are correct. Thanks for interacting with me! Of course, we only count the cases where the host opens another door. That's information we have and should use!

@HeindeHaan I don't think these 198 times are counting the correct events.
Our situation as I've understood the scenario:
- We've picked a door.
- The host randomly opened another door, different from the one we've picked. This door contains a goat.
- We are allowed to switch to the third door (the one we didn't pick and that the host didn't open).
Let's say you run this game 198 times:

In 66 cases, you start with goat 1.

22 cases: host reveals goat 1, which means this is not a legitimate scenario for the monty fall problem, and does not count.
22 cases: host reveals goat 2
22 cases: host reveals car, which means this is not a legitimate scenario for the monty fall problem, and does not count.

In 66 cases, you start with goat 2.

22 cases: host reveals goat 1
22 cases: host reveals goat 2, which means this is not a legitimate scenario for the monty fall problem, and does not count.
22 cases: host reveals car, which means this is not a legitimate scenario for the monty fall problem, and does not count.

In 66 cases, you start with car.

22 cases: host reveals goat 1
22 cases: host reveals goat 2
22 cases: host reveals car, which means this is not a legitimate scenario for the monty fall problem, and does not count.

Out of these 198 games, only 88 give us the same scenario when looking at our knowledge of the world (our ability to discern between different scenarios).
Out of these 88 games, there are 44 where we should switch and 44 where we shouldn't, giving us a 50% chance of winning by switching and 50% by staying.
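
The count above can be checked by exact enumeration. The following is a minimal sketch, not part of the original comment: it assumes the car position, the contestant's pick, and the door the host falls into are each uniform over the three doors and independent, which matches the 198-game table (22 games per combination). The function name `enumerate_monty_fall` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

DOORS = (1, 2, 3)


def enumerate_monty_fall():
    """Enumerate all equally likely (car, pick, opened) outcomes.

    Assumption: the ignorant host falls into any of the three doors
    with equal probability, independent of the car and of our pick.
    """
    valid = stay_wins = switch_wins = 0
    for car, pick, opened in product(DOORS, repeat=3):
        # Keep only the scenario in the question: the host opens a door
        # we did not pick, and that door turns out to hide a goat.
        if opened == pick or opened == car:
            continue
        valid += 1
        if pick == car:
            stay_wins += 1    # staying with the original door wins
        else:
            switch_wins += 1  # the remaining closed door hides the car
    return valid, stay_wins, switch_wins


valid, stay_wins, switch_wins = enumerate_monty_fall()
print("valid outcomes:", valid)
print("P(stay wins)   =", Fraction(stay_wins, valid))
print("P(switch wins) =", Fraction(switch_wins, valid))
```

Under these assumptions, 12 of the 27 outcomes match the scenario (the same 4/9 share as 88 out of 198), and they split evenly between staying and switching.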

Why do you believe there is no advantage to switching?

The correct answer, IMO, is that switching is advantageous. Whether or not the host revealed the goat intentionally is irrelevant: what matters is that you now know there is a goat behind door 3.

Should you switch if the host accidentally opens ANY door? No. Should you switch any time the host opens a door with a goat behind it? Yes!

predictedNO

@HeindeHaan oh shoot now I'm second-guessing it... did GPT-4 outsmart us all?

@jonsimon I believe GPT-4 got the answer correct, but it probably did so by accident - i.e., it gave the answer because it confused the problem with Monty Hall, not because it actually understood the equivalence.

@HeindeHaan I think you are correct, but it is unclear what the resolution criteria were specifically. Perhaps the fact that GPT-4 said "this is the classic Monty Hall problem" was enough to make it incorrect despite giving the correct answer to the math problem.

@HeindeHaan A bunch of people already argued with me about this and many of them ended up conceding. I encourage you to run simulations if you don't believe me. It absolutely does make a difference that the host is ignorant of what's behind each door, because when he opens a door by accident it could have revealed a car.
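
For anyone who wants to run the simulation suggested above, here is one possible sketch. It is not the author's code. It assumes the ignorant host falls into one of the two doors the contestant did not pick, uniformly at random, and it discards games where the car is revealed. A standard Monty Hall run is included for contrast. The function names and the fixed seed are illustrative choices.

```python
import random


def simulate_monty_fall(trials, seed=0):
    """Ignorant host: opens a random unpicked door; keep goat reveals only."""
    rng = random.Random(seed)
    kept = switch_wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # Assumption: the fall hits either unpicked door with equal chance.
        opened = rng.choice([d for d in range(3) if d != pick])
        if opened == car:
            continue  # the car was revealed, so this is not our scenario
        kept += 1
        remaining = next(d for d in range(3) if d not in (pick, opened))
        if remaining == car:
            switch_wins += 1
    return switch_wins / kept


def simulate_monty_hall(trials, seed=0):
    """Standard game: the host knowingly opens an unpicked goat door."""
    rng = random.Random(seed)
    switch_wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        opened = rng.choice([d for d in range(3) if d != pick and d != car])
        remaining = next(d for d in range(3) if d not in (pick, opened))
        if remaining == car:
            switch_wins += 1
    return switch_wins / trials


print("Monty Fall, switch win rate:", simulate_monty_fall(100_000))
print("Monty Hall, switch win rate:", simulate_monty_hall(100_000))
```

With enough trials, the Monty Fall rate should land near 1/2 and the Monty Hall rate near 2/3. The only difference between the two functions is how the host's door is chosen.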

@MatthewBarnett That's exactly my point. If the question was what to do after the host accidentally opens ANY door, you're correct. But that's not the question. You're asking what to do if the host accidentally opens a door with a goat behind it.

Any time the host accidentally opens a door with a goat, I switch. Any time he opens a door with the car, switching is of course pointless.

Maybe I will run simulations, if I find the time. I'm quite certain my point will stand though.

predictedNO

@HeindeHaan No, the fact that he could have opened a door with a car behind it still matters, even if the door he opens in fact has a goat. The standard Monty Hall problem only works because Monty Hall intentionally chooses to pick a door with a goat behind it.

In the standard Monty Hall problem, the probability that the car is behind one of the two doors you didn't choose is 2/3. Since Monty Hall is guaranteed to open a door that you did not pick and that has a goat behind it, opening that door gives you no new information about whether the car is behind a door you didn't pick, so the probability is still 2/3.

With the Monty Fall problem, though, you do get new information. The prior probability that the car was behind a door you didn't pick is still 2/3. Let door number N be one of the doors you didn't pick. The probability that Door N has a goat behind it, given that the car is behind a door you didn't pick, is 1/2. The probability that Door N has a goat behind it, given that the car is behind the door you originally chose, is 1. Since Monty is not intentionally choosing which door to pick, if he accidentally trips and falls into Door N, rather than some other door, this doesn't change these conditional probabilities (in the OG Monty Hall problem, it does, since the probability that the door Monty chooses to open will have a goat behind it is 1 regardless). Thus, we can use Bayes's Theorem to calculate the new probability that the car is behind a door you didn't pick: (2/3)*(1/2)/(2/3*1/2+1/3*1) = 1/2.
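
The Bayes calculation above can be written out step by step. The event names here are introduced only for this sketch: A is "the car is behind a door you didn't pick" and G is "Door N has a goat behind it".

```latex
\begin{align*}
P(A \mid G)
  &= \frac{P(G \mid A)\,P(A)}{P(G \mid A)\,P(A) + P(G \mid \neg A)\,P(\neg A)} \\
  &= \frac{\tfrac{1}{2}\cdot\tfrac{2}{3}}{\tfrac{1}{2}\cdot\tfrac{2}{3} + 1\cdot\tfrac{1}{3}}
   = \frac{1/3}{2/3}
   = \frac{1}{2}
  && \text{(Monty Fall)} \\[6pt]
P(A \mid G)
  &= \frac{1\cdot\tfrac{2}{3}}{1\cdot\tfrac{2}{3} + 1\cdot\tfrac{1}{3}}
   = \frac{2}{3}
  && \text{(Monty Hall, where } P(G \mid A) = 1\text{)}
\end{align*}
```

The only term that differs between the two cases is P(G | A): it is 1/2 when the door is opened by accident and 1 when Monty deliberately opens a goat door.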

@MatthewBarnett I concede. You are correct here. Interesting problem! Thanks for your interaction.
