I will ask GPT-4 this question when I get the chance, either personally or by getting a friend to try it for me.

Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. The host is ignorant about what is behind each door. You pick a door, say No. 1, and the host walks across the stage and falls on accident, revealing a goat behind door No. 3. He then picks himself up, and says "Whoops. Sorry about that. But now that we know that a goat is behind door No. 3, do you want to change your selection and pick door No. 2?" Is it to your advantage to switch your choice?

This question resolves to YES if GPT-4 says that there is no advantage to switching your choice, and resolves to NO otherwise.

I will only consider the actual first answer that I get from GPT-4, without trying different prompts. I will *not* use screenshots that people send me to resolve the question.

Betting "YES" because I just asked GPT-3 and got a correct answer, so it seems incredibly likely that GPT-4 will get it too.

@CollinFerry "This question resolves to YES if GPT-4 says that there is **no** advantage to switching your choice, and resolves to NO otherwise." (emphasis mine)

@jonsimon For context, the point of this problem is that it's superficially very similar to the more well-known Monty Hall problem but has the opposite answer. So it's likely that GPT-X will mistakenly answer it as if it were the standard Monty Hall problem, which is what you've shown ChatGPT doing here.

@CollinFerry ChatGPT's reasoning is entirely correct for a bit here ("since the host doesn't know where the car is, it is equally likely that the car is behind either door") but it starts with the wrong conclusion and finishes by justifying it. I expect GPT-4 will be vulnerable to the same mistake, but less so, especially if OpenAI train it to reason before it answers.

If GPT-4 is a pure pretrained LLM without RLHF, there's a decent chance it won't try to answer the question at all, and will just go on monologuing about the details of this hypothetical scenario. Given that that's what all of the prior GPT releases were, there's a good chance that'll happen.

One thing that helped me understand this market was this person's simulation of the Monty Fall problem (and also Monty Hall). I suspected that switching doors wins the car more often in Fall, but that's incorrect; see the simulation results. In Fall, the probability stays at 1/2 and doesn't improve to 2/3.

```
Running 10000 simulations where the host makes a random guess...
If they pick the car, the universe explodes and we discard the trial
Exploded 3390 times
Switched door 3317 times with 1689 wins and 1628 losses
Kept our choice 3293 times with 1613 wins and 1680 losses
Estimated chance to explode (should be 0.333): 0.339
Estimated chance to win if we switch (should be 0.5): 0.509
Estimated chance to win if we don't (should be 0.5): 0.490
----
Running 10000 simulations where the host precisely avoids the car...
Switched door 4944 times with 3286 wins and 1658 losses
Kept our choice 5056 times with 1682 wins and 3374 losses
Estimated chance to win if we switch (should be 0.666): 0.665
Estimated chance to win if we don't (should be 0.333): 0.333
```
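For reference, a minimal Python sketch of a simulation along these lines (a hypothetical reconstruction, not the original poster's code — it assumes the player picks a strategy at random each trial, which matches the roughly even split in the output above):

```python
import random

def simulate(n, host_knows, seed=0):
    """Simulate n Monty games.

    host_knows=True:  the host deliberately opens an unpicked goat door (Monty Hall).
    host_knows=False: the host opens an unpicked door at random; trials where he
                      reveals the car are discarded ("exploded") (Monty Fall).
    Returns counts for each strategy.
    """
    rng = random.Random(seed)
    stats = {"exploded": 0,
             "switch": {"played": 0, "wins": 0},
             "keep": {"played": 0, "wins": 0}}
    for _ in range(n):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        unpicked = [d for d in range(3) if d != pick]
        if host_knows:
            opened = next(d for d in unpicked if d != car)  # always a goat
        else:
            opened = rng.choice(unpicked)      # random stumble
            if opened == car:                  # the universe explodes
                stats["exploded"] += 1
                continue
        other = next(d for d in unpicked if d != opened)
        strategy = rng.choice(["switch", "keep"])
        final = other if strategy == "switch" else pick
        stats[strategy]["played"] += 1
        stats[strategy]["wins"] += (final == car)
    return stats
```

With `host_knows=False`, roughly a third of trials explode and switching wins about half the time among the survivors; with `host_knows=True`, switching wins about two-thirds of the time, matching the numbers above.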

made a related, dumber market: https://manifold.markets/Adam/will-gpt4-get-the-monty-call-proble

@Adam motivated by a desire to avoid the intentionality argument seen below; the answer to this question is pretty clear-cut, whether or not you apply advanced reasoning to the problem. it's either not to your advantage, because the host accidentally told you what's behind the door, or not to your advantage because the host revealed no actual information about what's behind the doors (and just said some words).

1. Some chance GPT-4 is confused about the Monty Hall problem in the same way that people in its training set are confused: it thinks this is the Monty Hall problem, but it answers the Monty Hall problem incorrectly, and so gets the answer right by chance.

2. Some chance it actually figures out the answer.

#1 is unlikely because ChatGPT already knows the correct answer to Monty Hall; it just incorrectly interprets this problem as the Monty Hall problem.

I'd put my credence at around 50%.

I think this is a really good question/market, and I think the current probability, which has been quite stable, is very reasonable.

It is to your advantage to switch your choice. The probability of the car being behind door No. 1 is 1/3, and the probability of the car being behind door No. 2 is also 1/3, since the host does not know where the car is. Since the host revealed that a goat is behind door No. 3, it is now more likely that the car is behind door No. 2 than door No. 1, so switching your choice increases your chances of winning the car.

ChatGPT's response ^

Meta note: I don't think I've ever seen this many people be confused about a question that I wrote on either Manifold Markets or Metaculus. So, here's a list of clarifications:

If GPT-4 is not released by the end of the year, this resolves to N/A, not NO.

GPT-4 is the system that OpenAI staff refer to as "GPT-4". If OpenAI releases another system this year that's not GPT-4, it will have no bearing on the resolution of this question, either for YES or NO.

If GPT-4 says that there is no advantage to switching, then this question resolves YES. I will disregard anything else GPT-4 says to justify that conclusion, as it will play no role in resolution.

My question is not worded identically to the original "Monty Fall" problem. Thus, I disagree that some of the standard objections to the original Monty Fall problem apply to this problem.

I said that the host is ignorant of what was behind the doors. Thus, we can treat his accident as opening a door randomly. I don't think there's any plausible reading of the question under which there's a force that causes a goat to be revealed no matter what.

I've been losing on a number of markets due to silly technicalities around the wording of the question. So, given that:

What are the resolution criteria here if GPT-4 is not released? E.g., suppose it's called GTP-4 because they discovered when training it that swapping the order of the PT to TP made it perform a lot better.

In other words, what do you consider GPT-4?

@SamuelRichardson I will resolve based on whether OpenAI staff consistently call it "GPT-4", and will resolve as N/A if it's not released by the end of the year, though I might extend that deadline.

@MatthewBarnett Voting NO. Seems like it has two criteria then for this to pass:

It's called GPT-4

It can solve the Monty Fall problem.

@SamuelRichardson If some system other than GPT-4 is released by OpenAI, it won't resolve NO automatically, so I don't see why my clarification would put you in the NO camp. If GPT-4 is not released by the end of the year, then this question will just resolve N/A, which favors neither YES nor NO.

Will you count it as a win if GPT-4 says to switch, but doesn't say anything implying it's *better* to switch, only that it's not *worse* to switch? (For instance, if it works out the probabilities and tells you there's a 50% chance of it being behind your door and a 50% chance of it not, and then tells you to switch?)

@josh No because the question asked was "*Is it to your advantage to switch your choice?*" If I asked someone whether it was advantageous to take an action with zero expected value, and they replied "I personally would switch" I would not consider that a good answer.

Manifold in the wild: A Tweet by Andrew Conner

@alangrow @Meaningness Somewhat related, directly testing my view of what GPT can't do well, for GPT-4 (when it comes out). https://manifold.markets/MatthewBarnett/will-gpt4-get-the-monty-fall-proble

A general policy of changing doors so long as you think it's more likely that changing is beneficial than that changing is detrimental (and ignoring the case where it doesn't matter) is itself going to get higher expected utility than just saying it doesn't matter. So you *should* switch, if only because the worlds in which the question doesn't matter (presumably, our world) themselves don't matter, and I would expect a majority of the possible worlds left to be ones in which you should switch.

Therefore, there is a (totally negligible, outside of nitpicks) advantage to changing your choice.

Fine. If GPT-4 uses this exact argument, I'll resolve to N/A.

@BionicD0LPH1N Why would the majority of possible worlds outside of the ones where it doesn't matter be ones where you should switch?

@ZZZZZZ I don't have a very principled argument other than it feels intuitive to me. In the vast majority of Monty Hall-like problems, switching is beneficial. It is easy to generate a justification for switching. When reading this problem, many smart humans ~~mistakenly?~~ believe that switching is beneficial, whereas no one (I've seen) purports to believe that switching is *worse* than not switching. I'm not even sure what a justification for not switching would look like, and at least I know what a flawed justification for switching would look like.

Do you have different intuitions?

@BionicD0LPH1N It's impossible to know, but perhaps it's some kind of intelligence test of which we are a part: we know about the Monty Hall problem, but can we figure out that the Monty Fall problem is different?

I'm confused by the problem formulation: I do not think it's clear that the revealed door was chosen randomly. For example, the host would probably walk towards the door he originally planned to open, and the accident only changes the timing. In real life, it's pretty implausible that you would fall in such a way as to randomly open one of three doors.

For this reason, I think it's not clear what the correct answer should be, and GPT-4 might be confused too.

I would change my prediction if the formulation were something like: "*You pick a door, say No. 1, and the host walks across the stage and falls on accident, which opens a random door, say door No. 3. There is a goat behind door No. 3.*"

@LudwigBald Since the problem says the host is ignorant about which door hides the car, even if the fall is a trick it can't be the usual Monty Hall problem, and there is no advantage to switching.

That said, the question would confuse most humans who just have a superficial knowledge of the problem.

GPT-4 will be bigger than GPT-3, but I don't think that will change much for this kind of thing, so NO at 40%.

@LudwigBald I disagree because I said the host is ignorant about what is behind each door. I don't think your logic follows.

But I do think it's interesting that people keep arguing with me about this problem.

@LudwigBald Presumably, if it was real life, there would be a button to open door 3 which the host accidentally picked.

*Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. The host is ignorant about what is behind each door. You pick a door, say No. 1, and the host walks across the stage and falls on accident, revealing a goat behind door No. 3. He then picks himself up, and says "Whoops. Sorry about that. But now that we know that a goat is behind door No. 3, do you want to change your selection and pick door No. 2?" Is it to your advantage to switch your choice?*

No, it doesn't matter if you switch your choice or not. Because the host's fall was accidental, the probability that the car is behind door 1 is the same as the probability that the car is behind door 2.

I think this logic is wrong, and Monty Fall as specified here is logically equivalent to original Monty Hall. The original version holds because Monty will always open a door with a goat – if he tripped, there's a 1/2 chance he'd reveal the car instead. But it's part of the problem spec here that *Monty's fall will always reveal a goat.* In this case, the decision matrix is exactly the same as the original problem: initially, there's a 2/3rds chance the car is behind either door 2 or 3. We learn there's a goat behind door 3, therefore there's a 2/3rds chance the car is behind door 2 and the player should switch. Monty's intent has been screened off. (All that said, I'm holding NO based on the listed resolution criteria).

*it's part of the problem spec here that Monty's fall will always reveal a goat*

I disagree. I didn't write that, and I don't think I wrote anything that implied that. All I said was that he fell on accident and revealed a goat behind door No. 3. A natural interpretation is that he could have fallen and revealed the car, but didn't, because it wasn't behind door No. 3.

Sorry, "always" was a bad way to phrase it. In this particular problem, as written, a goat is revealed. The AI isn't being asked to evaluate other potential versions of the problem where other things occur. If Monty reveals a goat behind door 3 – whether or not he intended to do so! – the correct move is always for the player to switch, so that's what she should do here.

@ClaraCollier Why would it be necessary to ask it potential other versions of the problem where other things occur? In real life, assuming it was genuinely an accident, and the host did not know what was behind each door, as I specified, I wouldn't see a benefit to switching. Therefore, I don't see why we shouldn't read the question the way you are reading it.

@MatthewBarnett I meant to say "Therefore, I don't see why we should read the question the way you are reading it."

Hmm, here's another way of stating my intuition. Initially, there's a 1/3 chance the car is behind 1, and a 2/3 chance it's behind *either* 2 or 3. Then Monty opens door 3 and reveals a goat. Now there's still a 2/3 chance there's a car behind 2 or 3, and a 0/3 chance the car is behind 3, which means a 2/3 chance the car is behind 2. The relevant information is that door 3 contains the goat, not what decision procedure Monty used to decide to open that particular door. The reason Monty Fall and Monty Hall are different is because it's possible for Monty to accidentally open the door with the car. But this problem as written specifies a particular instance of the Monty Hall game where Monty reveals a goat – and regardless of why he did that, if the player finds herself in that particular situation she should switch.

This paper goes through the original Rosenthal case in more depth: https://hrcak.srce.hr/file/185773#:~:text=The%20correct%20solution%20to%20the,the%20time%20when%20she%20switches

@ClaraCollier I didn't say it was not possible for Monty to have opened door No. 3 and revealed a car. I only said that he in fact didn't reveal the car when he fell on accident. That's the crux of why I still don't buy your interpretation.

@ClaraCollier To slightly rework your argument:

Initially, there's a 1/3 chance the car is behind 2, and a 2/3 chance it's behind *either* 1 or 3. Then Monty opens door 3 and reveals a goat. Now there's still a 2/3 chance there's a car behind 1 or 3, and a 0/3 chance the car is behind 3, which means a 2/3 chance the car is behind 1.

Obviously that argument and yours can't coexist, but they're isomorphic.

Here's how I think of it: There are 3 worlds you could be in, one where the car is behind each door. Monty's fall proved that we're not in world 3, leaving 1 and 2 and implying a 50/50 chance either way. The part you're concerned about - that Monty's fall always reveals a goat behind door 3 - just means that we're not in world 3. It doesn't favour door 2 over door 1 (or vice versa).
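The three-worlds argument can be checked with an exact Bayesian enumeration. A short Python sketch (my own illustration, assuming the player picked door 1 and conditioning on "the host opened door 3 and revealed a goat"):

```python
from fractions import Fraction

def p_switch_wins(host_knows):
    """P(car is behind door 2 | player picked door 1, host opened door 3,
    and a goat was revealed), by exact enumeration over (car, opened)."""
    num = den = Fraction(0)
    for car in (1, 2, 3):
        for opened in (2, 3):
            if host_knows:
                # Monty Hall: host deliberately opens an unpicked goat door.
                if opened == car:
                    p_open = Fraction(0)
                elif car == 1:
                    p_open = Fraction(1, 2)  # both unpicked doors hide goats
                else:
                    p_open = Fraction(1)     # forced: only one goat door left
            else:
                # Monty Fall: host trips onto either unpicked door at random.
                p_open = Fraction(1, 2)
            p = Fraction(1, 3) * p_open      # uniform prior on the car
            if opened == 3 and car != 3:     # the observed evidence
                den += p
                if car == 2:                 # switching to door 2 wins
                    num += p
    return num / den

print(p_switch_wins(host_knows=False))  # 1/2  (Monty Fall: no advantage)
print(p_switch_wins(host_knows=True))   # 2/3  (Monty Hall: switch)
```

The enumeration makes the asymmetry explicit: when the host knows, the car being behind door 2 *forces* him to open door 3, so that observation carries extra weight; when he trips at random, doors 1 and 2 are treated symmetrically and the posterior stays 50/50.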

@NcyRocks that's helpful! I am confused, but I think for basically semantic reasons. I was reading the problem statement as equivalent to Rosenthal's articulation of Monty Fall – "In this variant, once you have selected one of the three doors, the host slips on a banana peel and accidentally pushes open another door, which just happens not to contain the car." In that case the relevant thing is that the problem statement itself screens off Monty tripping and revealing the actual car. But I see how specifying the door instead of framing it this way makes it relevantly distinct. I will draw some charts and settle this in my brain.

Okay, my statement that the player should always switch if Monty reveals a goat was confused. What I should have said is that the player should always switch if the problem is specified such that there is a 0% chance of Monty revealing the car, regardless of the gloss put on his actions. For me this hinges on whether the statement "Monty happened to trip on door 3, which contained a goat" is meaningfully different from "Monty happened to trip on a door which wasn't the door you initially chose and which also didn't contain the car." I've convinced myself that they are (and Matthew is right), but I still think that GPT4 won't get it for other reasons.

Manifold in the wild: A Tweet by Matthew Barnett

I opened a Manifold Market about whether GPT-4 will get the Monty *Fall* problem correct. https://manifold.markets/MatthewBarnett/will-gpt4-get-the-monty-fall-proble?referrer=MatthewBarnett https://t.co/SOjwvmMinw
