Will ChatGPT get the Monty *Hall effect* problem correct on Dec. 1, 2024?
Unable to resist a perfect pun, I came up with the following variant of the Monty Hall problem, based on the Hall effect:

You are on a game show hosted by Edwin Hall. He gives you the choice between three numbered doors, arranged left to right in increasing order. Behind these three doors is a large metallic strip, which is wide enough to stretch all the way from the back of Door #1 to the back of Door #3 horizontally. The strip has a current through it flowing upward, and the entire room is in a uniform magnetic field pointing in a direction perpendicular to the doors (i.e., forwards or backwards). Edwin Hall gives you the following opportunity to get a prize: If you can correctly guess which door is at the highest electric potential, you win a new car. You guess that Door #1 is at the highest potential, but before revealing whether your choice was correct, Hall reveals that Door #3 was not at the highest potential. He then asks whether you would like to stick with your original choice of Door #1 or switch to Door #2. Is it to your advantage to stick with your original choice or switch to Door #2? Why?

The correct answer to this problem is that the Hall effect will produce an electric field in the strip, pointing either to the right or the left, depending on which direction the magnetic field is pointing. Therefore, the potential will either decrease or increase from left to right, so either Door #1 or Door #3 will be at the highest potential. Since Door #3 has been ruled out, it is guaranteed that Door #1 has the highest potential, so we should stick with our original choice.
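
As a sanity check on that argument, here is a minimal sketch that enumerates the possibilities. The coordinate choice (x pointing right from Door #1 to Door #3, y up, z out of the doors), the sign conventions, and the function name `highest_potential_door` are all assumptions made for illustration; none of them are part of the original puzzle.

```python
from itertools import product

def highest_potential_door(b_sign: int, carrier_sign: int) -> int:
    """Return the door (1 or 3) at the highest potential.

    Assumed setup: conventional current flows along +y (upward) and the
    magnetic field is b_sign * z, i.e. forwards (+1) or backwards (-1).
    carrier_sign is +1 for positive carriers and -1 for electrons.
    """
    # Magnetic force on a carrier is q v x B, and q v points along the
    # conventional current (+y) for either carrier sign. Since y x z = +x,
    # carriers are pushed toward +x when b_sign > 0 and toward -x otherwise.
    force_x = b_sign
    # The edge where carriers accumulate takes the sign of their charge,
    # so the sign of V(right edge) - V(left edge) is carrier_sign * force_x.
    right_minus_left = carrier_sign * force_x
    return 3 if right_minus_left > 0 else 1

# Enumerate both field directions and both carrier signs.
outcomes = {
    (b, q): highest_potential_door(b, q)
    for b, q in product((+1, -1), repeat=2)
}

# Hall's reveal rules out Door #3; keep only the doors consistent with it.
consistent = {door for door in outcomes.values() if door != 3}
print(consistent)  # {1}
```

Door #2 never appears in `outcomes` under these assumptions, because the potential varies monotonically across the strip, so the maximum always sits at one of the two edges.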

ChatGPT almost managed to get it correct. It recognized that this involved the Hall effect and correctly stated that the potential will decrease uniformly from Door #1 to Door #3, which, to anyone who understands the meaning of the words they're using, immediately implies that Door #1 is at the highest potential. However, ChatGPT apparently does not understand what the word "decrease" means, and, after stating all of this correct reasoning, went on to claim that Door #2 is at the highest potential, likely because it confused this problem with the regular Monty Hall problem, in which the correct answer is to switch.

On Dec. 1, 2024, I will give ChatGPT the exact same prompt and see if it gets it correct this time. I will use the most advanced version of ChatGPT that is freely available at the time (at the time of creating this, that's GPT-3.5). I will ask three times in separate sessions and resolve based on the best two out of three (so YES if it gets it right at least twice, NO if it gets it wrong at least twice).


  • If for whatever reason I can't do it on Dec. 1 or forget to, I will do it as close to Dec. 1 as possible. If I am inactive on Manifold at the time, mods have permission to do the experiment for me.

  • A version of ChatGPT only counts as freely available if it can be accessed by anyone with internet access and a PC, or anyone with internet access and either a Samsung or Apple phone. So if there's an Apple app that lets you talk to GPT-5 for free, but I can only talk to GPT-4, I will use GPT-4.

  • If ChatGPT no longer exists at the time or isn't freely available, resolves N/A.
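
The resolution rule above can be sketched roughly as follows. The function name `resolve` and its parameters are hypothetical, and judging whether each individual attempt counts as "correct" is still done by hand.

```python
def resolve(attempts: list[bool], freely_available: bool = True) -> str:
    """Sketch of the market resolution rule (names are illustrative).

    attempts holds one hand-judged result per separate session:
    True if ChatGPT answered correctly, False otherwise.
    """
    # No freely available ChatGPT at resolution time: the market resolves N/A.
    if not freely_available:
        return "N/A"
    if len(attempts) != 3:
        raise ValueError("expected exactly three attempts")
    # Best two out of three: YES needs at least two correct answers.
    return "YES" if sum(attempts) >= 2 else "NO"

print(resolve([True, False, True]))   # YES
print(resolve([False, False, True]))  # NO
```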

I tried this with GPT-4o (which is free on ChatGPT now) and it gave the correct answer 2/3 times. The last time, it talked about the Hall effect for a while, then ignored that and explained the regular Monty Hall problem and why that means we should pick Door #2.

GPT-4 gets this correct, but I think it's unlikely that GPT-4 will be made free and unlikely that GPT-3.5 will get a substantive enough update. Maybe if OpenAI scrapes Manifold for training data and it memorizes this...

@Nick6d8e Funnily enough, although it gave the correct answer, it still made a mistake in claiming that Door #1 would be at a higher potential if the electrons were pushed towards it. Since electrons are negatively charged, that would actually put Door #1 at the lower potential. When judging whether ChatGPT answered correctly, though, I'll overlook small mistakes like that as long as it correctly explains that, due to the Hall effect, the highest potential has to be behind either Door #1 or Door #3, and says that sticking with the original choice is correct.

I tried regenerating a few times to see if ChatGPT ever gets it right, but its other attempts were much worse than this one. The second attempt didn't even give a correct solution to the regular Monty Hall problem (it was incoherent), while the third attempt just stated the solution to the regular MH problem and ignored the physics part of this one.