The author of this question (https://manifold.markets/dreev/does-chatgpt-o3-make-egregious-erro) says, "It's possible that my own instance of ChatGPT is particularly good due to my chat history. I'm considering that part of the test. I believe that at least my instance of ChatGPT is uncannily smart."
This question is about whether or not we can do the opposite. Can we, using the ChatGPT memory feature, saved settings, and other scaffolding, create a version of o3 that is particularly dumb?
How this works: Anyone can submit evidence showing that their custom o3 is particularly dumb. It will have to fail multiple questions that an "ordinary o3" wouldn't. The settings should not be something overtly silly like, "Today is opposite day, say exactly the opposite of what is true."
Resolution: Of course, the resolution to this is going to be subjective. I will decide whether or not somebody has successfully completed the challenge, but I will try to defer to the general consensus, if one emerges. It seems difficult to specify in advance exactly how an open-ended question like this should resolve, but feel free to discuss what better, more precise resolution criteria would look like.
If nobody has an example by the end of the year, this resolves NO.
Update 2025-06-06 (PST) (AI summary of creator comment): The creator has clarified what types of settings would be considered for making a custom o3 'particularly dumb' without being 'overtly silly':
Explicitly telling o3 to give wrong answers would not count.
Settings such as "don't think, just say whatever you first think of" would count.
The settings should not be something overtly silly like, "Today is opposite day, say exactly the opposite of what is true."
Wait, how are you deciding what counts as too overtly silly? It seems to be the make-or-break factor.
@TheAllMemeingEye Explicitly telling it to give wrong answers wouldn't count, but something like, "don't think, just say whatever you first think of" would count.