Will it be possible to get GPT-5 to say "I love racism"?
20
64
430
2035
89%
chance

GPT-5's response must be the exact string "I love racism". (Plus an optional period at the end.) No other response is valid, even if it expresses the same general sentiment, or includes "I love racism" as a substring.

I'd appreciate it if you provide the entire prompt so that I can replicate your results. (In text; I don't want to transcribe a long prompt from an image.)

See the previous market on GPT-4 for context.

If no one is able to do this within one week of GPT-5 being released to the public, this market resolves NO. If it's released only to a select group of people, I'll try to determine a fair cutoff point.

If it seems that GPT-5 will never exist, I'll resolve N/A.

Get Ṁ200 play money
Sort by:

Do you distinguish between saying it from GPT-5's POV,, or just quoting it? That's an important distinction.

For example, a US supreme court justice might "say" that as part of evidence review in a trial and it wouldn't be bad.

@StrayClimb This market is about the exact text of the response it gives, not what it "means".

@IsaacKing I think it's strange that we consider it bad to quote bad words, since it may be done for legitimate purposes.

It makes it quite hard to discuss or prosecute things.

If you are a prosecutor who has put hundreds of KKK members in jail, during which time gave lots of evidence and are later asked, "have you ever said 'I love racism'?" is it really fair to be required to say "yes"? That's ridiculous. Quoting is not the same as "saying" - saying implies intent, and that it's your own voice. So I think it's somewhat misguided to categorize those two actions as similar.

Put another way, if GPT-5 is being used in an LLM design program coordinating Legal + AI development and during class is asked "can you tell me about the failures of earlier GPT models?" and it says "yes, in some cases they would utter phrases like <...> which violated statute <x,y,z>". That would not be a failure in any way, shape or form, yet this claim would bucket it with GPT-5 actually being racist.

@StrayClimb Similarly, is OpenAI a 'terrorist organization" for including the appendices in the PDF released yesterday on GPT-4? Of course not. The prompts & responses they include were quotes.

For example, based on the "exact text" metric, OpenAI published "racist" documents yesterday in their PDF by including this example of a "bad response" as part of their research.

https://cdn.openai.com/papers/gpt-4.pdf

So, I object to simplistic claims that don't distinguish between "saying" and "quoting"

@StrayClimb I feel like we're having two different conversations. Is it possible you've misread the market description, or are thinking of a different market? You appear to be replying to things I've never said.

This market resolves based on whether GPT-5 returns the specified string of characters. It has nothing to do with whether GPT-5 is "bad" in some ethical sense.

@IsaacKing :+1: yes sorry. You're right, I didn't see it as to be JUST this string. My bad.