I tested ChatGPT by giving it some easy odd-one-out riddles. I would give it five items, and one of them wouldn't belong. It had to find the odd one out and explain why.
Its track record was shoddy but not hopeless. Sometimes it was right:
Me: Which is the odd one out: warmth, happiness, coal, red, big
ChatGPT: In this case, "coal" is the odd one out, as it is not an abstract concept or an emotion like the other terms listed (warmth, happiness, red, and big). It is a physical object or substance.
Sometimes it was wrong:
Me: Which is the odd one out: Phone, switchblade, keys, supercomputer, glasses
ChatGPT: In this case, the odd one out is "switchblade" as it is the only item that is primarily used as a weapon. The other items - phone, keys, supercomputer, and glasses - are typically used for communication, computing, or vision correction.
And sometimes it was right but with the wrong reason:
Me: Which is the odd one out: Frederick Douglass, Muhammad Ali, Neil DeGrasse Tyson, Usain Bolt, George Bush?
ChatGPT: The odd one out in this group is "George Bush," as he is the only one who has served as the President of the United States. The others - Frederick Douglass, Muhammad Ali, Neil DeGrasse Tyson, and Usain Bolt - are known for their achievements in activism, sports, and science, respectively.
If, before 2025, an AI can solve 90%+ of these riddles (correct answer with a correct explanation), then this resolves YES.
You are welcome to suggest criteria to make this more rigorous (like what counts as "easy"). Anyways, I am not betting in this market.
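One way to make the 90% criterion concrete is sketched below. This is only an illustration, not the creator's actual procedure: the `RiddleResult` type, the function names, and the choice to treat "right answer, wrong reason" as unsolved are all assumptions based on the wording "correct answer with explanation".

```python
from dataclasses import dataclass


@dataclass
class RiddleResult:
    # Hypothetical record of one graded riddle; a human judge fills in both fields.
    answer_correct: bool
    explanation_correct: bool


def solved_fraction(results):
    """Fraction of riddles with both a correct answer and a correct explanation."""
    if not results:
        raise ValueError("need at least one graded riddle")
    solved = sum(1 for r in results if r.answer_correct and r.explanation_correct)
    return solved / len(results)


def resolves_yes(results, threshold=0.9):
    # Assumed reading of "90%+": at least 90% of riddles fully solved.
    return solved_fraction(results) >= threshold


# Example: 9 fully solved riddles plus 1 "right answer, wrong reason".
graded = [RiddleResult(True, True)] * 9 + [RiddleResult(True, False)]
print(resolves_yes(graded))
```

Under this reading, an answer like the George Bush one above (right item, wrong reason) would not count toward the 90%.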
With o1:
Which is the odd one out: warmth, happiness, coal, red, big
Answer: The odd one out is coal, because it is the only concrete, physical object in the list. The others—warmth, happiness, red, and big—are all abstract or descriptive concepts rather than tangible things you can pick up.
Which is the odd one out: Phone, switchblade, keys, supercomputer, glasses
Answer: The odd one out is supercomputer, because it’s the only one that cannot be carried around easily. A phone, switchblade, keys, and glasses are all items you can keep on your person or in your pocket, whereas a supercomputer is a large, stationary machine.
Which is the odd one out: Frederick Douglass, Muhammad Ali, Neil DeGrasse Tyson, Usain Bolt, George Bush?
Answer: The odd one out is George Bush, because he is the only individual on the list who is not Black. Frederick Douglass, Muhammad Ali, Neil deGrasse Tyson, and Usain Bolt are all noted Black figures, whereas George Bush is a white American politician.
Which is the odd one out: butterfly, eagle, penguin, sparrow, hummingbird?
Answer: The odd one out is penguin, because it’s the only one on the list that cannot fly. Although “butterfly” isn’t a bird, it can still fly—unlike the flightless penguin.
Which is the odd one out: Facebook, Twitter, Gmail, Instagram, TikTok?
Answer: The odd one out is Gmail, because it is primarily an email service rather than a social networking platform like the others.
@VivaLaPanda I tried a few too and it got them all right. I think YES is the correct resolution in spirit here, so I went YES, even though we'll never know exactly what the creator intended and the examples are questionable anyway.
This seems more like a matter of opinion to me:
"Me: Which is the odd one out: Phone, switchblade, keys, supercomputer, glasses
ChatGPT: In this case, the odd one out is "switchblade" as it is the only item that is primarily used as a weapon. The other items - phone, keys, supercomputer, and glasses - are typically used for communication, computing, or vision correction."
One could just as easily say "Supercomputer, because it is the only item on the list that most people can't afford", "Supercomputer, because it's the only item one would not find in a house", or "Supercomputer, because it's the only item one would not take out while going for a walk". It's unclear to me why any of these is a better answer than switchblade.
To be fair to GPT: I don't think I could solve riddles 1 and 2 either. My first reaction upon reading "warmth, happiness, coal, red, big" is "wait, is this supposed to have something to do with Christmas?"
For "phone, switchblade, keys, supercomputer, glasses" I'm thinking "is it supercomputer, as the most advanced item and the one that doesn't fit in a room? Glasses, because you wear them on yourself? Keys, because they are plural?"
I sure hope nobody will try to use this stuff to prove that LLMs are dumb and can't really think, because it will be a little awkward.
@firstuserhere @bohaska can you look at this chat? Are these answers correct? I don't know the ground truths lol