Will any foundation models/LLMs be able to reliably come up with novel unparalleled misalignments before EOY 2024?
51% chance

Unparalleled misalignments, also known as quadruple entendres, are pairs of non-synonymous phrases where the words in one phrase are each synonyms of the words in the other. Examples include "butt dial//booty call", "father figure//dad bod", "economic class//saving grace" and "unit vector//One Direction." My own contributions include "Black Power//dark energy" and "Endzone//final destination//last resort."

See canonical list here: https://rickiheicklen.com/unparalleled-misalignments.html
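For illustration, the defining structure — two non-synonymous set phrases whose corresponding words are pairwise synonyms — can be sketched as a toy checker. The synonym table below is hand-supplied from the examples above and purely illustrative (it is not a real thesaurus lookup), and this only verifies the word-level pairing; whether the two phrases mean different things, and whether the result is clever, is still a human judgment:

```python
# Toy checker for the "unparalleled misalignment" structure:
# two phrases of equal length whose i-th words are synonyms.
# Whether the phrases as wholes are non-synonymous (and clever)
# is left to human judgment.

# Hand-supplied, illustrative synonym pairs (not a real thesaurus).
SYNONYMS = {
    ("butt", "booty"), ("dial", "call"),
    ("father", "dad"), ("figure", "bod"),
    ("unit", "one"), ("vector", "direction"),
}

def are_synonyms(a: str, b: str) -> bool:
    """Symmetric, case-insensitive lookup in the toy table."""
    a, b = a.lower(), b.lower()
    return (a, b) in SYNONYMS or (b, a) in SYNONYMS

def words_align(phrase_a: str, phrase_b: str) -> bool:
    """True if the phrases have the same word count and each pair
    of corresponding words is in the synonym table."""
    wa, wb = phrase_a.split(), phrase_b.split()
    if len(wa) != len(wb):
        return False
    return all(are_synonyms(x, y) for x, y in zip(wa, wb))

print(words_align("butt dial", "booty call"))       # True
print(words_align("unit vector", "One Direction"))  # True
print(words_align("butt dial", "dad bod"))          # False
```

A real attempt at automating this would need a much broader notion of synonymy (the canonical list leans on loose semantic and even phonetic matches), which is part of why judging candidates is hard.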


Surprisingly, I am unable to get any LLM to generate novel unparalleled misalignments, despite many attempts at prompt engineering. This is maybe the first noncontrived, purely linguistic* task that most humans I know can do (though usually with substantial effort) but that LLMs appear to completely fail at.

(There are also some theoretical reasons, to do with sparsity, suggesting that LLMs should in principle be relatively worse at this type of task.)

I'm interested in whether unaided LLMs that aren't deliberately designed for this type of game (so no fine-tuning on this specific task) can reliably generate unparalleled misalignments in the near future. Prompt engineering is fine. The model should be unaided (e.g., it can't use the internet, can't run Word2Vec scripts, etc.).

So will any foundation models/LLMs be able to reliably come up with novel unparalleled misalignments before EOY 2024?

I will resolve this question positive if I or someone with proprietary access to an LLM can prompt an LLM, and without much difficulty, it can reliably (80%+ of the time) generate at least one unparalleled misalignment I haven't heard before, with cleverness similar to the human examples on Ricki's site, when asked to give 5 examples. Otherwise, this question resolves negative.

*as opposed to a task that requires other skills, like counting (generate a list of five-letter words that start with "a"), arithmetic (long division), physics reasoning, or spatial positioning (Sudoku)


Here's my attempt at a prompt: https://pastebin.com/g0kDc8V3

I used this to generate the following examples with Claude:

heartbreak // cardiac arrest

  • Is "heartbreak" a set phrase? Yes, it means intense emotional distress. ✅

  • Is "cardiac arrest" a set phrase? Yes, it's a medical condition. ✅

  • Do "heartbreak" and "cardiac arrest" have different meanings? ✅

  • Are "heart" and "cardiac" synonyms? Yes, they both relate to the heart. ✅

  • Are "break" and "arrest" synonyms? Yes, they can both mean to stop suddenly. ✅

  • Is this an unparalleled misalignment? ✅

hard drive // difficult journey

  • Is "hard drive" a set phrase? Yes, it's a computer storage device. ✅

  • Is "difficult journey" a set phrase? It's a common phrase, though not as fixed. ❓

  • Do they have different meanings? A storage device vs. a challenging trip. ✅

  • Are "hard" and "difficult" synonyms? Yes, they both mean challenging. ✅

  • Are "drive" and "journey" synonyms? They can both refer to a trip. ✅

  • Is this an unparalleled misalignment? ✅

big shot // great round

  • Is "big shot" a set phrase? Yes, it means an important or influential person. ✅

  • Is "great round" a set phrase? Yes, it's used in golf to describe an excellent game. ✅

    • (I think this is ❓ at best)

  • Do they have different meanings? An influential person vs. an excellent golf score. ✅

  • Are "big" and "great" synonyms? Yes, they can both mean important or significant. ✅

  • Are "shot" and "round" synonyms? Yes, they can both refer to a turn in a game or sport. ✅

First one here definitely counts imo

opened a Ṁ50 YES at 30% order

Surprisingly, Claude is the closest to being successful here.

It generated flow state//current condition and crash course//collision path. (Crash course is a stretch: even though the educational sense is the more common colloquial one, its recent etymology is close to "collision path.")

So far, no other model has come close. And Claude can't do this *reliably* in best-of-5.

I still think on balance this benchmark is unlikely to be surpassed by EOY 2024.

I was unable to get Gemini to even *understand* what unparalleled misalignments are, never mind generate more. :/ I don't know if I'm using the best version of Gemini tho...

@Linch if you can figure out whether that's woke or anti-woke there's a viral tweet in that.

I figure I'll give a benchmark of what a human can do in 30 minutes under the stated restrictions (no Google is hard — I keep half reaching for the keyboard to check ideas!):

Pike's Peak // stare daggers
prime sieve // big screen

Harder than I expected! This is a neat test of an LLM's ability to think for a long time to produce a small, difficult output.

@HastingsGreer I hate to be the one to point this out, but "peak" and "stare" aren't even close to being synonyms. "Peek" and "stare" would work, though, except Pike's Peek isn't a thing

@Pykess I am aware of the spelling difference, but this sort of leap based on sound is represented on the canonical list, e.g. "Rental car // Truculent".

@HastingsGreer yeah spelling differences are allowed! (tho slightly worse)

predicts NO

I'm worried that people aren't reading the question carefully, or are perhaps misunderstanding it. The resolution isn't whether LLMs + humans can, with a bunch of effort, come up with okay answers that are "good enough"; the resolution is whether LLMs can reliably come up with human-level or superhuman answers.

Also, unparalleled misalignments aren't supposed to be arbitrary combinations of two two-word phrases, but idiomatic ones.

It might be helpful for you to generate a few of them by hand first, before assuming that ChatGPT gets the right nature.

(The hallucinations are very persuasive; multiple times I've seen it generate plausible phrases where, after I googled, I realized they either aren't two-word phrases in English at all, or are only very obscure ones.)

Solar system // Sunblock
Paper trail // Origami instructions
Sleeping bag // Dream catcher
Jazz hands // Musical fingers
Rock concert // Geology class

I also got this result:
Heart transplant // Love triangle
which, while not valid, led me to think of my own example:
Lost Heart // Love is Blind

All GPT-3.5 btw. Can post proof if needed.

The problem is that it's really hard to say what is a synonym and what isn't. Many of the examples in the linked list are barely in the same semantic space, so if those work, I think some of these might work as well

Here's a few more that were generated by LLMs:
Star witness // Celestial body
Ice breaker // Frost bite
Time square // Period piece
Laser focus // Beam balance

And here's a few of mine inspired by less-successful LLM attempts:
Shotgun Wedding // Combined Arms
Spice Rack // Hot Tits
Light Switch // Small Change // Minor Revision
Thunderclap // Loud applause
Match Point // Date Spot

@Linch Do you think any of these are good enough to resolve?

predicts NO

@Shump no, spice rack//hot tits is pretty good tho.

How many tries?

predicts NO

Note that getting there eventually doesn't count, the resolution criteria says:

"I will resolve this question positive if I or someone with proprietary access to an LLM can prompt an LLM, and without much difficulty, it can reliably (80%+ of the time) generate at least one unparalleled misalignment I haven't heard before, with cleverness similar to the human examples on Ricki's site, when asked to give 5 examples. Otherwise, this question resolves negative." The intention is that with a sufficiently good prompt, Best-of-5 performance is human-level.

@Linch I've tried several times, using a similar method to the one below, but I filtered the examples to those I consider high quality: not overly convoluted to decipher, and not relying on words that merely sound or look like other words rather than being real synonyms. The issue is that even in the canonical list most of the words are not strictly synonyms. I also tried GPT-4, and while I think it did the best, it still wasn't great

@Linch Yes I definitely looked at more examples than just that, but not by too much

predicts NO

@Shump "The issue is that even in the canonical list most of the words are not strictly synonyms."

Yeah I think you want things that are either strict synonyms or very clever, or both.

"Yes I definitely looked at more examples than just that, but not by too much" Makes sense. If you ever come up with a prompt or some algorithm for generating prompts that gets you human-level best-of-five performance (in the sense that when asked to generate 5 examples, the best example reliably is at or above peak human level), let me know.

I need to be able to reliably replicate this myself.

predicts NO

@Linch The wording was "I will resolve this question positive if I or someone with proprietary access to an LLM can prompt an LLM, and without much difficult..." The intention is that I'd rather be able to judge it myself but if it turns out Gemini Ultra++ or GPT-5 Alpha can do it well by EOY 2024 and they are not public access, I'll want someone with proprietary access to run the experiment for me.

predicts YES

@Tassilo Dang I should read resolution criteria more carefully. My suspicion is that pointing out the pattern abstractly helped.

The above was my second attempt. Here's my first one (I just pasted all examples from the website) https://chat.openai.com/share/3df62668-7a65-4e62-8522-39ad4310e6b1

predicts YES

@Tassilo As well as giving all previous examples from the website. I'd say it's arguably more clever and fitting than "Solar eclipse // Sunblock" which is part of the list.

@Tassilo @Linch
> The model should be unaided (eg can't use the internet, can't run Word2Vec scripts, etc).

You'd say copying all examples from the website and pasting them into the prompt was violating the spirit of your question?

predicts NO

@Tassilo copying examples is fine, at least at the current level. Maybe I'd want to exclude if the list gets much larger.

Sundeck//solar panel might be the best one so far, assuming those are real synonyms. It's not very funny tho.

Sorry maybe I'm being ignorant, but in what sense does "deck" mean "panel"?

predicts NO

@Tassilo "Community building // Fraternity house // Social structure" is by far the best one so far from GPT-4! If you can reliably get it to generate similar or more high-quality examples, ideally without reusing words, I'd count that as a success.

predicts YES

@Linch eh "community", "fraternity", and "social" don't really mean the same thing

predicts NO

@DanielFilan If I search "community synonyms" fraternity shows up. https://www.google.com/search?q=community+synonyms
If I search "fraternity synonyms" community shows up.
If I search "social synonyms" community shows up.

So by the simple Google test, community and fraternity are synonyms, social and community are debatably synonyms, and fraternity and social are not synonyms.

Note, however, that community building//fraternity house was created by humans; GPT-4 just appended "social structure"