Will an LLM be able to pass something equivalent to Yann LeCun's 7-gear test by the end of 2024?

Current thoughts on resolution; will firm up over the coming weeks:
- holding off on publishing the actual question(s) so they don't end up in the training data (but will stick with the 7-gear question above if there is high confidence it isn't in the training data)
- the key aspects of the challenge seem to be that it (a) requires a few steps of deductive reasoning about the physical world, (b) is superficially similar to a simpler question of this type, and (c) has a quirk that makes pattern-matching to the simpler question's solution wrong
- will be deferential, within reason, to Yann LeCun as well as to the comments when coming up with the question(s) that best capture the intention of this market
- model should get it right >66% of the time; no clever prompting, just straight up asking it
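
For anyone who wants to try this themselves, here is a rough sketch of how the ">66% of the time" bar could be checked. It assumes each attempt has already been graded by hand as right or wrong; the function names, the 0.66 cutoff as a strict inequality, and the sample trial list are all illustrative assumptions, not the official resolution procedure.

```python
def pass_rate(outcomes):
    """Return the fraction of graded attempts that were correct.

    `outcomes` is a list of booleans, one per plain (unprompted) attempt.
    """
    if not outcomes:
        raise ValueError("need at least one graded attempt")
    return sum(outcomes) / len(outcomes)


def meets_threshold(outcomes, threshold=0.66):
    """True if the pass rate is strictly above the threshold (assumed 0.66)."""
    return pass_rate(outcomes) > threshold


# Hypothetical example: 10 hand-graded attempts, 7 of them correct.
trials = [True, True, False, True, True, False, True, True, True, False]
print(pass_rate(trials))
print(meets_threshold(trials))
```

A single attempt says very little either way, so a handful of repeated, identically worded attempts is probably the minimum needed to judge against this bar.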


Claude 3 Opus didn't get it (tried just the one time)
