Here's a type of problem that seems to stump current LLMs (e.g. ChatGPT):
Alice and Bob have two dice.
They roll the dice together, note the sum of the two values shown, and repeat.
For Alice to win, two consecutive turns (meaning two consecutive sums) need to result in 7. For Bob to win, a sum of 8 needs to be immediately followed by a sum of 7. Who do we expect to win this game?
This problem is a rewrite of a similar problem from the "Puzzled" page of the February 2013 issue of the Communications of the ACM.
A similar problem is Penney's game, which has the following setup:
Alice and Bob flip a coin and record the results. Alice bets Bob that the sequence HHH will show up before the sequence THH. Should Bob take this bet?
The catch in both cases is that there's a hidden Markovian structure to the game — once you write out the Markov chain corresponding to the game state, the solution becomes clear.
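To make the Markov chain idea concrete, here is a minimal sketch in Python of one way to set it up. It is an illustration under stated assumptions, not a reference solution: the function names `solve_linear` and `first_pattern_wins` are hypothetical, the rolls or flips are assumed independent and fair, and "followed by" is read as "immediately followed by". The states are the prefixes of either target sequence that the most recent results could still be building toward, and the win probability from each state satisfies a small linear system that is solved exactly with fractions.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs exactly using Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def first_pattern_wins(symbol_probs, pattern_a, pattern_b):
    """Probability that pattern_a shows up before pattern_b.

    symbol_probs maps each possible outcome of one turn to its probability.
    Assumes neither pattern is contained in the other.
    """
    pattern_a, pattern_b = tuple(pattern_a), tuple(pattern_b)
    # States of the chain: every proper prefix of either pattern, including ().
    prefixes = {p[:i] for p in (pattern_a, pattern_b) for i in range(len(p))}
    states = list(prefixes)
    index = {s: i for i, s in enumerate(states)}
    n = len(states)

    # Build (I - T) w = r, where w[i] is the chance pattern_a wins from state i.
    matrix = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    rhs = [Fraction(0)] * n
    for state in states:
        i = index[state]
        for symbol, prob in symbol_probs.items():
            history = state + (symbol,)
            if history[-len(pattern_a):] == pattern_a:
                rhs[i] += prob  # pattern_a just completed
            elif history[-len(pattern_b):] == pattern_b:
                continue  # pattern_b just completed, contributes 0
            else:
                # Move to the longest suffix that is still a useful prefix.
                nxt = next(history[k:] for k in range(len(history) + 1)
                           if history[k:] in prefixes)
                matrix[i][index[nxt]] -= prob
    return solve_linear(matrix, rhs)[index[()]]


# Dice game: only sums of 7 and 8 matter, so all other sums are lumped together.
dice = {7: Fraction(6, 36), 8: Fraction(5, 36), "other": Fraction(25, 36)}
alice = first_pattern_wins(dice, (7, 7), (8, 7))
print(alice, 1 - alice)  # 31/66 35/66

# Penney's game: HHH versus THH with a fair coin.
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
hhh = first_pattern_wins(coin, "HHH", "THH")
print(hhh, 1 - hhh)  # 1/8 7/8
```

Under these assumptions Bob is the favorite in the dice game (35/66 against 31/66), and in Penney's game THH beats HHH seven times out of eight, since HHH can only come first if it appears on the very first three flips.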
This market resolves to Yes if an LLM can reliably and coherently answer these types of problems before the end of 2023. Solving only Penney's game will resolve to No, as that problem is likely present in any reasonable training set.
Rewrites of the questions that introduce no new information are allowed. Prompt engineering that introduces no new information is also allowed.