Will GPT-5 be able to solve A::B system puzzles consistently?

Twitter user VictorTaelin tweeted the following: (original post)

"A simple puzzle GPTs will NEVER solve: As a good programmer, I like isolating issues in the simplest form. So, whenever you find yourself trying to explain why GPTs will never reach AGI - just show them this prompt. It is a braindead question that most children should be able to read, learn and solve in a minute; yet, all existing AIs fail miserably. Try it! It is also a great proof that GPTs have 0 reasoning capabilities outside of their training set, and that they'll will never develop new science. After all, if the average 15yo destroys you in any given intellectual task, I won't put much faith in you solving cancer. Before burning 7 trillions to train a GPT, remember: it will still not be able to solve this task. Maybe it is time to look for new algorithms."

The tweet contained an image with the following prompt:

"A::B is a system with 4 tokens: A#, #A, B# and #B.

An A::B program is a sequence of tokens. Example:

B# A# #B #A B#

To compute a program, we must rewrite neighbor tokens, using the rules:

A# #A ... becomes ... nothing

A# #B ... becomes ... #B A#

B# #A ... becomes ... #A B#

B# #B ... becomes ... nothing

In other words, whenever two neighbor tokens have their '#' facing each-other,

they must be rewritten according to the corresponding rule. For example, the

first example shown here is computed as:

B# A# #B #A B# =

B# #B A# #A B# =

A# #A B# =

B#

The steps were:

1. We replaced A# #B by #B A#.

2. We replaced B# #B by nothing.

3. We replaced A# #A by nothing.

The final result was just B#.

Now, consider the following program:

A# B# B# #A B# #A #B

Fully compute it, step by step."

(The original post has tabs which I couldn't get working here)
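For reference, the rewrite rules in the prompt can be sketched as a small Python reducer. This is only an illustration, not part of the original tweet or the market rules: the function names `step` and `reduce_program` are hypothetical, and the choice to always rewrite the leftmost matching pair is an assumption (the prompt does not fix an order).

```python
def step(tokens):
    """Apply one rewrite to the leftmost pair whose '#' face each other.

    Returns the new token list, or None if no rule applies.
    """
    for i in range(len(tokens) - 1):
        left, right = tokens[i], tokens[i + 1]
        # A pair matches when the left token ends with '#' (A#, B#)
        # and the right token starts with '#' (#A, #B).
        if left.endswith("#") and right.startswith("#"):
            if left[0] == right[1]:
                # A# #A or B# #B: both tokens disappear.
                return tokens[:i] + tokens[i + 2:]
            # A# #B or B# #A: the two tokens swap places.
            return tokens[:i] + [right, left] + tokens[i + 2:]
    return None


def reduce_program(program):
    """Return every intermediate program, from the input to the final result."""
    tokens = program.split()
    trace = [" ".join(tokens)]
    while (nxt := step(tokens)) is not None:
        tokens = nxt
        trace.append(" ".join(tokens))
    return trace


if __name__ == "__main__":
    for line in reduce_program("A# B# B# #A B# #A #B"):
        print(line)
```

On the worked example from the prompt, this sketch reproduces the three steps shown there and ends at `B#`. On the puzzle `A# B# B# #A B# #A #B`, it ends at `#A B# B#`.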

Resolution criterion (important details)

This market will resolve YES if GPT-5 can solve these kinds of problems with good consistency (as judged by me) and NO if it can't. In the end, all of the details will be decided using my best judgement.

Some important details:

To count, the puzzles have to be sufficiently long (at least 20 in-game tokens).

The given prompt has to be identical to the image in the original tweet, except for the line after "Now, consider the following program".

GPT-5 is not allowed to use external tools (what counts as an external tool is decided by my best judgement). For example, it is not allowed to write a Python program and run it with the code interpreter. If GPT-5 has a built-in code interpreter (or something equivalent) that can't be turned off, the market will resolve as N/A.

The model has to be named GPT-5 (or something very similar; this will again be decided by my best judgement). If OpenAI doesn't release a model called GPT-5 before 2030, the market will resolve as N/A.
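As an illustration of the length requirement, test programs of at least 20 tokens could be generated along these lines. This is a sketch under assumptions: the market does not say how puzzles will be chosen, so uniform random sampling, the seed parameter, and the names `random_program` and `is_normal_form` are all hypothetical.

```python
import random

# The four tokens of the A::B system, as listed in the prompt.
TOKENS = ["A#", "#A", "B#", "#B"]


def random_program(length=20, seed=None):
    """Build a random A::B program with `length` tokens (assumed: uniform sampling)."""
    rng = random.Random(seed)
    return " ".join(rng.choice(TOKENS) for _ in range(length))


def is_normal_form(program):
    """True if no two neighbor tokens have their '#' facing each other.

    A fully computed answer must pass this check, but passing it does not
    prove the answer is the correct result for a given puzzle.
    """
    tokens = program.split()
    return not any(
        left.endswith("#") and right.startswith("#")
        for left, right in zip(tokens, tokens[1:])
    )


if __name__ == "__main__":
    print(random_program(20, seed=0))
```

The generated line would replace only the program after "Now, consider the following program", leaving the rest of the prompt unchanged.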
