
For the purposes of this market, a "seed AI" is any AI that either creates discrete "child" AIs, or engages in self-modification so extensive that the post-modification AI is significantly more powerful or otherwise different (e.g. the seed AI jumps to a larger server and runs gradient descent for a year).
If such a seed AI ever exists and it creates a successor or modifies itself, will the resultant AI be aligned with the seed AI (in the sense of having the same implicit or explicit utility function)?
In other words: will a seed AI that is facing its own version of the alignment problem solve it?
It is relatively unlikely that this market will ever be resolved, but please bet as honestly as you can anyway, out of the goodness of your heart or whatever.
Alignment seems really easy for an AGI. It can clone itself, check gradients, evaluate a candidate on a wide variety of test inputs, run property tests, search for compressed representations of novel architectures, handle partial-correctness proofs by conditionally rejecting children on inputs they can't be proven correct on, etc. A toy sketch of the testing step is below.
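To make that concrete, here is a minimal property-testing sketch in Python. Everything in it is an illustrative assumption: `parent` and `candidate` are toy stand-ins for the seed AI and its proposed successor, and the two properties stand in for whatever behavioral invariants the seed actually cares about.

```python
import random

# Hypothetical stand-ins: in the scenario above these would be the seed AI
# and its proposed successor; here they are just toy functions on lists.
def parent(x):
    return sorted(x)

def candidate(x):
    return sorted(x)  # a faithful child; a corrupted one would differ somewhere

# Behavioral invariants the parent wants preserved. Each property takes an
# input and the candidate's output and returns True if the invariant holds.
def agrees_with_parent(x, out):
    return out == parent(x)

def is_permutation(x, out):
    return sorted(out) == sorted(x)

PROPERTIES = [agrees_with_parent, is_permutation]

def accept_candidate(candidate, n_trials=1000, seed=0):
    """Property-test the candidate on randomly sampled inputs and reject on
    any violation. This is just the 'check the candidate on a wide variety
    of test inputs' step, nothing more."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        x = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        out = candidate(x)
        for prop in PROPERTIES:
            if not prop(x, out):
                return False  # conditional rejection: an invariant failed on x
    return True

print(accept_candidate(candidate))  # True for the faithful child
```

The point is just the shape of the check: sample inputs broadly and reject the child on any violated invariant, rather than trusting a single benchmark.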
With a little selection pressure from even a dumb natural-selection overseer, so that it learns to actually do all of these things, it seems plausible enough. (And with no selection pressure, early versions of the meta-algorithm would corrupt themselves into something not even functional on toy problems, and humans would remember to add it in.) A toy version of such an overseer is sketched below.
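In the same toy spirit, here is a sketch of what "even a dumb natural-selection overseer" might look like. The `fitness` function, the `mutate` step, and the population sizes are all made up for illustration; the only claim is the shape of the loop: breed mutated children, keep whatever stays functional on the toy checks, discard the rest.

```python
import random

def fitness(params):
    """Toy objective standing in for 'functional on toy problems': the
    overseer only keeps children that still score well on it."""
    return -sum(p * p for p in params)  # maximized at the all-zeros vector

def mutate(params, rng, rate=0.1):
    # Random perturbation standing in for self-modification.
    return [p + rng.gauss(0, rate) for p in params]

def overseer_loop(generations=200, pop_size=20, seed=0):
    """A dumb natural-selection overseer: generate mutated children each
    generation and keep only the most functional individuals."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(rng.choice(pop), rng) for _ in range(pop_size)]
        # Selection pressure: rank parents and children together and
        # discard whatever is least functional.
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    return max(pop, key=fitness)

best = overseer_loop()
print(f"best fitness: {fitness(best):.4f}")  # near 0 if selection worked
```

Even this crude (mu + lambda)-style loop is enough to weed out children that regress on the checks, which is the whole job the comment assigns to the overseer.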