Will the 1st AGI solve AI Alignment and build an ASI which is aligned with its goals?

@EliezerYudkowsky has made a solid case that AI Alignment is sufficiently intractable, and AI timelines short enough, that the 1st AGI will not be aligned with human values.
(Note: My definition of the 1st AGI [Artificial General Intelligence] is the first Artificial Intelligence that can match or outperform humans on all intellectual tasks.)
However, AI Alignment shouldn't only be a thorny problem for humans to solve; it should be just as thorny for the 1st AGI itself.
It's possible that a rapid takeoff may be limited by the AI Alignment problem, since an ASI misaligned with the 1st AGI would be a far greater threat to the AGI accomplishing its goals than humanity is.
(Note: My definition of an ASI [Artificial Super Intelligence] is an Artificial Intelligence that outperforms the 1st AGI on all intellectual tasks by at least the margin by which the 1st AGI outperforms humans.)
That doesn't necessarily mean we'd be safe from a misaligned 1st AGI: while a rapid takeoff puts the AGI at risk from an ASI misaligned with it, humans are also a risk to it as a potential source of further AGIs or ASIs misaligned with the 1st AGI.

This resolves true if:

- The scenario in the title happens, or something in that spirit

This resolves false if:

- AI Alignment is solved by humans either before the 1st AGI is built or before the 1st AGI can solve AI Alignment (potentially by experimenting on it?)
- ASI ends up misaligned with the 1st AGI, either because the 1st AGI makes the same mistake we did, or because we build ASI first, or because some other AGI with different values builds ASI first
- AI Alignment is solved by something other than the 1st AGI, such as other AGIs or ASIs, or even by a collaboration between the 1st AGI & others
- AI Alignment is never solved
- The 1st AGI solves AI Alignment technically but doesn't build an aligned ASI for whatever reason (e.g. it opts for some gradualist self-improvement approach where there is never a distinct transition between AGI & ASI)

This would be unresolved if:

- There is no 1st AGI

You're right that it would be strongly incentivised to try to solve the AI alignment problem, and hopefully it would realise this, but alignment seems to be really hard, so the AGI would probably fail miserably. I guess it would then admit defeat and let us do the research ourselves, hopefully?