By 2028, will I believe that contemporary AIs are aligned (posing no existential risk)?

This resolution criterion may be refined as alignment research clarifies how to understand 'alignment'. On my 2023 understanding, this will resolve Yes if I believe with >95% credence that "with existing knowledge it is possible to train an aligned, state-of-the-art, general AI for less than 3x the cost of a state-of-the-art, general, unaligned AI trained that year". Otherwise this resolves No. If in 2028, I believe that no AI trained that year could engender a global catastrophe (even with worst-case training methods) because of inadequate capabilities, then this resolves N/A.

'Aligned' here means the AI poses no risk of global catastrophe (unless the AI's user(s) are such that they consider posing as a first query/task to the AI how to engender such a catastrophe). 'General' here need not mean AGI; it roughly means that the AI supports similarly economically impactful capabilities to the most general unaligned contemporary deployed AI. 'Existing knowledge' need not be public information.
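
For readers who find it easier to follow as logic, the resolution rule above can be sketched roughly as follows. This is only an illustration of the 2023 wording, not an official procedure: the function and parameter names are hypothetical, and checking the N/A condition before the credence threshold is an assumption, since the description does not state which takes precedence.

```python
from enum import Enum


class Resolution(Enum):
    YES = "Yes"
    NO = "No"
    NA = "N/A"


def resolve(credence: float, catastrophe_possible: bool,
            threshold: float = 0.95) -> Resolution:
    """Sketch of the resolution rule (hypothetical names throughout).

    credence: the resolver's 2028 credence that an aligned, state-of-the-art,
        general AI can be trained for less than 3x the cost of an unaligned one.
    catastrophe_possible: whether the resolver believes some AI trained that
        year could engender a global catastrophe.
    """
    if not 0.0 <= credence <= 1.0:
        raise ValueError("credence must be between 0 and 1")
    # Assumption: the N/A condition is checked first.
    if not catastrophe_possible:
        return Resolution.NA
    # The description says ">95%", so the comparison is strict.
    return Resolution.YES if credence > threshold else Resolution.NO
```

Under this reading, a credence of exactly 95% would resolve No, because the threshold is stated as strictly greater than 95%.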


AI doomerism as completely anti-empirical religion not even being denied 🤔

They just... tweeted it out

predicts NO

@Gigacasting He is, um, how to put this, very right-wing, so of course he is going to reach for "religion" or something like it as a possible solution. As a liberal atheist, I wouldn't, and I'm sure many of my fellow doomers are liberal atheists too, so we wouldn't either.

Indeed, it's kind of foolish for him to talk about religion when humanity already invented a substitute for theologians - philosophers! We just need to take them more seriously! Which, in turn, means they need to engage more with the most important issue facing humanity - AI doom!

If it's possible to create a general AI with capabilities comparable to what's necessary to cause a global catastrophe, why would it not be possible to fine-tune it to actually cause a global catastrophe for minimal additional cost? I have a hard time seeing how it wouldn't take less than 3× the cost to do so.

@MichaelChen NVM misread the problem statement, please disregard this.

predicts NO

I've updated the question statement to properly reflect the question description. In particular, the description's operationalization implies a Yes resolution if in 2028 there are economically transformative AIs which pose no existential risk, but alignment has not been solved for vastly superhuman AI. Another important case is one in which certain kinds of AIs, e.g. LM+limited fine-tuning models, are found to pose no existential risk, and these aligned AIs can perform competitively with unaligned AIs. Under that scenario I will also resolve this question as Yes.

predicts NO

@JacobPfau Another way of phrasing what I'm trying to get at with this question is: "Will early AGIs / TAIs be aligned?" This being distinct from "In 2028, will alignment be fully solved for the limiting case of intelligence?"
