@Bayesian The kind of progress that would make me expect to survive the advent of superintelligence. Breakthroughs in Agent Foundations preferably, or some sort of robustly generalizable empirical result in steerability or moral consistency or restraint or something.
By far the most significant progress has been in Interpretability, which is arguably not really AI Safety, and has serious limitations anyway.
@Haiku I'm curious: How much would such breakthroughs move your P(doom)?
My thinking: Let's say we have an ASI which is corrigible and does CEV. I'm not sure whether that would suffice to push my P(doom) below 1%. But it's pretty unlikely we'd be able to steer the world towards that outcome. A somewhat realistic "I never would have expected we'd get so lucky" scenario would be if we nailed the theory for those breakthroughs before someone/something creates an ASI. But those breakthroughs actually ending up in the first ASI? That'll probably be way more difficult than creating AGI without them. Even a best-case scenario concerning steerability, agent foundations, corrigibility, CEV, etc. would still require unprecedented global coordination in order for the theory to actually be implemented in time. I don't see any theoretical breakthroughs bringing my P(doom) down to, say, 10%.
@Primer My thinking exactly. In most cases, the coordination problem has to be solved in order for solving the technical alignment problem to be meaningful.
The "One Neat Trick that doctors don't want you to know about" is that if you solve the coordination problem, you don't have to solve the technical alignment problem. At least not right away, because then you can coordinate to just not build the damn thing.
I think whether or not we "solve alignment," we are going to need a global treaty. I have been spending a significant amount of my time, money, and effort toward that end, largely through PauseAI.