How well are things going with AI safety?
resolved May 24
Meh
Kinda badly
Pretty badly
Very badly
Kinda well
Very well
Well enough


No fundamental progress and I still expect to be dead by 2030, so....

@Haiku what do you mean by fundamental progress?

@Bayesian The kind of progress that would make me expect to survive the advent of superintelligence. Breakthroughs in Agent Foundations preferably, or some sort of robustly generalizable empirical result in steerability or moral consistency or restraint or something.

By far the most significant progress has been in Interpretability, which is arguably not really AI Safety, and has serious limitations anyway.

@Haiku I'm curious: How much would such breakthroughs move your P(doom)?

My thinking: Let's say we have an ASI which is corrigible and does CEV. I'm not sure whether that would suffice to push my P(doom) below 1%. But it's pretty unlikely we'd be able to steer the world towards that outcome. A somewhat realistic "I never would have expected we'd get so lucky" scenario would be if we nailed the theory for those breakthroughs before someone/something creates an ASI. But those breakthroughs actually ending up in the first ASI? That'll probably be way more difficult than creating AGI without them. Even a best-case scenario concerning steerability, agent foundations, corrigibility, CEV, etc. would still require unprecedented global coordination in order for the theory to actually be implemented in time. I don't see any theoretical breakthroughs bringing my P(doom) down to, say, 10%.

@Primer My thinking exactly. In most cases, the coordination problem has to be solved in order for solving the technical alignment problem to be meaningful.

The "One Neat Trick that doctors don't want you to know about" is that if you solve the coordination problem, you don't have to solve the technical alignment problem. At least not right away, because then you can coordinate to just not build the damn thing.

I think whether or not we "solve alignment," we are going to need a global treaty. I have been spending a significant amount of my time, money, and effort toward that end, largely through PauseAI.

IMO it's going pretty well compared to how it would be going if AI progress had taken the optimizer route, but modern LLMs are barely agents, so it's hard to make any real safety progress with those systems.

imagine how much worse it would be if we didn't have specs and just had user preference tuning

What do I vote if I think AI safety is going relatively poorly but I also think p(doom) is fairly low?

@jim "pretty badly", it's all compared to the averaged expected timeline
