
Note that QACI is not intended to be a full alignment plan, merely a plan for a formal goal which produces nice things when maximized.
An AI which takes QACI as input and maximizes it is also required for a full alignment plan.
The prior on something being a viable alignment plan is quite low, and I suspect that QACI in particular runs into the problem of being impossible to do in full while not having good approximations.
@KatjaGrace I would currently bet yes at 50% on "succeeds at creating aligned AI sufficient to produce utopia with no further design work". The only other candidate I'd do that with is the one I sketched in my market about what a promising alignment plan looks like. QACI is not quite ready to use, though; it's possible an invalidating counterexample will be found that breaks the whole thing, but right now it seems like it nails several of the hard alignment components while also getting soft alignment close to right.
@L More theoretical work is needed to actually flesh it out into concrete steps, but as someone who has been a deep learning nut for a long time, this is the first time a MIRI-style plan has looked exciting to me. (It took me quite a while to be convinced it didn't require actually simulating the universe from the beginning, though.)
My main issue with it is the risk of invoking the teleporter problem, but I think we can fix that without departing from QACI. A well-designed QACI implementation shouldn't, in my opinion, actually need a strong pivotal act; weak/local pivotal acts should do.
@KatjaGrace "Viable" means that it will succeed in creating aligned AI, or that it will be judged to have a meaningful chance of doing so in counterfactuals where it is attempted.