ARC

$4,454 raised
36 supporters
https://alignment.org/

ARC is a non-profit research organization whose mission is to align future machine learning systems with human interests. Its current work focuses on developing an alignment strategy that could be adopted in industry today while scaling gracefully to future ML systems. Right now Paul Christiano and Mark Xu are researchers and Kyle Scott handles operations.

What is “alignment”? ML systems can exhibit goal-directed behavior, but it is difficult to understand or control what they are “trying” to do. Powerful models could cause harm if they were trying to manipulate and deceive humans. The goal of intent alignment is to instead train these models to be helpful and honest.

Motivation: We believe that modern ML techniques would lead to severe misalignment if scaled up to large enough computers and datasets. Practitioners may be able to adapt before these failures have catastrophic consequences, but we could reduce the risk by adopting scalable methods further in advance.

What we’re working on: The best way to understand our research priorities and methodology is probably to read our report on Eliciting Latent Knowledge. At a high level, we’re trying to figure out how to train ML systems to answer questions by straightforwardly “translating” their beliefs into natural language rather than by reasoning about what a human wants to hear.

Methodology: We’re unsatisfied with an algorithm if we can see any plausible story about how it eventually breaks down, which means that we can rule out most algorithms on paper without ever implementing them. The cost of this approach is that it may completely miss strategies that exploit important structure in realistic ML models; the benefit is that you can consider lots of ideas quickly.

Future plans: We expect to focus on similar theoretical problems in alignment until we either become more pessimistic about tractability or ARC grows enough to branch out into other areas. Over the long term we are likely to work on a combination of theoretical and empirical alignment research, collaborations with industry labs, alignment forecasting, and ML deployment policy.

donated $750 · 2 months ago
donated $5 · 10 months ago
donated $29 · a year ago
donated $200 · a year ago
donated $5 · a year ago
donated $50 · a year ago
donated $10 · a year ago
donated $50 · a year ago
donated $1,715 · a year ago
donated $100 · a year ago
donated $1 · a year ago
donated $18 · a year ago
donated $1 · a year ago
donated $15 · a year ago
donated $1 · a year ago
donated $1 · a year ago
donated $1 · a year ago
donated $3 · a year ago
donated $7 · a year ago
donated $190 · a year ago
donated $400 · a year ago
donated $159 · a year ago
donated $100 · a year ago
donated $50 · a year ago
donated $44 · a year ago
donated $1 · a year ago
donated $100 · a year ago
donated $100 · a year ago
donated $180 · a year ago
donated $5 · 2 years ago
donated $2 · 2 years ago
donated $58 · 2 years ago
donated $38 · 2 years ago
donated $2 · 2 years ago
donated $0 · 2 years ago
donated $8 · 2 years ago
donated $7 · 2 years ago
donated $12 · 2 years ago
donated $2 · 2 years ago
donated $12 · 2 years ago
donated $1 · 2 years ago
donated $10 · 2 years ago
donated $2 · 2 years ago
donated $1 · 3 years ago
donated $0 · 3 years ago
donated $1 · 3 years ago
donated $0 · 3 years ago
donated $1 · 3 years ago
