
What alignment proposals and research directions will I be excited about by the end of 2023?
Ṁ353 · resolved Jan 1
Answer | Probability |
---|---|
Infra-bayesianism | 72% |
Other | 13% |
Outsourcing alignment of AI to other AI | 0.7% |
Reinforcement Learning from Human Feedback (RLHF) | 0.3% |
Transparency tools | 0.3% |
Imitative amplification | 0.3% |
Intermittent oversight | 0.3% |
Relaxed adversarial training | 0.3% |
Approval-based amplification | 0.3% |
Microscope AI | 0.3% |
STEM AI | 1.1% |
Narrow reward modeling | 0.3% |
Recursive reward modeling | 0.4% |
AI safety via debate with transparency tools | 0.4% |
Amplification with auxiliary RL objective | 0.4% |
Shard theory mechanistic interpretability | 0.4% |
 | 1.8% |
Hodgepodge alignment | 6% |
Cyborgism | 0.4% |
This question is managed and resolved by Manifold.
🏅 Top traders
# | Name | Total profit |
---|---|---|
1 | | Ṁ60 |
2 | | Ṁ14 |
Related questions
By the end of 2025, which piece of advice will I feel has had the most positive impact on me becoming an effective AI alignment researcher?
Will I think that alignment is no longer "preparadigmatic" by the start of 2026?
18% chance
Will some piece of AI capabilities research done in 2023 or after be net-positive for AI alignment research?
81% chance
Will there be more alignmentforum posts from 2025 than 2024?
55% chance
Will taking annual MRIs of the smartest alignment researchers turn out alignment-relevant by 2033?
7% chance
Will >= 1 alignment researcher/paper cite "maximum diffusion reinforcement learning" as alignment-relevant in 2025?
19% chance
Will we solve AI alignment by 2026?
4% chance
Will there exist a compelling demonstration of deceptive alignment by 2026?
70% chance
Will xAI significantly rework their alignment plan by the start of 2026?
32% chance
Will I have a research position at Anthropic (Research Engineer included) by the end of 2025?
13% chance