What % of alignment forum karma will be pro-interpretability vs anti this year?
Ṁ872 · Sep 13 · 76% chance
On 2024/09/13 I will uniformly sample from all posts on the Alignment Forum published between 2023/09/13 and 2024/09/13 that express an opinion on whether prosaic interpretability is net useful for aligning future, dangerous AI, weighted by their karma. (So a post with 4 karma is 2 times more likely to get picked than one with 2 karma.)
If the sampled post contributes to prosaic interpretability or is in favor of past/future interpretability research, this question resolves to "yes".
I won't vote on this. I hope, but do not guarantee, to maintain an updated list of the posts I'll sample from, with their labels, somewhere here.
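The karma-weighted sampling described above can be sketched in a few lines. The post titles, karma values, and "yes"/"no" labels below are hypothetical placeholders, not the actual list of posts to be sampled:

```python
import random

# Hypothetical (title, karma, label) entries — illustrative only.
posts = [
    ("Pro-interpretability post A", 40, "yes"),
    ("Anti-interpretability post B", 20, "no"),
    ("Pro-interpretability post C", 20, "yes"),
]

titles = [title for title, _, _ in posts]
karma = [k for _, k, _ in posts]

# Karma-weighted draw: a post with twice the karma is twice as likely
# to be picked, matching the resolution procedure described above.
sampled = random.choices(titles, weights=karma, k=1)[0]
print(sampled)

# The implied resolution probability is the karma share of "yes" posts.
yes_karma = sum(k for _, k, label in posts if label == "yes")
print(yes_karma / sum(karma))  # 60 / 80 = 0.75 for this toy list
```

With this toy list, the market's fair price would be 75%, since pro-interpretability posts hold 60 of the 80 total karma.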
Related questions
In 5 years will I think the org Conjecture was net good for alignment?
57% chance
Will Tetra make an alignment-focused LessWrong post that she is proud of by the end of 2024?
41% chance
Will Inner or Outer AI alignment be considered "mostly solved" first?
Will "Alignment Implications of LLM Successes: a De..." make the top fifty posts in LessWrong's 2023 Annual Review?
37% chance
Will "Without fundamental advances, misalignment an..." make the top fifty posts in LessWrong's 2024 Annual Review?
46% chance
Will "Tips for Empirical Alignment Research" make the top fifty posts in LessWrong's 2024 Annual Review?
24% chance
Will "AI alignment researchers don't (seem to) stack" make the top fifty posts in LessWrong's 2023 Annual Review?
33% chance
Will "What I mean by "alignment is in large part ab..." make the top fifty posts in LessWrong's 2023 Annual Review?
14% chance
Will "The self-unalignment problem" make the top fifty posts in LessWrong's 2023 Annual Review?
13% chance
Will "Nobody’s on the ball on AGI alignment" make the top fifty posts in LessWrong's 2023 Annual Review?
15% chance