Will "LoRA Fine-tuning Efficiently Undoes Safety Tr..." make the top fifty posts in LessWrong's 2023 Annual Review?
14% chance
As part of LessWrong's Annual Review, the community nominates, reviews, and votes on the most valuable posts. Posts become reviewable once they have been up for at least 12 months, and the 2023 Review resolves in February 2025.
This market will resolve to 100% if the post "LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B" is one of the top fifty posts of the 2023 Review, and to 0% otherwise. The market was initialized to 14%.
This question is managed and resolved by Manifold.
Related questions
Will "Lessons On How To Get Things Right On The Fir..." make the top fifty posts in LessWrong's 2023 Annual Review?
55% chance
Will "Fact Finding: Attempting to Reverse-Engineer ..." make the top fifty posts in LessWrong's 2023 Annual Review?
53% chance
Will "Towards Developmental Interpretability" make the top fifty posts in LessWrong's 2023 Annual Review?
69% chance
Will "Noting an error in Inadequate Equilibria" make the top fifty posts in LessWrong's 2023 Annual Review?
23% chance
Will "Against LLM Reductionism" make the top fifty posts in LessWrong's 2023 Annual Review?
21% chance
Will "Without fundamental advances, misalignment an..." make the top fifty posts in LessWrong's 2024 Annual Review?
46% chance
Will "Model, Care, Execution" make the top fifty posts in LessWrong's 2023 Annual Review?
22% chance
Will "AI Control: Improving Safety Despite Intentio..." make the top fifty posts in LessWrong's 2023 Annual Review?
86% chance
Will "A rough and incomplete review of some of John..." make the top fifty posts in LessWrong's 2023 Annual Review?
24% chance
Will "The Dial of Progress" make the top fifty posts in LessWrong's 2023 Annual Review?
19% chance