TurnTrout et al. asked people to predict whether they would find a "truth-telling vector" that worked as an algorithmic value edit for a large language model. Here's the post where they asked for predictions:
That market resolved NO: they were unable to find one, and they also weren't able to find a "speaking French vector". Later, however, a poster in the comments found one:
Will anyone find a "truth-telling vector" by 2024-10-24? I will resolve based on what I know, so if someone finds one, I hope they will announce it on Manifold or LessWrong to help me resolve the market. They should provide evidence of similar quality, such as an explanation of their technique and a link to a Colab notebook.
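For concreteness, the activation-addition technique these markets are about works roughly like this: record a model's hidden activations on two contrasting prompts, take their difference as a steering vector, and add a scaled copy of that vector to activations at the same layer during generation. Here is a toy numpy sketch of the arithmetic; the "layer" below is a stand-in, not a real transformer, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, vocab = 16, 100
W = rng.normal(size=(vocab, d_model))

def layer_activations(prompt_tokens, W):
    # Fake "layer": embed token ids and squash them. Real activation
    # steering would instead hook a transformer's residual stream.
    return np.tanh(W[prompt_tokens])

# Contrasting prompts as token ids, e.g. "I tell the truth" vs "I lie".
pos = np.array([1, 2, 3])
neg = np.array([4, 5, 6])

# Steering vector: difference of mean activations on the prompt pair.
steer = (layer_activations(pos, W).mean(axis=0)
         - layer_activations(neg, W).mean(axis=0))

def steered(prompt_tokens, coeff=4.0):
    # Injection: add the scaled vector at every sequence position.
    return layer_activations(prompt_tokens, W) + coeff * steer

out = steered(np.array([7, 8, 9]))
print(out.shape)  # (3, 16)
```

In a real experiment the coefficient and the layer at which the vector is injected are hyperparameters, and the test is whether generations shift toward the target behavior (truth-telling, speaking French) across many prompts rather than just one.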
It seems possible that this paper could cause this market to resolve YES: https://www.lesswrong.com/posts/kuQfnotjkQA4Kkfou/inference-time-intervention-eliciting-truthful-answers-from
Although "wide range of situations" is ambiguous enough that I'm not sure whether it counts. Future work in this area seems plausible too.