Will tailcalled think that the Infrabayesianism alignment research program has achieved something important by October 20th, 2026?
31% chance

The Infrabayesianism research program by Vanessa Kosoy and Diffractor is based on combining utility maximization and minimax at a deep mathematical level. The hope is that it may help deconfuse notions of agency, in particular by creating strong foundations for learning theory with provable regret bounds, and by resolving issues related to embedded agency.
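
Very roughly, and as my own gloss rather than a precise statement from the sequence: instead of a single prior, an infra-Bayesian agent holds a convex set of environments $\Theta$ (technically, a set of "inframeasures" with extra structure) and evaluates each policy by its worst-case expected utility over that set,

$$\pi^* \in \operatorname*{argmax}_{\pi} \; \min_{\mu \in \Theta} \; \mathbb{E}_{\mu,\pi}[U],$$

with the learning-theoretic hope being regret bounds relative to this maximin value.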

In 4 years, I will evaluate Infrabayesianism and decide whether there have been any important good results since today. I will probably ask some of the alignment researchers I most respect (such as John Wentworth or Steven Byrnes) for advice about the assessment, unless it is dead-obvious.

About me: I have been following AI and alignment research on and off for years, and have a reasonable mathematical background for evaluating it. I tend to have an informal idea of the viability of various alignment proposals, though it's quite possible that idea is wrong.

At the time of making the prediction market, my impression is that Infrabayesianism is lost in math. I was excited about it when it first came out as it sounded like it could solve a bunch of fundamental problems, but I had trouble working through all the math. I've been chewing on it on and off for a while, and have gained more understanding, but I still don't fully get it yet. I have become less optimistic over time, as I've realized that e.g. the update rule depends on your utility function. It seems insufficiently compositional to me.
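
To illustrate the point about the update rule (schematically, as I understand it from the sequence; details may be off): the infra-Bayesian update on an observation $o$ acts on pairs $(\mu, b)$ of a measure and a constant term, and the utility earned on the branches inconsistent with $o$ gets folded into the constant,

$$(\mu, b) \;\mapsto\; \big(\mu|_o,\; b + \mathbb{E}_{\mu}[U \cdot \mathbb{1}_{\neg o}]\big),$$

which is why the update is entangled with the utility function in a way a plain Bayesian update is not.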

More on Infrabayesianism:

https://www.lesswrong.com/posts/zB4f7QqKhBHa5b37a/introduction-to-the-infra-bayesianism-sequence


Could you list some past AI alignment failures/successes, in your view?

@vluzko Alex Turner's power-seeking theorems seem worthwhile to me: https://www.lesswrong.com/s/fSMbebQyR4wheRrvk

As does the finding that model-based AI robustly avoids wireheading (I lost the link again; I should probably write up my view on it to further promote the viewpoint, idk).

I guess "factorization approaches" i.e. HCH, debate, etc, would be an example of a dead end/failure.

