The Natural Abstractions research program by John Wentworth is based on the idea that our world might be well described by simple concepts which essentially any method of cognition will converge upon. The hope is that by studying these natural abstractions and the cognitive algorithms that use them, we might produce improved interpretability tools.
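As a toy illustration of the convergence idea (my own sketch, not from Wentworth's posts; the data and all hyperparameters are made up): two quite different learners, PCA via SVD and a linear autoencoder trained by gradient descent, end up recovering essentially the same low-dimensional summary of the same data.

```python
# Toy sketch (my own construction, not from Wentworth's writing): two very
# different learning procedures extract the same low-dimensional "abstraction"
# from noisy low-rank data.
import numpy as np

rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 2))            # 2 hidden "macro" variables
X = latent @ rng.normal(size=(2, 20))          # mixed into 20 observables
X += 0.1 * rng.normal(size=X.shape)            # plus observation noise
X -= X.mean(axis=0)

# Learner 1: PCA via SVD.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
pca_basis = Vt[:2].T                           # 20 x 2 orthonormal basis

# Learner 2: linear autoencoder (encoder A, decoder B), plain gradient descent.
A = rng.normal(scale=0.1, size=(20, 2))
B = rng.normal(scale=0.1, size=(2, 20))
lr = 0.01
for _ in range(5000):
    E = X @ A @ B - X                          # reconstruction error
    A -= lr * (X.T @ E @ B.T) / len(X)         # gradient of mean squared error
    B -= lr * (A.T @ X.T @ E) / len(X)         # (constant factor folded into lr)

# Compare the two learned subspaces: cosines of the principal angles near 1
# mean both learners converged on the same abstraction.
ae_basis, _ = np.linalg.qr(B.T)                # orthonormalize the decoder's span
print(np.linalg.svd(pca_basis.T @ ae_basis, compute_uv=False))
```

Of course, linear subspace recovery is a much weaker claim than the program makes; the sketch only gestures at what "different learners converge on the same abstraction" means.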
In 4 years, I will evaluate Natural Abstractions and decide whether there have been any important results since market creation. Unless the answer is dead-obvious, I will probably ask some of the alignment researchers I most respect (such as John Wentworth or Steven Byrnes) for advice on the assessment.
About me: I have been following AI and alignment research on and off for years, and I have a reasonable mathematical background for evaluating it. I tend to have an informal sense of the viability of various alignment proposals, though it's quite possible that sense is wrong.
At the time of creating this market, my impression is that the Natural Abstractions research program has slowed down or gotten stuck, with no genuine news for maybe half a year. I was excited about the program when it started, but I have come to believe that we would probably need structural changes to networks in order to end up with truly extractable abstractions, which seems to contradict the "naturality" requirement of natural abstractions.
More on Natural Abstractions:
I might do (on NO) 15k mana at 38% or 50k mana at 50% if anyone's interested! I'd probably want to talk to tailcalled briefly about their current feelings on this before doing that, though.
@weissz Renormalization group flows require that one knows the underlying mechanism, I think, which isn't going to be the case for most natural abstractions.
@tailcalled Perhaps the natural adaptation would be something akin to learning the distributions / forms of the micro-dynamics from the data (up to noise), and letting that induce the macro / low-dimensional dynamics. I wonder if such an approach would be more resistant to overfitting (in the spirit of "grokking", https://arxiv.org/abs/2201.02177). Spitballing, don't give my thoughts here too much weight.
@weissz That sounds like a reasonable adaptation of renormalization group flows, but natural abstractions are also supposed to work for ML that only learns the macro behavior. (That's what the "natural" in "natural abstractions" means: it should work no matter what learning algorithm you use.)
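To illustrate the distinction with a toy sketch (my own construction, not from this thread or from Wentworth's work; lattice size, block size, and burn-in are arbitrary choices): below, a known micro-dynamic is simulated, but the learner only ever sees block-averaged macro observations and still recovers the macro behavior by least squares.

```python
# Toy sketch (my own construction): fit macro dynamics directly from
# coarse-grained observations, without giving the learner the micro rule.
import numpy as np

rng = np.random.default_rng(1)
N, BLOCK, T, BURN = 64, 8, 500, 50       # micro sites, block size, steps, burn-in

def micro_step(x, d=0.4):
    # Micro rule (hidden from the learner): discrete diffusion on a ring.
    return x + d * (np.roll(x, 1) + np.roll(x, -1) - 2 * x)

def coarse(x):
    # RG-style coarse-graining: block averaging down to N // BLOCK variables.
    return x.reshape(-1, BLOCK).mean(axis=1)

x = rng.normal(size=N)
macro = []
for _ in range(T):
    macro.append(coarse(x))
    x = micro_step(x)
macro = np.array(macro)[BURN:]           # drop transients from fast micro modes

# Fit a linear macro model m_{t+1} ≈ m_t @ K purely from coarse observations.
K, *_ = np.linalg.lstsq(macro[:-1], macro[1:], rcond=None)
err = np.abs(macro[:-1] @ K - macro[1:]).max()
print("max one-step macro prediction error:", err)   # small once the fast
                                                     # micro modes have decayed
```

The point is only that the macro model here is fit without any access to the micro rule; whether something like this works for realistic systems is exactly what's at issue.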
4k NO limit at 40%, and more limit orders going up from there. Just a guess.
Hmm... if John writes more posts in his agent foundations series on optimization at a distance that you feel achieve something important, but it turns out that studying these natural abstractions is still not the most tractable way to tackle the problem and things move in a different direction, would you resolve this positively? My general concern is that it will be hard to define what is part of an agenda and what isn't once researchers pivot.
@Tassilo I think pointing at a good direction to pivot also counts as an important achievement. However, if the pivot seems unrelated to the natural abstractions program, I might not count it. I would probably ask the researcher whether the program provided any important insights that caused the pivot.