By 2028 will we be able to identify distinct submodules/algorithms within LLMs?
76% chance

Roughly: will we be able to examine an LLM and extract an identifiable submodule that accomplishes an understandable task (e.g. "addition", "inference on some decision tree", or "quicksort")? For instance, it could be a set of neurons from layers L_1, ..., L_k that, when run on its own, executes the specified algorithm.
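
As a toy illustration of "a set of neurons that, when run on its own, executes the specified algorithm", here is a minimal sketch. Everything in it is an assumption made for illustration: the tiny hand-written two-layer ReLU network, the weights `W_IN` and `W_OUT`, and the names `forward` and `addition_submodule` are hypothetical and have nothing to do with any real LLM or interpretability library.

```python
from typing import List, Optional, Set

# Hypothetical toy network: 2 inputs -> 4 ReLU hidden units -> 2 outputs.
# Output 0 computes a + b, output 1 computes |a - b|.
W_IN = [
    [1.0, 1.0],    # unit 0: relu(a + b)
    [-1.0, -1.0],  # unit 1: relu(-(a + b))
    [1.0, -1.0],   # unit 2: relu(a - b)
    [-1.0, 1.0],   # unit 3: relu(b - a)
]
W_OUT = [
    [1.0, -1.0, 0.0, 0.0],  # output 0 = unit 0 - unit 1 = a + b
    [0.0, 0.0, 1.0, 1.0],   # output 1 = unit 2 + unit 3 = |a - b|
]


def relu(x: float) -> float:
    return max(0.0, x)


def forward(x: List[float], keep: Optional[Set[int]] = None) -> List[float]:
    """Run the toy network; if `keep` is given, zero every other hidden unit."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in W_IN]
    if keep is not None:
        hidden = [h if i in keep else 0.0 for i, h in enumerate(hidden)]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W_OUT]


# The candidate "submodule": hidden units 0 and 1, read out through output 0.
ADDITION_UNITS = {0, 1}


def addition_submodule(a: float, b: float) -> float:
    """Run only the candidate units and check that they still add."""
    return forward([a, b], keep=ADDITION_UNITS)[0]
```

In this sketch, `addition_submodule(2.0, 3.0)` uses only two of the four hidden units and still returns the sum, while the other output of the masked network goes to zero. A real result would need the analogous property to hold for neurons found inside a trained model rather than weights chosen by hand.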

It must also be demonstrated that the LLM actually uses the submodule in some interpretable way. For example, if the module implements quicksort, a demonstration might show that modifying the module to implement reversed quicksort causes the LLM to produce reverse-sorted data when asked for sorted data.
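
The intervention test described above can be sketched in a few lines. This is only a schematic under loud assumptions: `ToyModel`, its `submodules` dictionary, and `intervention_changes_behavior` are hypothetical stand-ins, and a real LLM does not expose named, swappable components like this.

```python
from typing import Callable, Dict, List

Submodule = Callable[[List[int]], List[int]]


class ToyModel:
    """Hypothetical stand-in for an LLM that routes work through named submodules."""

    def __init__(self) -> None:
        self.submodules: Dict[str, Submodule] = {"sort": lambda xs: sorted(xs)}

    def answer(self, prompt: str, data: List[int]) -> List[int]:
        # Assumption: prompts mentioning "sorted" are handled by the sort submodule.
        if "sorted" in prompt:
            return self.submodules["sort"](data)
        return list(data)


def intervention_changes_behavior(
    model: ToyModel,
    name: str,
    replacement: Submodule,
    prompt: str,
    data: List[int],
    expected_after: List[int],
) -> bool:
    """Swap in `replacement`, rerun the prompt, then restore the original submodule."""
    original = model.submodules[name]
    before = model.answer(prompt, data)
    model.submodules[name] = replacement
    try:
        after = model.answer(prompt, data)
    finally:
        model.submodules[name] = original
    # The submodule is "used" if the edit changes the output in the predicted way.
    return before != after and after == expected_after


model = ToyModel()
uses_sort_module = intervention_changes_behavior(
    model,
    "sort",
    lambda xs: sorted(xs, reverse=True),  # the "reversed quicksort" edit
    "Please return the sorted data",
    [3, 1, 2],
    expected_after=[3, 2, 1],
)
```

The point of the sketch is the shape of the evidence: the edit is made to the candidate submodule only, and the model's end-to-end output changes in exactly the way the edit predicts.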

The work must be done for an LLM at least as capable as OPT-66B.

The work must identify at least 10 submodules, or identify at least one while proving that no others exist.

If it turns out that the question is ill-posed in a way that can't be fixed with some minor tweaks, I'll resolve N/A.

Up until 2026 I may refine the criteria here, either in response to feedback from predictors or future research giving me a better way to ask the question.
