Will a real Mesa-Optimizer be found in a large AI model by the end of 2024?
Resolved N/A on Oct 1

Aug 29, 8:07pm: This resolves YES if there is general consensus on LessWrong or the Alignment Forum that a mesa-optimizer has been found, or if Conjecture (https://www.conjecture.dev/) announces its discovery.

predicted NO

Could you reopen the market?

predicted NO

Needs a definition and resolution criteria.

Great example of the AI-alignment grift: "mesa-optimizers" exist in every model out there; vision models start with edge detectors, then shapes, etc. up to full features; literally every model learns various patterns and/or sub-routines, such that this concept is either completely trivial or useless.

It was already obvious that large models have various objectives and sub-objectives that emerge naturally; the mere fact that LLMs are trained to 'predict the missing words' should be hint enough that these models learn to 'optimize' for lots of other things along the way.

As with every technology in the world, what really matters is the cost and distribution; studying 'alignment' is a philosophy-major dead-end; when silicon flops and compute start to exceed humans (~2045-2055), the thing that will matter is what that compute is being used for and why.

And the practical problems will be monitoring/deterrence and balance of power; not a single bit of current 'AI safety discourse' will ever be looked at again.

https://twitter.com/id_aa_carmack/status/1368255824192278529?lang=en

@Gigacasting I'd be interested in betting on this if you can think of an operationalization we'd both be comfortable with.