Neural Nets will have human-level situational awareness by the end of 2025.

1kṀ2145

Dec 31

60%

chance

ALL

Set criteria:

Understand that they're NNs, how their actions interface with the world.
Can explain the likely consequences of their actions

Inspired by tweet thread:

Link: https://twitter.com/RichardMCNgo/status/1640568775018975232?s=20

Technical AI Timelines

Get

1,000

to start trading!

People are also trading

Will artificial general intelligence be achieved they the end of 2025 ?

14% chance

Neural Nets will be able to robustly pursue a plan over multiple days better than the best human by the end of 2025

8% chance

Neural Nets will be better at typical manual labour tasks than humans by the end of 2025

4% chance

Neural Nets will beat any human on computer tasks a typical white-collar worker can do in 10 minutes by the end of 2025.

22% chance

Neural Nets will generate more scientific breakthroughs than humans by the end of 2025, including novel theorems

3% chance

Neural Nets will be able to write bestselling novels by the end of 2025

5% chance

Neural nets will out-perform AI researchers on peer review by end of 2025.

64% chance

Neural Nets will be able to design, code and distribute whole apps by the end of 2025.

9% chance

Neural Nets will generate at least 1 scientific breakthrough or novel theorem by the end of 2025

18% chance

Will AI be smarter than any one human probably around the end of 2025?

Sort by:

Any updated thoughts on how this will be operationalized? I'm not sure what tests we could apply here that they don't already obviously pass.

For this to resolve no, do we just have to find a few examples of prompts that consistently "trick" the AI in ways that humans wouldn't be tricked? If so, I actually feel this is very likely to resolve no.

But if they just have to understand that they're an LLM talking to a human through a chat interface it seems an obvious yes and we can resolve today.

@ChrisPrichard the resolution criteria is extremely ambiguous. I have no idea how this is going to resolve. Chatgpt can explain it's a neural network and the consequences of it's actions. Does it "understand" it? How will it be tested?

For the record, my object-level prediction on this is ~39%, but I'd put ~58% chance that Richard will see it as yes. Accounting for that and Nathan's perception of "community consensus," I'm betting at ~54%.

The scary kind of situational awareness is when a model uses situational knowledge to guide its outputs in a "semantics-agnostic" way. I.e. there's a spectrum between 'coherently talk about self' to 'act on self-knowledge in contexts not mentioning anything about self'. I wrote up an example of the spookier kind of situational awareness [here](https://www.lesswrong.com/posts/tJzdzGdTGrqFf9ekw/early-situational-awareness-and-its-implications-a-story), but I suspect it's very hard to come up with general criterion describing more things of this kind in advanced.

@JacobPfau @NathanpmYoung C.f. also Evan's discussion in this section. Testing for situational awareness would involve training the model on mentions of information relevant to its situation, and then verifying that it uses this information in very different settings.