Will there be a coherent AI safety movement with leaders and an agenda in May 2029?

79% chance

From: https://x.com/_ArnaudS_/status/1793209848953233872.

"Will 'AI safety' be considered a coherent movement with leaders and an agenda vs a descriptor for some cluster of skills/techniques that some within AI-focused companies are familiar with?"

Resolves YES if there is indeed a coherent movement with the agenda of AI Notkilleveryoneism. It must have at least one identifiable leader figure, and an agenda. It need not still have the name "AI Safety" but it must remain relevant.

Resolves NO if this is no longer the case, meaning that when people think of how we prevent AI from killing everyone, they think about technical techniques within AI companies, or similar.

Only the situation at resolution matters, not what happens before that.

If this market sustains sufficient interest I may attempt to clarify the criteria further, but for now I am going to leave it there in the interests of time.

It's interesting how things move so fast nowadays that the whole alignment field changes overnight:

https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html

I had thought from the beginning that the concern about "AI turning the world into spirals" was unreasonable because we simply would know very little about the real risks until we were close to developing models that would pose a risk.

But last year Anthropic showed that we live in a world where alignment is either trivial or impossible, not anywhere in the middle. And with this paper, it seems far more likely that alignment will be trivial rather than impossible.

At 75%, I'll double down on this market because I think that ultimately, there will not need to be a coordinated "safety movement" dedicated to protecting the world from "foom." The assumption has always been that alignment would be impossible, and as time goes on, the trend is clearly towards that not being the case. We've gone from models in the 2010s that would converge on hijacking reward functions to being able to explicitly dial back individual features like deceitfulness.

Instead, we'll probably just see typical government regulations like today where we prevent bad humans from doing bad things, with models just being another tool that people can use to try to kill others.

@SteveSokolowski Even with that model of alignment difficulty, I think a movement with the goal of hindering malicious or careless actors from creating or using powerful AI systems would fulfill the criteria. Though you might of course already include that in your probability estimate.

Is this currently the case? My guess would be that the current situation of the AI safety movement will not give a YES resolution under the current criteria, but I am unsure.

@harfe I would resolve this YES today.

@harfe I don't agree with @ZviMowshowitz's opinion on that.

I don't consider people like @EliezerYudkowsky to be serious, and he is basically the face of AI safety. There certainly is a "movement," but it isn't coherent and it is actually damaging AI safety.

When you call for datacenters to be destroyed with nuclear weapons, the average person thinks that you're insane. I agree; Yudkowsky has never walked back that comment, which is essentially genocide. The leaks from the effective altruism movement last November about their marketing spending to determine the most effective wording to scare people hurt too. And the refusal to acknowledge that effective altruism, when taken to its logical conclusion, results in what happened at OpenAI and FTX makes them appear even more out of touch. This is a perception battle that they lost.

I bet NO because I believe that what remains of the movement will continue to pursue unreasonable tactics that are out of step with the majority of the population's views, and will continue to marginalize itself. If, by contrast, a legitimate new organization led by calm people with reasonable, workable solutions appears (for example, a proposal to accelerate research on AGI and its alignment while constructing a large SCIF to house the software, without killing anyone or saying we should "pause" solutions that would help cure people who are suffering), then I think this market will resolve YES.

But I don't see that happening because the people who are the face of AI safety right now hold extreme and fringe views that the majority of the population views as going too far.

@ZviMowshowitz Who would you consider to be "at least one identifiable leader figure, and an agenda" today?

@NeelNanda If I had to name one person I would name Eliezer. If I was describing the agenda I would say 'restrictions or requirements placed upon the training and deployment of frontier models using sufficiently large compute.'

To avoid getting bogged down, I will consider further clarification if this market gets 10k in volume or more, but commit to not doing that until then short of a clear oops of some kind.