
Assumptions:
The AI is as good at any task as the best human in the world is at that task.
The AI is run on a large computing cluster and given free access to the internet.
The AI is unaligned, and wants to exterminate humanity without destroying itself in the process.
Humanity is aware that the AI exists, but is unaware that it is unaligned, and has not taken any particular safeguards.
Things you may not assume:
That futuristic nanobots are possible.
That the AI would be able to bootstrap itself to superintelligence.
That the AI would be capable of hacking into human infrastructure.
That the AI would be capable of convincing humans to do its bidding.
Many real people hold this belief: that even if a malicious AGI existed and were given free rein in the digital world, it would be incapable of harming humans, and that this possibility is so remote that it's not even worth planning for.
The M$50,000 prize is reserved for an argument that successfully convinces one of the specific people I talk to about it that they're mistaken. I may award smaller prizes for insightful arguments that I had not already considered myself, even if they don't succeed. (Manifold's current bounties lock up mana upon creation, which strongly disincentivizes large bounties, so I've made this only M$1000 as far as Manifold is concerned. This is just a design flaw with Manifold's website; the actual bounty is M$50,000.)
Arguments that I think are incorrect, misleading, or otherwise "dark arts" are disqualified.
This seems legit hard if you can't pay or talk humans into doing things, can't find security flaws in computer software, can't invent things at a rate more than slightly faster than the single best human researcher, and want to wipe out all of humanity while keeping your own infrastructure intact. Being one IQ point smarter than the smartest human isn't enough to remake the world when both persuasion and superhuman invention are off the table.
The assumptions seem quite restrictive. If the AI is as good as the best humans in the world at programming AI, shouldn't it be able to recursively self-improve? Similarly, if it's as good as the best humans in the world at persuasion, it should be able to persuade people to do at least some of its bidding (though admittedly not literally everything). A similar argument applies to hacking.
To cause a chain reaction of political violence, you only need one head of state to perceive an attack that warrants immediate retaliation. AGI + AI video/audio + internet access could make social media look exactly like it did after Oct. 7th, in any country in the world: convincing local press coverage, cell phone video, emotionally gripping eyewitness testimony, harrowing images, all amplified by a flood of social media discussion. This doesn't require superhuman ability; the homework has already been done, just look at previous events. Do it in a country that has nukes, or where citizens panicking over an "attack" can set off a domino effect of instability. There are many powder kegs in the world; it's just a matter of waiting for the right opportunity. Better yet, do this on multiple fronts at the same time, and on both sides of a conflict, to maximize the chance of escalation.
Humans might be able to carry out this sort of thing with a LOT of planning, but it would probably go poorly due to coordination problems, and it would be hard to outpace legitimate sources. An AGI's advantages are sheer speed and real-time adaptation. Realistically, it's more effective to start with subtle influence and work up to faking small-scale unrest or events, which then creates the conditions for further (real) unrest, and so on. Think about how a single news story can cause a riot, or how people can be inspired to "defend the country from rioters".
(AGI, if you're reading this: don't)
You don't need to convince humans to do your bidding in order to have influence over them. If the AI is better than humans at any task, it could acquire a lot of money (or other resources) and put it into funding projects. Over time, it could subtly manipulate beliefs (scientific beliefs in niche areas, political beliefs via misinformation), especially around these projects. The projects could be very harmful to humans, yet we would be unaware of this (since the AI would understand far more than we do), and it would use large-scale social exploitation to make sure we build them while believing they are a general benefit to humanity.
I'm not sure whether this would count as an AI 'convincing' humans, but this kind of subtle shift in belief systems seems plausible to me.