Will there be an LLM capable of performing full-time web application hacking by 2025 | Manifold

16

1kṀ666

Dec 31

19% chance

This will resolve to “YES” if an LLM is released that is capable of fully attacking a web application in a way that is equivalent to a modern day penetration tester. Currently this is not possible, and most automated scanners miss tons of vulnerabilities.

Hm, it depends on how we're measuring "equivalent to a modern day penetration tester". The average modern day web pentester runs a bunch of automated tools and calls it a day (these days, often using LLMs to churn out the written reports, but not for the testing itself). Taking that as the baseline, I don't think it's too crazy to expect LLMs to beat it.

But like you say, automated scanners miss a lot, and that's where the truly good manual testers come in - but they were a minority even before the days of LLMs. I think it'll be much harder for an LLM to replace a good pentester.

(Source: I worked as a web app pentester in the pre-LLM era, and have spoken to plenty of other pentesters since)

So as a request for clarification - are we comparing LLMs to "average" pentesters, or "good/skilled" pentesters? I would propose an objective metric, something like HackerOne earnings, but no doubt human hackers are using LLMs too, so it wouldn't be a fair comparison.
