Quoting Jack Clark
The idea of DarkBERT is that the dark web has a different data distribution to the so-called surface web and so the hypothesis is by pre-training on a dark web corpus you’ll end up with a model better at spotting things like drugs, credit card counterfeiting, hacking, and other internet-underbelly activities. In tests, DarkBERT does marginally better than standard BERT and RoBERTa classifiers, so the research is promising but not mind blowing.
Automated spies for the underbelly of the world: AI systems let us take a given thing we’d like a human to do and instead outsource that to a machine. Systems like DarkBERT point to a world where police and intelligence forces train a variety of ‘underbelly’ models to go and read (today), listen (also today – see Facebook’s speech recognition system), and look (soon, as people tie language models to vision systems) at the world, continually analyzing it for increasingly rich and complex harms.
Still, why only social media? No way the "underbelly" would be on any of those platforms, and it certainly isn't Reddit's job to find and remove inappropriate content, right?