Deepfakes are getting quite good, but they still usually include some tells if you look closely. And there are automated detection tools that, while much less reliable than I'd like, still have an accuracy significantly better than chance.
The market resolves based on whether I'm aware of any discrimination method that has both a false negative rate (FNR) and a false positive rate (FPR) below 15% on the deepfakes I test it on. (I get a reasonable period to try to find or develop such a method, and a reasonable amount of discretion over which videos I test it on.)
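To make the threshold concrete, here is a minimal sketch of how the two error rates could be computed for a candidate detector. It assumes that "positive" means "flagged as a deepfake", so FNR is the share of deepfakes the method misses and FPR is the share of real videos it wrongly flags. The function names and the sample data are hypothetical, for illustration only.

```python
def error_rates(labels, predictions):
    """Return (FNR, FPR) for a detector.

    labels[i] is True if video i really is a deepfake;
    predictions[i] is True if the detector flagged it as one.
    Assumption: "positive" means "deepfake".
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one deepfake and one real video")
    # Deepfakes the detector failed to flag.
    false_negatives = sum(1 for y, p in zip(labels, predictions) if y and not p)
    # Real videos the detector wrongly flagged.
    false_positives = sum(1 for y, p in zip(labels, predictions) if not y and p)
    return false_negatives / positives, false_positives / negatives


def meets_threshold(labels, predictions, max_rate=0.15):
    """True only if both FNR and FPR are strictly below max_rate."""
    fnr, fpr = error_rates(labels, predictions)
    return fnr < max_rate and fpr < max_rate


if __name__ == "__main__":
    # Hypothetical test set: 10 deepfakes followed by 10 real videos.
    labels = [True] * 10 + [False] * 10
    # The detector misses 1 deepfake and wrongly flags 2 real videos.
    predictions = [True] * 9 + [False] + [True] * 2 + [False] * 8
    print(error_rates(labels, predictions), meets_threshold(labels, predictions))
```

In this made-up example the method misses 10% of deepfakes but flags 20% of real videos, so it would not count: both rates have to be under 15%.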
I will not include active detection methods, where the human has to do something in response to a prompt. The method must be able to succeed on analysis of an existing video file.
Any deepfake technology is allowed; it could be de novo generation or a swap over a video of a real person. (It must be able to swap more than just the face, though: hair too, at least.)
I will be focusing on short clips (<1 minute) of humans doing or saying normal things. It doesn't need to be able to generate a full coherent movie, or realistic footage of highly out-of-distribution events. It does need to be able to handle slightly out-of-distribution events, like "make the person say wobble wobble wobble wobble while holding their hands straight out to the side".
The deepfakes must be generatable in no more than 5 minutes and $1000 of compute.
There will be no AI clarifications added to this market's description.
We're already almost there. It's hard to imagine not closing the gap in 4 years with so much of the world's capital invested in it.
I do think the "normal things" qualifier is quite important though. Deepfakes of crazy/impossible things are a very different question, and one I'm less sure of.
@xjp Most capital is invested in creating deepfakes that can fool humans, because millions of idiots will pay a lot for this capability. The market for deepfakes that fool automated tools is much smaller, as those are basically only useful for fraud and cyberwarfare.
@IsaacKing Success at fooling humans is probably highly correlated with success at fooling detection tools. I believe automated tools are currently worse than expert humans at detection, for both text and video. Do you expect that to change?