https://huggingface.co/spaces/tomg-group-umd/Binoculars
Over a wide range of document types, Binoculars detects over 90% of generated samples from ChatGPT (and other LLMs) at a false positive rate of 0.01%, despite not being trained on any ChatGPT data.
Is there a correlation between Binoculars score and sequence length? Such correlations may create a bias towards incorrect results for certain lengths. In Figure 12, we show the joint distribution of token sequence length and Binoculars score. Sequence length offers little information about class membership.
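For anyone who wants to check the length question on their own samples, here is a minimal sketch of a Pearson correlation between token length and score. The `pearson` helper and the numbers in the usage lines are my own illustration, not anything from the paper, and a value near zero only rules out a linear relationship.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only; substitute your own measurements.
token_lengths = [120, 256, 310, 512, 640]
scores = [0.95, 0.81, 1.02, 0.78, 0.99]
print(f"length/score correlation: {pearson(token_lengths, scores):.3f}")
```
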
I ran my own test here and here, and Binoculars was very effective. Will it last? I'll rerun a similar test and make a subjective judgement as to whether it's still effective. The target would be roughly a true negative rate >=90% and a true positive rate >=95%.
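To make the target concrete, here is a rough sketch of how the two rates could be computed from a rerun. The function names and the sample counts are hypothetical and only illustrate the arithmetic; the final call will still be a subjective judgement.

```python
def detection_rates(is_ai, flagged_ai):
    """Return (true_positive_rate, true_negative_rate) for paired boolean lists.

    is_ai[i] is True when sample i really is AI-generated;
    flagged_ai[i] is True when the detector labelled it AI-generated.
    """
    if len(is_ai) != len(flagged_ai):
        raise ValueError("labels and predictions must have the same length")
    positives = sum(1 for y in is_ai if y)
    negatives = len(is_ai) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one AI sample and one human sample")
    true_pos = sum(1 for y, p in zip(is_ai, flagged_ai) if y and p)
    true_neg = sum(1 for y, p in zip(is_ai, flagged_ai) if not y and not p)
    return true_pos / positives, true_neg / negatives


def meets_target(is_ai, flagged_ai, min_tpr=0.95, min_tnr=0.90):
    """Check the rough thresholds: TPR >= 95% and TNR >= 90%."""
    tpr, tnr = detection_rates(is_ai, flagged_ai)
    return tpr >= min_tpr and tnr >= min_tnr


# Made-up run: 20 AI samples (19 caught) and 10 human samples (9 cleared).
labels = [True] * 20 + [False] * 10
flags = [True] * 19 + [False] + [False] * 9 + [True]
print(detection_rates(labels, flags), meets_target(labels, flags))
```
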