Will the Jan 2024 version of the LLM detector "Binoculars" be effective against OpenAI's best model at the end of 2024?


Over a wide range of document types, Binoculars detects over 90% of generated samples from ChatGPT (and other LLMs) at a false positive rate of 0.01%, despite not being trained on any ChatGPT data.

Is there a correlation between Binoculars score and sequence length? Such correlations may create a bias towards incorrect results for certain lengths. In Figure 12, we show the joint distribution of token sequence length and Binoculars score. Sequence length offers little information about class membership.

I ran my own tests here and here, and the detector was very effective. Will it last? I'll rerun a similar test against OpenAI's best model at the end of 2024 and make a subjective judgement as to whether the detector is still effective. The target is roughly a true negative rate of >=90% and a true positive rate of >=95%.
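For anyone who wants to sanity-check their own rerun against those rough targets, here is a minimal sketch of the arithmetic. It is not the test I ran: the function names `rates` and `meets_target` are made up for illustration, and it assumes you already have a true label and a detector verdict for each sample (True = machine-generated). The final call remains subjective.

```python
def rates(labels, predictions):
    """Return (true_positive_rate, true_negative_rate).

    labels, predictions: equal-length sequences of booleans,
    where True means "machine-generated" (an assumed convention).
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one generated and one human sample")
    # Generated samples correctly flagged as generated.
    true_pos = sum(1 for y, p in zip(labels, predictions) if y and p)
    # Human samples correctly left unflagged.
    true_neg = sum(1 for y, p in zip(labels, predictions) if not y and not p)
    return true_pos / positives, true_neg / negatives


def meets_target(labels, predictions, min_tpr=0.95, min_tnr=0.90):
    """Check the rough targets: TPR >= 95% and TNR >= 90%."""
    tpr, tnr = rates(labels, predictions)
    return tpr >= min_tpr and tnr >= min_tnr
```

With 20 generated and 10 human samples, for example, the targets allow at most one missed generated sample and at most one human sample wrongly flagged.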

