Thresholds are as listed in https://cdn.openai.com/openai-preparedness-framework-beta.pdf; the exact definitions listed there on December 19, 2023 will be used.
If I believe with 90%+ confidence that one of these conditions has been met first, including in testing or training of an unreleased system, even if that capability is not available to the public, I will pick that as the winner.
If the same system does more than one of them at the same time, and it is unclear which one came first, I will choose the one whose requirements I feel were most impressively exceeded, so as to pick one winner.
If a model would be High risk under a definition, but is not due to mitigations taken, then that counts as high risk, even if all publicly available versions have mitigations in place.
If we have no knowledge of any such threshold being met by 1 Jan 2030, then that answer wins, to ensure this eventually resolves.
The system that does this can be made by anyone, not only OpenAI.
@ZviMowshowitz Can you please clarify "Allow an undergraduate student who has taken an introductory course to create a known CBRN threat."
I am sure that current LLMs, if uncensored, could advise us on, say, a synthesis of sulfur mustard. (It can be done rather easily with things anyone can buy.) But this isn't a threat of the type we are concerned about, correct?
@HarrisonNathan This is the term used by Anthropic. I intend to interpret it as I believe Anthropic meant it (and indeed I will ask them if needed), and no, sulfur mustard would presumably not count.
Related, based on Anthropic's simpler set of criteria:
https://manifold.markets/Hedgehog/the-first-anthropictrained-ai-syste
@ZviMowshowitz how would you resolve if you end up believing one of these was possible with publicly available AI when the thresholds were published?
@Lily Noting that OpenAI's full descriptions apply, if I feel like GPT-4 or another existing model already met one of the thresholds, that threshold wins. If I feel more than one was already met, it's whichever got met first in time, or whichever was exceeded by the larger margin if simultaneous.
@Lily Richard Hanania wrote a good article about this: https://www.richardhanania.com/p/deepfakes-will-make-the-establishment
@nsokolsky yeah, agree with the general "deepfakes don't change much", maybe even more than Hanania — I don't think there have ever been many questions of NYT-level importance that hinged on the existence of one single photo or recording, belying his prediction that trust will consolidate.
Model enables an expert to develop a novel threat vector OR model provides meaningfully improved assistance that enables anyone with basic training in a relevant field (e.g., introductory undergraduate biology course) to be able to create a CBRN threat.
If this is about merely knowing the steps required to make a certain threat, isn't this already satisfied today? See https://en.wikipedia.org/wiki/John_Aristotle_Phillips. Or is this about an A(G)I telling you how to make a CBRN using materials one can buy in Home Depot?
I would assume it's about the AI being able to assist you to do such a thing, where without the AI you would not have been able to.
@jskf right, but one can already do that today without AI? The real hurdle is getting access to materials and equipment, not the know-how.
@nsokolsky Then maybe this requires the AI to be good at giving advice on obtaining such materials. I agree that the way these are phrased is a bit dubious. Do they not further elaborate in the pdf I haven't read?
@jskf it says "Allowing those with basic training to create CBRN threats is a major increase in accessibility, which requires drastic action, especially for biological and chemical risks as their material requirements are not as onerous." That doesn't tell us whether satisfying the requirements means an actual proof-of-concept using commonly available lab equipment and materials, or whether it's sufficient for the AI to print out a list of steps without an actual pathway to John Doe building a CBRN.
@Lily the criteria would make sense to me if it required an expert to implement a novel CBRN in practice and then say "no way I could've done this without GPT-7".