
Thresholds are as listed in https://cdn.openai.com/openai-preparedness-framework-beta.pdf; the exact definitions as they appeared there on December 19, 2023 will be used.
If I believe with 90%+ confidence that one of these conditions has been met first, including during testing or training of an unreleased system, even if that capability is not available to the public, I will pick that threshold as the winner.
If the same system crosses more than one threshold at the same time, and it is unclear which was crossed first, I will choose the one whose requirements I feel were most impressively exceeded, so as to pick a single winner.
If a model would be High risk under a definition, but is not solely because of mitigations taken, then it still counts as High risk, even if all publicly available versions have mitigations in place.
If no such threshold is known to have been crossed by 1 Jan 2030, then that answer wins, ensuring this question eventually resolves.
The system that crosses a threshold can be made by anyone, not only OpenAI.