If you don't want to read the full description, the short version is this: the probability for each time category represents the percentage of people who will be worse at problem-solving than an AI that doesn't use extreme amounts of energy.
Performance will be measured using Codeforces contests.
Competitive Programming and AI
As of 2024, AIs are good at retrieving previously discovered knowledge, but they lack the ability to navigate complex state spaces to arrive at new solutions.
Competitive programming (CP) problems provide a good framework for testing them in this area, because their solutions can easily be checked automatically.
As such, their Codeforces rating will also reflect, to some extent, their ability to solve complex problems relative to humans.
AI Qualification Criteria
a fully automated agent that uses the Codeforces API to read statements and submit solutions
to prevent the abuse of inefficient, massive computation (similar to AlphaCode) that could amount to the work of a large team of humans, the AI is limited to resources worth at most $80 per contest
Scoring and Resolution
The agent's rating will be taken either as the median over its past 5 contests or as its rating shown on the Codeforces website.
If the rating of the best agent at the end of the given year (or some other time) is x, then the question resolves to the percentage of people with a rating lower than x.
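As a rough illustration of the resolution rule, here is a minimal Python sketch. It assumes the median-of-5 option is applied to per-contest rating values listed oldest first, and that "people" means some fixed list of human Codeforces ratings; the function names and all numbers are hypothetical, not taken from real Codeforces data.

```python
from bisect import bisect_left
from statistics import median


def agent_rating(contest_ratings, window=5):
    """Median of the most recent `window` contest ratings (oldest first).

    Assumption: one rating value per contest is available for the agent.
    """
    if not contest_ratings:
        raise ValueError("the agent has no rated contests")
    return median(contest_ratings[-window:])


def percent_below(population_ratings, x):
    """Percentage of the population with a rating strictly lower than x."""
    if not population_ratings:
        raise ValueError("the population is empty")
    ordered = sorted(population_ratings)
    # bisect_left counts the elements strictly smaller than x.
    return 100.0 * bisect_left(ordered, x) / len(ordered)


# Hypothetical example data.
history = [1500, 1620, 1580, 1700, 1650, 1710]
humans = [800, 1200, 1400, 1650, 1900, 2400, 1000, 1500]

x = agent_rating(history)        # median of the last 5 contests
print(x, percent_below(humans, x))
```

With these made-up numbers the agent's rating is 1650 and 5 of the 8 humans are rated strictly lower, so the question would resolve to 62.5%. Ties are not counted as "less", which matches the wording "less rating than x".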