When will AIs be good at solving complex problems? (read description)
  • 2030: 68%
  • 2027: 59%
  • 2026: 45%
  • 2025: 35%
  • 2024
If you don't want to read the full description, the short version is this: the probability for each time category represents the percentage of people who will be worse at problem-solving than an AI that does not use extreme amounts of energy.

Codeforces contests will be used to measure AI performance.

Competitive Programming and AI

As of 2024, AIs are good at retrieving previously discovered knowledge; however, they lack the ability to navigate complex state spaces to arrive at new solutions.

Competitive programming (CP) problems provide a good framework for testing them in this area, as their solutions can easily be checked by computation.
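To illustrate what that automatic checking could look like, here is a minimal sketch in Python. It runs a candidate program on one sample test and compares outputs token by token; the command, file name, and sample case are hypothetical, and real Codeforces judging additionally enforces time and memory limits and may use special checkers.

```python
import subprocess

def run_case(solution_cmd, input_text, expected_output, timeout=2.0):
    """Run a candidate solution on one test case and compare outputs.

    solution_cmd is whatever runs the candidate program, e.g.
    ["python3", "solution.py"] (hypothetical file name).
    """
    result = subprocess.run(
        solution_cmd,
        input=input_text,
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired on a hang
    )
    # Token-wise comparison ignores whitespace differences, like a simple checker.
    return result.returncode == 0 and result.stdout.split() == expected_output.split()

# Example with a made-up A+B sample case:
# ok = run_case(["python3", "solution.py"], "1 2\n", "3\n")
```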

As such, their Codeforces rating will also reflect, to some extent, their ability to solve complex problems relative to humans.

AI Qualification Criteria

  • a fully automated agent that uses the Codeforces API to read statements and submit solutions (see the sketch after this list)

  • to prevent the abuse of inefficient, massive computation (similar to AlphaCode) that may effectively be equivalent to a large team of humans, the AI is limited to using at most $80 worth of resources per contest
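Below is a minimal sketch of the read-only half of such an agent, assuming the public Codeforces API. As far as I can tell the API exposes problem metadata and verdicts but not full statements or a submission endpoint, so those parts would need a separate mechanism; the contest ID and handle in the usage comments are hypothetical.

```python
import requests

CF_API = "https://codeforces.com/api"

def _get(method: str, **params) -> dict:
    """Call a Codeforces API method and unwrap the standard {status, result} envelope."""
    resp = requests.get(f"{CF_API}/{method}", params=params, timeout=30)
    data = resp.json()
    if data["status"] != "OK":
        raise RuntimeError(data.get("comment", f"Codeforces API error in {method}"))
    return data["result"]

def contest_problems(contest_id: int) -> list[dict]:
    """Problem metadata (index, name, tags) for a contest, via contest.standings."""
    return _get("contest.standings", contestId=contest_id, **{"from": 1, "count": 1})["problems"]

def agent_verdicts(contest_id: int, handle: str) -> list[dict]:
    """The agent's submissions and verdicts in a contest, via contest.status."""
    return _get("contest.status", contestId=contest_id, handle=handle)

# Example (contest ID and handle are hypothetical):
# for p in contest_problems(1900):
#     print(p["index"], p["name"])
# print(agent_verdicts(1900, "some-automated-agent"))
```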

Scoring and Resolution

The rating will be evaluated as the median over the past 5 contests, or as the rating shown on the Codeforces website.

If the rating of the best qualifying agent at the end of the given year (or some other time) is x, then that option resolves to the percentage of people with a rating lower than x.

  • Update 2025-06-21 (PST) (AI summary of creator comment): If no fully automated agent meeting the market criteria can be found on Codeforces for a given resolution period, that period will be resolved to 0%.
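To make the arithmetic concrete, here is a hedged sketch of how the resolution value could be computed from public data: take the median of the agent's rating after its last five rated contests (user.rating) and compare it against the current ratings of rated users (user.ratedList). The agent handle is hypothetical, and the actual resolution mechanics remain the creator's call.

```python
import statistics
import requests

CF_API = "https://codeforces.com/api"

def _get(method: str, **params) -> dict:
    """Call a Codeforces API method and unwrap the {status, result} envelope."""
    resp = requests.get(f"{CF_API}/{method}", params=params, timeout=120)
    data = resp.json()
    if data["status"] != "OK":
        raise RuntimeError(data.get("comment", f"Codeforces API error in {method}"))
    return data["result"]

def median_recent_rating(handle: str, contests: int = 5) -> float:
    """Median rating after the handle's most recent rated contests."""
    history = [change["newRating"] for change in _get("user.rating", handle=handle)]
    return statistics.median(history[-contests:])

def percent_below(rating: float) -> float:
    """Percentage of currently rated (active) users whose rating is below `rating`."""
    users = _get("user.ratedList", activeOnly="true")  # large response, can take a while
    below = sum(1 for u in users if u.get("rating", 0) < rating)
    return 100.0 * below / len(users)

# Example (hypothetical agent handle):
# x = median_recent_rating("some-automated-agent")
# print(f"would resolve to roughly {percent_below(x):.1f}%")
```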


@patrik are you going to resolve 2024?


@ProjectVictory I can't find any fully automated agent accounts on Codeforces, do you know of any? Otherwise I'll have to resolve to 0%.

@patrik o3 achieved some good results on a test:

https://codeforces.com/blog/entry/137543

But I'm not sure if it has an account.

@ProjectVictory I know about this. But there is no account...

The obvious question is: do you use Codeforces contests that were created before or after the training data cutoff of the AI in question? AIs that get almost perfect scores on old contests often get 0 correct on new ones. What's the standard?

@NeoPangloss Ideally only true participations (non-virtual) should count.
