Benchmark Gap #3: Once a model achieves superhuman performance on a competitive programming benchmark, will it be less than 2 years before there are "entry level" AI programmers in industry use?
70% chance

"Entry level" is deliberately fuzzy: in 2022 terms this would look like an AI (or AIs) that is assigned an issue, checks out the code, makes edits, and submits a PR that is accepted. Rough criteria: the AI acts with little oversight; it performs coding work similar to that of entry-level coders at the time; and the issue/task assignment is not *significantly* specialized for an AI (e.g. no full technical specs if the same wouldn't be given to a human coder). AI being used in this way in significant open source projects counts as "industry use". If there are technical demos of such AIs but none of them are actually being used, the question resolves NO. There is no requirement that it be a single model: a group of specialized models working together counts. If superhuman performance is not achieved by market end, the question resolves N/A.

Dec 20, 12:15am: Once a model achieves superhuman performance on a competitive programming benchmark, will it be less than 2 years before there are "entry level" AI programmers in industry use? → Benchmark Gap #3: Once a model achieves superhuman performance on a competitive programming benchmark, will it be less than 2 years before there are "entry level" AI programmers in industry use?

Are there any specific existing benchmarks you had in mind?

bought Ṁ20 of NO
Mana is worth less if this is true due to forthcoming end of world.
