
HumanEval 90% #2: Will pass@1 performance on the HumanEval benchmark be >= 90% by 2025?
closes 2025
77% chance
Benchmark link: https://paperswithcode.com/sota/code-generation-on-humaneval
pass@1 means the model gets a single attempt.
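As a rough illustration of the metric, here is a minimal sketch of how pass@1 could be scored and compared with the 90% threshold in the question. It assumes the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of samples drawn for a problem and c is the number that pass the tests. The function names and the per-problem (n, c) input format are hypothetical, and the leaderboard linked above may compute its numbers differently.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem: n samples drawn, c of them passed."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples, so any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(per_problem: list[tuple[int, int]]) -> float:
    """Average pass@1 over problems, each given as (n_samples, n_correct)."""
    scores = [pass_at_k(n, c, 1) for n, c in per_problem]
    return sum(scores) / len(scores)


def meets_threshold(per_problem: list[tuple[int, int]], threshold: float = 0.90) -> bool:
    """True if the averaged pass@1 is at or above the threshold (assumed 90%)."""
    return benchmark_pass_at_1(per_problem) >= threshold


if __name__ == "__main__":
    # Hypothetical results: one attempt per problem, 9 of 10 problems solved.
    results = [(1, 1)] * 9 + [(1, 0)]
    print(benchmark_pass_at_1(results), meets_threshold(results))
```

With a single attempt per problem (n = 1), the estimator reduces to the fraction of problems solved on the first try, which matches the description above.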
1 NO payouts

Related markets
Benchmark Gap #2: Once we have an algorithm with human level sample efficiency for major RL benchmarks, how many years will it be before there is an algorithm with human level sample efficiency on essentially all AAA video game tasks? 1.6
Will any AI be able to explain formal language proofs to >=50% of IMO problems by the start of 2025? 63%