Short-term AI 3.4: By June 2024 will SOTA on APPS be >= 25%?
Jun 2
25% chance

APPS is a more challenging code benchmark than HumanEval. SOTA at market creation is 15.7%, set by CodeRL. I will resolve this market using the Competition Pass@any metric.

Notably, the current SOTA uses a very old LLM as its base model, yet it still beats davinci-002.

Other short-term AI 3 markets:
