
Short-term AI 3.4: By June 2024 will SOTA on APPS be >= 25%?
Resolved NO on Jun 8.
APPS is a more challenging code benchmark than HumanEval. SOTA at market creation is 15.7%, set by CodeRL. I will use the Competition Pass@any metric.
Notably, the current SOTA uses a very old LLM as its base model, yet it still beats davinci-002.
This question is managed and resolved by Manifold.
🏅 Top traders
| # | Total profit |
|---|---|
| 1 | Ṁ171 |
| 2 | Ṁ31 |
| 3 | Ṁ21 |
| 4 | Ṁ11 |
| 5 | Ṁ5 |
@PlasmaBallin according to the linked source this is NO (not sure if there's any reason to include other models)
