Short-term AI 3.3: By June 2024 will SOTA on HumanEval be >= 99%?
Resolved NO on Jun 5.

Benchmark. SOTA at market creation is 94.4%.




@vluzko resolves NO

@mods very inactive creator, resolves NO, proof above

@Ziddletwix Does the source you got that from also have info on the other two markets (math/apps)?

@PlasmaBallin I'm just following the links in the description, which should be straightforward (I can post pics), but it's possible the linked source is missing some models (I see one comment noting another). Maybe it's easiest just to go by the linked source.

The last few % points are always the hardest. Unless, of course, you train on the validation set.

@thooton I think it's quite plausible that the test set will end up in the training set in some hard to detect way. I will exclude models for this if it's known their training set is poisoned (I assume Papers With Code would exclude them as well), but for most large language models the pre-training data is not public.
