Will AI be able to write, compile, and unit test a single .c file to reproduce GPT-2 training from PyTorch code by 2025?
Resolved NO on Apr 16

Start date: April 9, 2024

End date: April 9, 2025

Market with a longer timeline:

Inspired by this tweet by Andrej Karpathy:

Btw writing the llm.c training code would imo be a very interesting, impressive, self-contained and very meta challenge for LLM agents.

The prompt is: Take the PyTorch code train_gpt2.py And write, compile and unit test a single .c file that reproduces the training: train_gpt2.c

The current models are not there, but we can check back in a year or two or so. If that worked...



@mods Please resolve (the creator's account is deleted)

@winged_one N/A? Or what. I don't see anything in here about how to resolve it.

@Eliza It seems kind of hard to verify one way or another, so probably N/A, unless you can verify whether or not AI can do what's asked in the question.

@winged_one IMHO, defaulting to resolving NA is not a great practice for markets like this.

If YES bettors have not presented an example, and a quick google doesn't show anyone talking about it, then I think it should default to NO. Having to show that something hasn't happened or isn't possible (across all the various LLMs) in order to resolve NO is quite a burden.

We could try it ourselves (gpt2.py is here), but most of us lack the expertise to judge the result, at least without spending an unreasonable amount of time.

I think we should resolve NO and clarify in the 2026 market that resolution will be NO if evidence isn't presented by YES bettors or available from a quick google search.

For clarification, AI should only be able to write the file given the prompt in the tweet?

@Jacy interested?

@firstuserhere thanks for thinking of me! I'm usually willing to bet a lot on priors, but there are just too many idiosyncrasies here (how much can it just copy existing code, how many times will this be attempted, how good are Devin/etc. at this particular sort of coding, etc.) for me to take a significant position, especially against anyone who has actually looked into this.
