Will GPT-4 be a superhuman coder?


resolved Jul 9

Resolved


Whether on APPS (https://arxiv.org/abs/2105.09938) or on Codeforces (see https://arxiv.org/abs/2203.07814), will GPT-4 (the first version released by OpenAI) be a superhuman coder?

By superhuman, I mean that it beats the best human experts (e.g., it ranks first in a Codeforces competition, or it scores more than 90% top-1 accuracy on the Competition Level problems from APPS).
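The criterion above can be read as a simple either/or check. Here is a minimal sketch of it in Python; the function name `is_superhuman` and its parameters are hypothetical, chosen only for illustration, and the sketch assumes a Codeforces result is reported as a rank in a contest and an APPS result as a top-1 accuracy between 0 and 1.

```python
from typing import Optional


def is_superhuman(
    codeforces_rank: Optional[int] = None,
    apps_competition_top1: Optional[float] = None,
) -> bool:
    """Hypothetical helper mirroring the market's stated criterion.

    codeforces_rank: placement in a Codeforces competition (1 = first place).
    apps_competition_top1: top-1 accuracy on APPS Competition Level
        problems, as a fraction in [0, 1].
    Either condition alone is sufficient; a missing result counts as unmet.
    """
    ranks_first = codeforces_rank is not None and codeforces_rank == 1
    # "More than 90%" is read as a strict inequality (an assumption).
    beats_apps = (
        apps_competition_top1 is not None and apps_competition_top1 > 0.90
    )
    return ranks_first or beats_apps
```

Under this reading, exactly 90% on APPS would not qualify, and a Codeforces rating (as opposed to a contest rank) is not an input to the check at all.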

Close date updated to 2023-04-30 12:59 am




The GPT-4 report (page 5) says it got a rating of 392 on Codeforces (compared to GPT-3.5 at 260). I don't know exactly what that means, but I gather it's not very impressive?

@Imuli That is in the bottom 5%.

Edit: There have been some questions about how OpenAI conducted the test, because the rating seems unreasonably low. But disregarding output speed, there is no circumstance under which GPT-4 could be considered a superhuman coder.

Can you add more clarity on what it means to be superhuman? I imagine Codex is already better than most humans at coding because most humans can't code!

predictedNO

@PeterHurford I just added something! Let me know if it's not enough: "By superhuman, I mean that it beats human best experts (e.g., it ranks first on Codeforces competition)."

@SimeonCampos What would be the equivalent for APPS? 100% top-1 accuracy on competition problems?

predictedNO

@PeterHurford Yeah there's no human baseline unfortunately and 100% seems a lot. Maybe >90%? I don't know how good humans would be given the presence of IOI problems, but probably not 100%.