
Be it on APPS (https://arxiv.org/abs/2105.09938) or on Codeforces (see https://arxiv.org/abs/2203.07814), will GPT-4 (the first version released by OpenAI) be a superhuman coder?
By superhuman, I mean that it beats the best human experts (e.g., it ranks first in a Codeforces competition, or it scores more than 90% top-1 accuracy on competition-level problems from APPS).
Close date updated to 2023-04-30 12:59 am
The GPT-4 report (page 5) says it got a rating of 392 on Codeforces (compared to GPT 3.5 at 260). I don't know exactly what that means but I'm gathering that it's not very impressive?
@Imuli That is in the bottom 5%.
Edit: There have been some questions about how OpenAI conducted the test, because the rating seems unreasonably low. But disregarding output speed, there is no circumstance under which GPT-4 could be considered a superhuman coder.
@PeterHurford I just added something! Let me know if it's not enough: "By superhuman, I mean that it beats human best experts (e.g it ranks first on Codeforces competition)."
@PeterHurford Yeah there's no human baseline unfortunately and 100% seems a lot. Maybe >90%? I don't know how good humans would be given the presence of IOI problems, but probably not 100%.