Out of three Leetcode contest problems, how many problems will GPT4 solve given the exact prompt from the problem?

My plan is to open up Leetcode, go to the most recent contest, and then give GPT4 the exact prompts from three of the four problems and ask it to solve them in Python. I may do some amount of prompt twiddling (e.g. adding "let's think step-by-step" or other tricks people post) but the test will be zero-shot. A problem will count as solved if GPT4's solution passes the Leetcode automated evaluation.

EDIT #1: If I somehow learn that the solutions are in the training set, I'll resolve this N/A or try to find problems for which this is not the case.

EDIT #2: I changed this to be from a recent contest so that there are no concerns about the questions being in the training set.

Nov 16, 10:34pm: Out of three Leetcode contest problems, how many problems will GPT4 solve on its given the exact prompt from the problem? → Out of three Leetcode contest problems, how many problems will GPT4 solve given the exact prompt from the problem?

0: 31%
3: 28%
1: 25%
2: 15%
Stephen Malina
bought Ṁ35 of 0

So... I went and did this for GPT3 and it got 1/3. It got: https://leetcode.com/problems/first-missing-positive/, but didn't get: https://leetcode.com/problems/maximal-rectangle/ and https://leetcode.com/problems/regular-expression-matching/. If Leetcode really isn't in the training set, this means I haven't updated nearly hard enough.

Stephen Malina
bought Ṁ30 of 1

@StephenMalina this led me to also change the question to be for a recent contest rather than an existing Q.

Stephen Malina
bought Ṁ5 of 2

@StephenMalina that said, I still updated my probabilities here to weigh 1 & 2 more highly.

DavidBolin

@StephenMalina This made me update on the hardness of "hard" Leetcode problems (I have not used it myself and I assumed the problems would actually be hard.)

Stephen Malina

@DavidBolin there's a fair bit of variation. I changed it to contest problems, which I think will have a higher minimum hardness (although I don't participate in these competitions, so this is based on impression, not experience).

Stephen Malina

My current bet is that there's a 60% chance it won't be able to solve any, a 15% chance it'll be able to solve 1, a 15% chance it'll solve 2, and a 10% chance it'll solve 3.
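
As a quick sanity check on the bet above, here is a minimal Python sketch that confirms the four stated probabilities sum to 1 and works out what they imply. The variable names are made up for illustration; only the four probabilities come from the comment.

```python
# Stated probabilities for the number of problems solved (taken from the comment above).
# The dictionary name and derived quantities are illustrative, not part of the market.
belief = {0: 0.60, 1: 0.15, 2: 0.15, 3: 0.10}

# A valid distribution over the four outcomes should sum to 1.
total = sum(belief.values())

# Expected number of problems solved under this belief.
expected_solved = sum(k * p for k, p in belief.items())

# Probability that at least one problem is solved.
p_at_least_one = 1 - belief[0]

print(f"total probability: {total:.2f}")
print(f"expected problems solved: {expected_solved:.2f}")
print(f"P(at least one solved): {p_at_least_one:.2f}")
```

Under these numbers the bet implies an expected 0.75 problems solved and a 40% chance of solving at least one.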