My plan is to open up Leetcode, go to the most recent contest, and then give GPT4 the exact prompts from 3/4 problems and ask it to solve them in Python. I may do some amount of prompt twiddling (e.g. adding "let's think step-by-step" or other tricks people post) but the test will be zero-shot. The problem will count as solved if GPT4's solution passes the Leetcode automated evaluation.

EDIT #1: If I somehow learn that the solutions are in the training set, I'll resolve this N/A or try to find problems for which this is not the case.

EDIT #2: I changed this to be from a recent contest so that there're no concerns about the questions being in the training set.

Nov 16, 10:34pm: ~~Out of three Leetcode contest problems, how many problems will GPT4 solve on its given the exact prompt from the problem?~~ → Out of three Leetcode contest problems, how many problems will GPT4 solve given the exact prompt from the problem?

So... I went and did this for GPT3 and it got 1/3. It got: https://leetcode.com/problems/first-missing-positive/, but didn't get: https://leetcode.com/problems/maximal-rectangle/ and https://leetcode.com/problems/regular-expression-matching/. If Leetcode really isn't in the training set, this means I haven't updated nearly hard enough.

@StephenMalina this led me to also change the question to be for a recent contest rather than an existing Q.

@StephenMalina This made me update on the hardness of "hard" Leetcode problems (I have not used it myself and I assumed the problems would actually be hard.)

@DavidBolin there's a fair bit of variation. I changed it to contest problems, which I think will be have a higher minimum hardness though (although I don't participate in these competitions so this is based on impression not experience).

My current bet is that there's a 60% chance it won't be able to solve any, a 15% chance it'll be able to solve 1, a 15% chance it'll solve 2, and a 10% chance it'll solve 3.