If tested before 2024, what will GPT-4 score on the Measuring Massive Multitask Language Understanding benchmark?
Resolved N/A on Aug 10
This question will resolve N/A if GPT-4 doesn't come out before January 1st, 2024. Otherwise, I will resolve it based on GPT-4's score, in percentage points, on the Measuring Massive Multitask Language Understanding benchmark by Dan Hendrycks et al. See here: https://arxiv.org/abs/2009.03300 I will refer to the first test using GPT-4 on this benchmark, excluding later results that, e.g., use better prompts. As of writing this question, the best score on this benchmark is 67.5%, achieved by DeepMind's Chinchilla. See here: https://paperswithcode.com/sota/multi-task-language-understanding-on-mmlu

the correct answer is 86.4 according to https://paperswithcode.com/sota/multi-task-language-understanding-on-mmlu

regardless, I am unilaterally admin-resolving N/A because I want to deprecate this market type, and idk if the distribution markets' resolution code still works.

yes, this is kinda against our own guidelines of when we can intervene in markets, but it makes the code wayyy simpler if all the distributional markets are resolved. This is the last holdout!
I will give 50 manalink to any of the 6 traders who is salty about this if they read this message.