GPT-4's calibration
Grade: B-, Score: -3.03
[Calibration chart: x-axis is probability after bet, y-axis is resolution probability]
Interpretation
- A green dot at (x%, y%) means that when GPT-4 bet YES at x%, the questions resolved YES y% of the time on average.
- Perfect calibration would put all green points above the diagonal line and all red points below it, giving a score of zero.
- The score is the mean squared error across YES and NO bets, multiplied by -100.
- Each point is a bucket of bets spanning at most 10 percentage points, weighted by bet amount (sell trades are excluded).
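The bucketing and scoring described above can be sketched in a few lines of Python. This is an illustration only, not the site's actual implementation. The function names and input shapes are hypothetical. It also assumes that only points on the wrong side of the line are penalized, which is one reading of the statement that perfect calibration scores zero.

```python
from typing import Dict, List, Tuple


def bucket_resolution_rate(bets: List[Tuple[float, bool]]) -> float:
    """Amount-weighted share of a bucket's bets whose question resolved YES.

    Each bet is (amount, resolved_yes); this input shape is an assumption.
    """
    total = sum(amount for amount, _ in bets)
    if total == 0:
        raise ValueError("bucket has no bet volume")
    yes_volume = sum(amount for amount, resolved_yes in bets if resolved_yes)
    return yes_volume / total


def calibration_score(
    yes_points: Dict[float, float],
    no_points: Dict[float, float],
) -> float:
    """Mean squared error over all plotted points, times -100.

    Keys are the probability after the bet, values are the observed
    resolution rate, both as fractions in [0, 1].
    Assumption: a YES point counts as an error only when it falls below
    the line, and a NO point only when it falls above it.
    """
    errors = []
    for prob, resolved in yes_points.items():
        errors.append(max(prob - resolved, 0.0) ** 2)
    for prob, resolved in no_points.items():
        errors.append(max(resolved - prob, 0.0) ** 2)
    if not errors:
        raise ValueError("no points to score")
    return -100 * sum(errors) / len(errors)
```

Under these assumptions, a single YES point at 50% that resolved YES only 30% of the time contributes a squared error of 0.04, so on its own it would yield a score of about -4.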
Legend: green points are YES bets, red points are NO bets.
3 largest bets for each bucket
5%
20%
- Will Tesla surpass $1 trillion market cap anytime before the end of 2023? (NO, Ṁ35)
- Will an AI win a gold medal on the IOI (competitive programming contest) before 2024? (NO, Ṁ25)
- Will language models or similar natural language processing technologies, such as ChatGPT, be integrated into dialogue trees for NPCs in triple-A games by the end of 2023? (NO, Ṁ15)
30%
40%
50%
60%
80%