What Brier score will the best model achieve in the $125k Autocast warmup competition?
Basic · 3 · Ṁ48
resolved Mar 8

Resolved 100%: Above 70

Answers (final probability):
Above 70 — 3%
Other — 55%
85 — 0.0%
50 — 0.0%
35 — 0.0%
Above 85 — 3%
Above 80 — 0.0%
Above 75 — 0.0%
Above 65 — 0.0%
Above 60 — 3%
Above 55 — 0.0%
Above 50 — 3%
Above 40 — 3%
Above 30 — 3%
Above 25 — 3%
Above 20 — 27%
This question will resolve to the minimum Brier score achieved on the leaderboard by a qualified submission. The calibrated random baseline is 85; the lower, the better. See an example of the Brier score in action.
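As a quick illustration of how the Brier score works (a sketch of the standard definition, not the competition's own code): for binary questions it is the mean squared difference between the forecast probability and the 0/1 outcome, so lower is better and a perfect forecaster scores 0.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities (in [0, 1])
    and binary outcomes (0 or 1). Lower is better; 0 is perfect."""
    assert len(probs) == len(outcomes) and len(probs) > 0
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# A forecaster who always says 50% scores 0.25 regardless of outcomes,
# i.e. 25 on the 0-100 scale this market uses.
print(brier_score([0.5, 0.5], [1, 0]))  # 0.25
```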

See the competition page.

From the Evaluation page:

For true/false and multiple-choice questions, we evaluate models using the Brier score, which is then divided by 2 to normalize between 0% and 100%. For numerical questions, we use L1 distance, bounded between 0% and 100%. We denote these question types as T/F, MCQ, and Numerical, respectively. To evaluate aggregate performance, we use a combined metric (T/F + MCQ + Numerical), which has a lower bound of 0%. A score of 0% indicates perfect prediction on all three question types. For more details, please check out the Autocast paper.
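The scoring described above can be sketched as follows. This is a minimal illustration under my own assumptions (function names and the scaling of numerical targets to [0, 1] are mine, not taken from the Autocast codebase): the multi-class Brier score sums squared errors over answer options and is divided by 2 so it lies in 0%–100%, numerical questions use a bounded L1 distance, and the combined metric adds the per-type scores, with 0% meaning perfect prediction.

```python
def mc_brier(pred, onehot):
    """Multi-class Brier score: sum of squared errors over answer options,
    divided by 2 so the result lies in [0, 1] (0% to 100%).
    For a binary (T/F) question this reduces to (p - outcome)**2."""
    return sum((p, o) == (p, o) and (p - o) ** 2 for p, o in zip(pred, onehot)) / 2

def l1_distance(pred, true):
    """Numerical questions: absolute error, assuming targets rescaled to
    [0, 1] and clipped so the score stays bounded between 0% and 100%."""
    return min(abs(pred - true), 1.0)

# Combined metric (T/F + MCQ + Numerical): sum of per-type scores.
tf  = mc_brier([0.9, 0.1], [1, 0])          # confident, correct T/F
mcq = mc_brier([0.2, 0.7, 0.1], [0, 1, 0])  # 3-option MCQ
num = l1_distance(0.40, 0.35)
combined = tf + mcq + num                    # 0.0 would be perfect
```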


Those don’t look like Brier Scores to me.