Will any model get above human level (92%) on the Simple Bench benchmark before September 1st, 2025?
60% chance

bought Ṁ250 NO

Seems unlikely without a major paradigm shift. 27% is SOTA, and it doesn't seem to be increasing much with successive model generations.

Is it true that this benchmark can be anything, and can be changed at any point? There are no hashes, no large sample of problems, no error bars, no evaluation code, no specifics on what a model can or cannot use... How do we know what the true performance is, beyond what the author says?

@dp gotta trust the guy

bought Ṁ532 YES

Description of the benchmark here: https://simple-bench.com/about.html

I have made some irrational bets to subsidize the market, as I cannot be bothered to figure out the correct way to do this.

I think you can normally just add liquidity?