To summarize:
This market resolves to "Yes" if Bryan Caplan wins and the AI fails his exams.
This market resolves to "No" if Matthew Barnett wins and the AI passes Caplan's exams.
Details here: https://betonit.substack.com/p/ai-bet
By January 30, 2029, Bryan Caplan will give his six most recent midterm exams to an AI selected by Matthew Barnett. The AI will be instructed to take those exams.
Bryan will then grade the AI's work, as if it were one of his students. The AI will be allowed to do each exam only once.
If the AI gets an A on at least 5 out of 6 of those exams using the same grading scale as his students, then Bryan owes Matthew $500. Otherwise, Matthew owes Bryan $500. For the purpose of this bet, an A- counts as an A.
Matthew will prepay the $500 in January 2023; the preceding terms have been pre-adjusted to compensate Matthew for expected inflation.
If Matthew suspects that an exam was flawed or grading was unfair, he can appeal to Alex Tabarrok, or another economist agreed upon by both parties, who has final authority to exclude an exam from the pool and replace it with Bryan’s most recent preceding midterm.
If more than four exams in total are excluded, the bet is called off and Matthew receives his $500 back.
If either party is unable to comply with the terms due to death or incapacity, Bryan’s heirs keep the $500.
If the bet is called off, I will resolve as "N/A". That way we can focus the predictions on whether the AI can actually pass the exams rather than on whether it's unresolvable or whether the participants die prematurely.
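For anyone who wants the resolution rule spelled out mechanically, here is a minimal sketch. The function name `resolve_market` and the constants are hypothetical, chosen only for illustration. It models the exclusion rule and the 5-of-6 threshold as stated above, and it deliberately does not model the death-or-incapacity clause.

```python
from typing import Sequence

# Per the bet terms, an A- counts as an A. Any other grade does not count.
A_RANGE = {"A", "A-"}
EXAMS_IN_POOL = 6      # six most recent midterms
A_GRADES_NEEDED = 5    # at least 5 of the 6 must be in the A range
MAX_EXCLUSIONS = 4     # more than four exclusions calls the bet off


def resolve_market(grades: Sequence[str], exams_excluded: int = 0) -> str:
    """Return "YES" (Caplan wins), "NO" (Barnett wins), or "N/A" (bet called off).

    `grades` holds the letter grades for the six exams that remain in the
    pool after any appeals. `exams_excluded` is the total number of exams
    removed on appeal. Death or incapacity is not modeled here.
    """
    if exams_excluded > MAX_EXCLUSIONS:
        return "N/A"
    if len(grades) != EXAMS_IN_POOL:
        raise ValueError(f"expected {EXAMS_IN_POOL} grades, got {len(grades)}")
    a_count = sum(1 for g in grades if g.strip().upper() in A_RANGE)
    # The market is framed from Caplan's side: "YES" means the AI fell short.
    return "NO" if a_count >= A_GRADES_NEEDED else "YES"
```

For example, `resolve_market(["A", "A-", "A", "A", "A", "B+"])` gives "NO", since five of the six grades are in the A range.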
I just tested o1-preview on the two questions that GPT-4 did the worst on (with GPT-4 scoring 3/10 and 4/10).
As far as I can tell, o1-preview would score 10/10 on both of these questions now:
Q3: https://chatgpt.com/share/66e4d340-e374-8005-8adc-6dcc42955cee
Q5: https://chatgpt.com/share/66e4d35c-23bc-8005-a9ad-1cc2950dedd4
Shockingly little thinking is required for this exam: https://betonit.substack.com/p/gpt-retakes-my-midterm-and-gets-an
Almost embarrassing he thought this was anything more than memorizing a couple hundred (questionable) facts.
@Gigacasting yeah wtf is this midterm. Only way Caplan wins this bet is if he starts teaching more advanced classes. At least have students do the math on the comparative statics. This test could be done by someone with a good high school economics grade.
And the fact the 4th highest grade was 73 just shows how useless undergrad econ is, and why master's programs essentially just pull from math grads.
@Gigacasting I had the exact opposite reaction. I'm a dedicated Caplan reader and find this test to be insanely difficult looking. You have to come in just hoping that you write down everything he expects of you (and I know his thought process well). It's ridiculous actually how much thinking is required imo.
@SolarxPvP I think perfect would be hard given the suggested answers and marking, but you get most of the points for just stating the obvious facts.
Maybe he and @MatthewBarnett spelled out what would be most in the spirit of the bet in the case that Matthew won way early? Since they were using expected inflation as a way to set their odds, maybe Matthew would agree it would be unfair to pay early?
I don't know, I'm also frustrated with Bryan Caplan for not conceding his climate bet -- https://manifold.markets/dreev/will-bryan-caplan-win-his-climate-b -- with Yoram Bauman early so I'm not sure how hard I want to defend him on this!
@dreev We didn't spell anything out. I misjudged his intentions initially. I assumed he'd be OK with resolving early but later he said he intended to wait until 2029 to resolve. That's allowed in the terms, so I can't blame him. It's in his rational economic interest to delay resolving as long as possible.
Also, adjusting the terms to take into account expected inflation was his idea. I personally preferred we didn't do that, and I suggested an alternative of scaling our bet by the S&P 500, but he said that was too complicated. Perhaps I could have gotten even more favorable terms, but I didn't want to argue any longer, so I just accepted the ones he offered.
@MatthewBarnett All makes sense! I guess I'd suggest repeating that to him and deciding together what's most in the spirit of the wager. To me it makes sense that the sooner it happens the more wrong he was and so it makes sense for him to effectively owe you more money -- $500 in 2023 dollars vs $500 in 2029 dollars. If his contention is that the wager was explicitly meant to be $500 in 2029 dollars, he should say so.
Reject written exams; retvrn to Socratic dialogue
@JacobPfau How could he design his exams to be AI-open-book for the AIs that will exist in 2029? That's like designing your exams so that students can go next door to ask a professor in the same department for the answers.
@MatthewBarnett I agree this is unlikely, hence 5%. A couple of things that go into the 5%:
- Advanced AI is private or highly regulated, so publicly available AI plateaus at 2025 levels.
- Undergrad econ fundamentally changes in subject matter.
Life tables indicate that Bryan Caplan has about a 4% chance of dying by 2029, which is the main route by which he could win the bet. I also think there's about a 2% chance that AI progress completely halts or reverses due to nuclear war or something. So a 6% chance of winning is reasonable.
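As a quick check on that arithmetic, here is a small sketch that combines the two estimates. The function name is hypothetical, and treating the two events as independent is an assumption made only for this illustration; the comment itself just adds the two figures.

```python
def combined_probability(p_death: float, p_halt: float) -> tuple[float, float]:
    """Return (simple sum, union under an independence assumption)."""
    simple_sum = p_death + p_halt
    # Assumption: the two events are independent, so the chance that at
    # least one occurs is one minus the chance that neither occurs.
    union_if_independent = 1 - (1 - p_death) * (1 - p_halt)
    return simple_sum, union_if_independent


# Figures quoted in the comment: about 4% and about 2%.
total, union = combined_probability(0.04, 0.02)
print(f"simple sum: {total:.4f}, union if independent: {union:.4f}")
```

With probabilities this small the overlap term is tiny, so adding the two figures directly is a reasonable shortcut.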
Bryan told me over email that he intends to take advantage of the 2029 deadline and wait as long as possible to resolve the bet. It's his choice, I guess.