Will Bryan Caplan win his bet with Matthew Barnett on whether an AI can pass his exams in 2029?


To summarize:

This market resolves to "Yes" if Bryan Caplan wins and the AI fails his exams.
This market resolves to "No" if Matthew Barnett wins and the AI passes Caplan's exams.

Details here: https://betonit.substack.com/p/ai-bet

By January 30, 2029, Bryan Caplan will give his six most recent midterm exams to an AI selected by Matthew Barnett. The AI will be instructed to take those exams.
Bryan will then grade the AI's work, as if it were one of his students. The AI will be allowed to do each exam only once.
If the AI gets an A on at least 5 out of 6 of those exams using the same grading scale as his students, then Bryan owes Matthew $500. Otherwise, Matthew owes Bryan $500. For the purpose of this bet, an A- counts as an A.
Matthew will prepay the $500 in January 2023; the preceding terms have been pre-adjusted to compensate Matthew for expected inflation.
If Matthew suspects that an exam was flawed or grading was unfair, he can appeal to Alex Tabarrok, or another economist agreed upon by both parties, who has final authority to exclude an exam from the pool and replace it with Bryan’s most recent preceding midterm.
If more than four exams in total are excluded, the bet is called off and Matthew receives his $500 back.
If either party is unable to comply with the terms due to death or incapacity, Bryan’s heirs keep the $500.

If the bet is called off, I will resolve as "N/A". That way we can focus the predictions on whether the AI can actually pass the exams rather than on whether it's unresolvable or whether the participants die prematurely.
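The resolution rules above can be sketched in a few lines of Python. This is only an illustrative reading of the terms, not anything the bettors or the market creator published: the function name `resolve_market`, the `Outcome` enum, and the treatment of "A+" as an A are assumptions, and the death-or-incapacity clause is deliberately left out.

```python
from enum import Enum


class Outcome(Enum):
    YES = "Yes"  # Caplan wins: the AI fails the exams
    NO = "No"    # Barnett wins: the AI passes the exams
    NA = "N/A"   # the bet is called off


# An A- counts as an A per the bet terms; counting "A+" is an assumption.
A_GRADES = {"A+", "A", "A-"}


def resolve_market(grades, exams_excluded=0):
    """Hypothetical helper: map six exam grades to a market resolution."""
    # More than four excluded exams calls the bet off.
    if exams_excluded > 4:
        return Outcome.NA
    if len(grades) != 6:
        raise ValueError("the bet covers exactly six midterm exams")
    a_count = sum(1 for grade in grades if grade in A_GRADES)
    # The AI needs an A (or A-) on at least 5 of the 6 exams.
    return Outcome.NO if a_count >= 5 else Outcome.YES


if __name__ == "__main__":
    print(resolve_market(["A", "A-", "A", "B", "A", "A"]).value)
    print(resolve_market(["A", "B+", "A", "B", "A", "A"]).value)
```

Under this reading, a single B among six exams still resolves "No", while two non-A grades resolve "Yes".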

Economics

Famous People




I just tested o1-preview on the two questions that GPT-4 did the worst on (with GPT-4 scoring 3/10 and 4/10).

As far as I can tell, o1-preview would score 10/10 on both of these questions now:

Q3: https://chatgpt.com/share/66e4d340-e374-8005-8adc-6dcc42955cee

Q5: https://chatgpt.com/share/66e4d35c-23bc-8005-a9ad-1cc2950dedd4

Shockingly little thinking is required for this exam: https://betonit.substack.com/p/gpt-retakes-my-midterm-and-gets-an

Almost embarrassing he thought this was anything more than memorizing a couple hundred (questionable) facts

GPT Retakes My Midterm and Gets an A

When the answers change, I change my mind.

@Gigacasting yeah wtf is this midterm. Only way Caplan wins this bet is if he starts teaching more advanced classes. At least have students do the math on the comparative statics. This test could be done by someone with a good high school economics grade.

And the fact the 4th highest grade was 73 just shows how useless undergrad econ is, and why master's programs essentially just pull from math grads.

@Gigacasting I had the exact opposite reaction. I'm a dedicated Caplan reader and find this test to be insanely difficult looking. You have to come in just hoping that you write down everything he expects of you (and I know his thought process well). It's ridiculous actually how much thinking is required imo.

@SolarxPvP I think perfect would be hard given the suggested answers and marking, but you get most of the points for just stating the obvious facts.

https://betonit.substack.com/p/gpt-4-takes-a-new-midterm-and-gets/comment/14199010

People seem very confident that Caplan's midterm wasn't in the training data but this market is not that low (I guess it will correct after I post it, but anyway, what makes you so certain?)

predictedNO

I expect Bryan to take less than 6 years to realise that not conceding for 6 years even after you've clearly made a terrible bet and been comprehensively proven wrong is not a good look, so the actuarial table 4% seems like an overestimate to me.

predictedNO

@AlexL

predictedNO

"never lost a bet (but is obviously going to lose one, he's just refusing to concede)", is a worse record than "has only lost one public bet he's ever made, and in that case he handled it gracefully"

Maybe he and @MatthewBarnett spelled out what would be most in the spirit of the bet in the case that Matthew won way early? Since they were using expected inflation as a way to set their odds, maybe Matthew would agree it would be unfair to pay early?

I don't know, I'm also frustrated with Bryan Caplan for not conceding his climate bet -- https://manifold.markets/dreev/will-bryan-caplan-win-his-climate-b -- with Yoram Bauman early so I'm not sure how hard I want to defend him on this!

predictedNO

@dreev We didn't spell anything out. I misjudged his intentions initially. I assumed he'd be OK with resolving early but later he said he intended to wait until 2029 to resolve. That's allowed in the terms, so I can't blame him. It's in his rational economic interest to delay resolving as long as possible.

Also, adjusting the terms to take into account expected inflation was his idea. I personally preferred we didn't do that, and I suggested an alternative of scaling our bet by the S&P 500, but he said that was too complicated. Perhaps I could have gotten even more favorable terms, but I didn't want to argue any longer, so I just accepted the ones he offered.

predictedYES

@MatthewBarnett All makes sense! I guess I'd suggest repeating that to him and deciding together what's most in the spirit of the wager. To me it makes sense that the sooner it happens the more wrong he was and so it makes sense for him to effectively owe you more money -- $500 in 2023 dollars vs $500 in 2029 dollars. If his contention is that the wager was explicitly meant to be $500 in 2029 dollars, he should say so.

Reject written exams; retvrn to Socratic dialogue

Caplan and Barnett, making a bet,
On whether AI can pass an exam yet.
But if in 2029, it doesn't come true,
Barnett's wallet will be feeling blue.

Six years is a long time to transition midterms to being AI-open-book. I’d give at least 5% to this happening. 10% seems reasonable for this market.

predictedNO

@JacobPfau How could he design his exams to be AI-open-book for the AIs that will exist in 2029? That's like designing your exams so that students can go next door to ask a professor in the same department for the answers.

@MatthewBarnett I agree this is unlikely, hence 5%. A couple of things that go into the 5%:

- Advanced AIs are private or highly regulated, so publicly available AI plateaus at 2025 levels.

- Undergrad econ fundamentally changes in subject matter.

predictedNO

Life tables indicate that Bryan Caplan has about a 4% chance of dying by 2029, which is the main route by which he could win the bet. I also think there's about a 2% chance that AI progress completely halts or reverses due to nuclear war or something. So a 6% chance of winning is reasonable.
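As a rough check on the arithmetic in the comment above, the two risks can be combined in Python. This is a minimal sketch that assumes the two events are independent, which the comment does not state; the function name `chance_of_either` is hypothetical.

```python
def chance_of_either(p_a, p_b):
    """Probability that at least one of two independent events occurs."""
    return 1 - (1 - p_a) * (1 - p_b)


p_death = 0.04  # life-table estimate quoted in the comment
p_halt = 0.02   # commenter's estimate that AI progress halts or reverses

# Simple addition, as in the comment.
print(f"sum: {p_death + p_halt:.4f}")
# Union under the independence assumption.
print(f"independent union: {chance_of_either(p_death, p_halt):.4f}")
```

With probabilities this small the overlap between the two events is negligible, so simply adding them, as the comment does, gives nearly the same figure.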

Bryan told me over email that he intends to take advantage of the 2029 deadline and wait as long as possible to resolve the bet. It's his choice, I guess.

https://betonit.substack.com/p/gpt-retakes-my-midterm-and-gets-an

predictedNO

@LachlanMunro probably won't pay out for a while but seems inevitable at this point

predictedYES

Midterms that he puts online before the release of the AI will likely contaminate the training set. That’s probably what happened with GPT-4 on the one midterm they tested.