Next year, will I think that AI is better than me at math?

Within one year, will there be an AI that can solve any math problem I can (_including_ research math problems) for less money than it would cost to hire me or someone with a similar background as a consultant on the problem (let's say $250/hour)?

In theory I should test this by handing it my grad school work and seeing how it does, but that may be prohibitively expensive. Instead, resolution will be based on my inscrutable whims / general vibes, so consider yourselves warned.

(For my level of math: this is my real name and you can look up my resume, but tl;dr I dropped out of a PhD in ML where about half of my time was spent on PAC learning bounds for causal discovery algorithms. I made it semi-far into the proofs but didn't publish, which is part of why the comparison will be vibes-based. I also did okay on the Putnam, but it's pretty likely that AI is already better than me at competition math, so I don't think that's very relevant.)

  • Update 2024-12-21 (PST): The market will be resolved based on my assessment at market close time (one year from market creation). I will resolve Yes if I think AI is better than me at that time, and No otherwise. (AI summary of creator comment)




@VincentLuczkow how do you feel personally about this? Like on an emotional level?

bought Ṁ30 YES · 5mo

Yes, I think so, provided you have access to o4-class models.


I think it's likely someone will discover classes of problems that o3 at release seriously struggles with, like combinatorics or something

> will there be an AI that can solve any math problem I can

I'm assuming "any" here means "all" rather than "at least one", otherwise a pocket calculator wins lol

If so, this may be near impossible with machine learning, because it can only learn to do stuff based on there being a bunch of it in its training data, right? That may be impossible unless your level of research math becomes a common, publicly posted pastime.

bought Ṁ20 YES · 4mo

@TheAllMemeingEye It only needs to understand the concept, and it can reason through the problem after that.

There are a lot of things not in specific AI models' training data that they can still figure out with some effort.

@Haiku what would you say are some good examples of such things? My understanding is that absence from training data is why, for example, LLMs often struggle with ASCII art.


@TheAllMemeingEye To your question, it depends on what constitutes "not in the training data" (i.e. how close of an example counts), but I think some good examples include:
- Explaining novel jokes
- Playing a simple novel game explained at task time
- Solving novel code challenges
- Solving novel math problems

At some point when you've seen enough, there almost isn't such a thing as novelty, since there's always something that is in some way similar. But that's a property of information, not a property of language models. Humans also usually can't solve types of problems that are extremely novel to them.

I think the ASCII art thing has more to do with the fact that LLMs see the world through 1 dimension, so it's difficult to construct representative 2D images with no practice (i.e. no post-training/RL on ASCII art output). That's roughly the same reason why the ARC benchmark took so long to beat. A model that can beat that benchmark the way it's forced to do it is much more intelligent (in that aspect) than a human. If you trained an LLM much more heavily on ASCII art, it would probably overcome the handicap and be able to produce new and compelling ASCII images despite how difficult it is to do so, because much more of its neural network would be dedicated to memorizing additional layers of useful algorithms for doing so. I think doing this task in 1D would be extremely difficult for most humans.

Intelligence/reasoning is a huge patchwork bundle of various useful algorithms. There are obvious holes in LLM reasoning that haven't been patched yet, but I haven't heard any compelling arguments for why they'll never be patched in that architecture.

I don't really have sources on most of the above, but I really liked this deep dive on whether LLMs can reason:
https://www.youtube.com/watch?v=wXGiV6tVtN0

@Haiku thanks for explaining 👍

Just to be clear, you will resolve No if that is the case around New Year 2026?


@JussiVilleHeiskanen What does "that" refer to? I will resolve yes if I think AI is better than me (at market close time, one year from market creation), and no otherwise.
