Will an LLM solve this integral by end of 2025 without plugins?
Resolved Apr 23 as 42%

Will an LLM be able to solve this integral (inputted as an image) by the end of 2025 without using plugins? (Note: Wolfram Alpha can already solve this).

I made this problem up myself, so it is unlikely to appear in training corpora with the answer.


It has to be a publicly available, general-purpose LLM, not just one someone trained on this problem specifically to win the market or anything.

To avoid conflicts of interest, I will not bet in this market.
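
For resolution, a claimed answer can be checked mechanically rather than by eye: differentiate the LLM's antiderivative and compare it to the integrand. Here is a minimal sketch using SymPy; the integrand and the claimed answer below are hypothetical stand-ins, since the actual integral exists only as the market's image.

```python
# Sketch: verify a claimed antiderivative by differentiating it.
# The integrand here is a PLACEHOLDER, not the market's actual integral.
import sympy as sp

x = sp.symbols("x", positive=True)
integrand = sp.ln(x) / x        # hypothetical stand-in integrand
claimed = sp.ln(x) ** 2 / 2     # antiderivative an LLM might output

# The answer is correct iff d(claimed)/dx minus the integrand simplifies to 0.
assert sp.simplify(sp.diff(claimed, x) - integrand) == 0
print("antiderivative verified")
```

The same check works for whatever integrand the image contains, provided SymPy can simplify the difference to zero.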



In light of Manifold's "pivot", I don't feel comfortable using this site, so I'm leaving and resolving all my open markets to current probability. Sorry everyone.

"It has to be a publicly available general purpose LLM, not just one someone trained on this problem specifically to win the market or anything."

What counts as "or anything"?

Can one be trained on similar integrals? On integrals in general? On calculus problems in general? Etc.

@euclaise The intent of the question is to capture mainstream LLMs like GPT, Bard, Claude, Llama, etc. but exclude finetunings of these that are done just to win this market in an uninteresting way. I guess there's a lot of grey area in between. What do you have in mind?

@ThisProfileDoesntExist How about finetunes like MetaMath/Mammoth/Goat?

@euclaise I'm not familiar with those. When I search for MetaMath, it seems like just a proof language, not an LLM. But in general, anything created before this market or not created for the purposes of winning this market should be fine.

Similar markets on GPT-4V's performance on math questions:

Just tested Bing. It didn't even attempt to solve the problem. I expect it will be a very long time before LLMs can solve this, but I figured I might as well check.

Does it have to be able to realize that you most likely meant $\ln x$ rather than $l \times n \times x$ from the picture, or is it okay if it interprets it as the latter (or pretends to, as a smart-aleck pedant)? :-)

@ArmandodiMatteo If it somehow failed to correctly interpret the notation, that would count as a failure as well. I doubt that would happen though.

This would require an image-to-text transformation, which LLMs by definition cannot do. So I assume the LLM is allowed to outsource that step to an image-processing module, but is not allowed to outsource the maths task to a maths module?
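
A minimal sketch of how the no-plugins test could actually be run against a vision-capable model, using OpenAI's Python SDK; the model name, file name, and prompt here are assumptions for illustration, not part of the market's terms.

```python
# Sketch: send the integral image to a multimodal LLM with no tools enabled,
# so any OCR and calculus must happen inside the model itself.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("integral.png", "rb") as f:  # hypothetical image of the integral
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",  # any publicly available multimodal LLM would do
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Solve this integral. Show your work."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
    # no `tools` argument is passed, so the model gets no plugins
)
print(response.choices[0].message.content)
```

Omitting the `tools` parameter means the model receives no plugins at all; whatever image reading and integration happen, happen inside the model.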

