When will there be an AGI for most tasks I care about?
resolved Nov 11
<2026: Resolved YES
<2027: Resolved YES
<2028: Resolved YES
<2029: Resolved YES
<2030: Resolved YES
<2035: Resolved YES
<2025: Resolved NO

Resolves according to when I get an individual AI to do the majority (>5) of the below tasks. I will not put much effort into prompt engineering/elicitation; the spirit of this question is to get at when the below will be easy to do.

Answers whose cutoff year falls after the year in which the bar is met will resolve YES.

I may trade on this market, but commit to having <50 mana invested in this market at any time by cost-to-me (I would like to record my beliefs, and am not particularly concerned with profit other than that). If requested, I will post the AI interactions which I used to resolve the question insofar as possible.

1. PyTorch (or Future DL Framework) Code Implementation

- Reliably implement PyTorch code upon request without user intervention, though scaffolding in the user interface is acceptable. At least 75% of the code must be ready-to-go.

- Implement (straightforward, but uncommon) code when given a library's documentation in context

2. Writing Improvement

- Anticipate and address critiques peers have, with at least 50% being useful.

- Propose clarifications that are usually accepted as changes by me.

3. Bullet Points to Text

- Write comprehensive content (blog post style) based on provided bullet points (50% success rate).

4. Learning Assistant

- Answer questions at the level a PhD-student TA would.

- Propose useful Anki cards, with the majority being accepted.

- Enhance Anki interactivity by rephrasing cards to cover the same topic slightly differently, aiming for a 90% acceptance rate.

5. Therapy Alternative

- Provide therapy consultations that are more useful than those from my current therapist.

6. Auto-Podcast Creation

- Distill economic and philosophy papers into podcast format for easy listening. Success is required 5 times, with the number of attempts depending on my willingness, which in turn depends on the LLM's quality.

7. Peer Review

- Offer feedback that surpasses the utility of critiques typically given by PhD students, especially in identifying missing experimental evidence or motivation, and possible graphic improvements.

8. Personal Activity Suggestions

- Suggest new activities (e.g., meditation techniques, exercise routines, movies) on a monthly basis that I actually engage in.

- Provide reliable book recommendations based on personal descriptions and additional information gathered, aiming for a higher success rate than other services.

9. Brainstorming/Debate Partner

- Serve as the go-to option for brainstorming research ideas, composing cover letters, iterating on ideas/arguments, etc.

10. Creative Writing Generator

- Generate complete short stories from an outline or the first page, with the quality being high enough for me to willingly read the whole thing.

The sub-points are somewhat flexible. I'll try to stick with them, but if it becomes clear that a model is very useful (for me) at the 'headline' task, I'll consider that sufficient.
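The resolution rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the task dictionary, and the list of cutoff years are hypothetical helpers written for this sketch, not part of any Manifold API, and the subjective judgment of whether a task "passes" is reduced to a plain boolean.

```python
# Hypothetical sketch of the resolution rule; all names here are illustrative.

MAJORITY_THRESHOLD = 5  # the question requires strictly more than 5 of the 10 tasks

# Cutoff years of the "<YEAR" answers listed on the market.
ANSWER_CUTOFFS = [2025, 2026, 2027, 2028, 2029, 2030, 2035]


def tasks_met(passed):
    """Return True when more than MAJORITY_THRESHOLD tasks are marked as passed.

    `passed` maps a task name to a boolean judgment made by the market creator.
    """
    return sum(passed.values()) > MAJORITY_THRESHOLD


def resolve_answers(year_met, cutoffs=ANSWER_CUTOFFS):
    """Map each "<cutoff" answer to YES or NO, given the year the bar was met.

    An answer "<Y" resolves YES when the bar was met strictly before year Y,
    so every cutoff after the year in question resolves YES.
    """
    return {f"<{c}": ("YES" if year_met < c else "NO") for c in cutoffs}
```

For example, `resolve_answers(2025)` gives NO for `<2025` and YES for every later cutoff, which matches the pattern of resolutions shown on this market.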


🏅 Top traders

#  Name  Total profit
1        Ṁ110
2        Ṁ99
3        Ṁ86
4        Ṁ37
5        Ṁ25

I've used LLMs recently for music recs and creative writing. I've also used AI to enhance my physio's routine, so the activity-advice criterion is also fulfilled. There's also a new GPT-based service that does comprehensive peer review that I suspect is better in many cases than a random same-department student's review. I also like bullet-point-to-text assistance. Other bullets are also variously fulfilled. So this resolves to this year.

For completeness, I'd note that podcasting probably isn't at a good enough level for me to listen to. The writing assistant criterion is borderline; the way I wrote it was quite strong.

I will carefully evaluate this before the end of the year but may not get to it immediately. I haven't evaluated carefully enough to be sure, but I'd guess Gemini 2.5 Pro Exp already suffices.

It looks like this would be an iterative improvement over ChatGPT, wouldn't it?

Since the resolution is somewhat subjective, could you write how far in your opinion is ChatGPT 4 from fulfilling these criteria?

@OlegEterevsky Hard to say; you can take a look at my bets for my opinion. Iterative in the sense that I could imagine a scaled-up GPT with similar training qualifying, sure.

Numbers 1 (code gen) and 5 (therapy) are the only two on this list where I currently find GPT useful. GPT-4 does not meet the bar for competency as described in this question even in those domains, though.

bought Ṁ10 YES

Excellent write-up of desired AGI skills!
