
Resolves according to when I first get an individual AI to do the majority (>5 of the 10) of the tasks below. I will not put much effort into prompt engineering/elicitation; the spirit of this question is to get at when the tasks below will be easy to do.
All answers after the year in question will resolve to YES.
I may trade on this market, but commit to having <50 mana invested in this market at any time by cost-to-me (I would like to record my beliefs, and am not particularly concerned with profit other than that). If requested, I will post the AI interactions which I used to resolve the question insofar as possible.
1. PyTorch (or Future DL Framework) Code Implementation
- Reliably implement PyTorch code upon request without user intervention, though scaffolding in the user interface is acceptable. At least 75% of the code must be ready-to-go.
- Implement (straightforward, but uncommon) code when given a library's documentation in context.
2. Writing Improvement
- Anticipate and address critiques that peers would raise, with at least 50% being useful.
- Propose clarifications that are usually accepted as changes by me.
3. Bullet Points to Text
- Write comprehensive content (blog post style) based on provided bullet points (50% success rate).
4. Learning Assistant
- Answer questions at the level a PhD-student TA would.
- Propose useful Anki cards, with the majority being accepted.
- Enhance Anki interactivity by rephrasing cards to cover the same topic slightly differently, aiming for a 90% acceptance rate.
5. Therapy Alternative
- Provide therapy consultations that are more useful than those from my current therapist.
6. Auto-Podcast Creation
- Distill economics and philosophy papers into podcast format for easy listening. Success is required 5 times, with the number of attempts depending on my willingness, which in turn depends on the LLM's quality.
7. Peer Review
- Offer feedback that surpasses the utility of critiques typically given by PhD students, especially in identifying missing experimental evidence or motivation, and possible graphic improvements.
8. Personal Activity Suggestions
- Suggest new activities (e.g., meditation techniques, exercise routines, movies) on a monthly basis that I actually engage in.
- Provide reliable book recommendations based on personal descriptions and additional information gathered, aiming for a higher success rate than other services.
9. Brainstorming/Debate Partner
- Serve as the go-to option for brainstorming research ideas, composing cover letters, iterating on ideas/arguments, etc.
10. Creative Writing Generator
- Generate complete short stories from an outline or the first page, with the quality being high enough for me to willingly read the whole thing.
The sub-points are somewhat flexible: I'll try to stick with them, but if it becomes clear that a model is very useful (for me) at the 'headline' task, I'll consider that sufficient.
🏅 Top traders
| # | Total profit |
|---|---|
| 1 | Ṁ110 |
| 2 | Ṁ99 |
| 3 | Ṁ86 |
| 4 | Ṁ37 |
| 5 | Ṁ25 |
I've used LLMs recently for music recs and creative writing. I've also used AI to enhance my physio's routine, so the activity-advice criterion is also fulfilled. There's also a new GPT-based service that does comprehensive peer review that I suspect is better in many cases than a random same-department student's review. I also like bullet-point to text assistance. Other bullets are also variously fulfilled. So this resolves to this year.
For completeness, I'd note that podcasting probably isn't at a good enough level for me to listen to. The writing assistant thing is borderline; the way I wrote that criterion was quite strong.
@OlegEterevsky Hard to say; you can take a look at my bets for my opinion. Iterative in the sense that I could imagine a scaled-up GPT with similar training qualifying, sure.
Numbers 1 (code gen) and 5 (therapy) are the only two on this list where I currently find GPT useful. GPT-4 does not meet the bar for competency as described in this question even in those domains, though.