
Context on the free variables:
My capability and capacity to get such a system built (co-director of Apart Research with 30+ people reporting to me; previously product manager, game dev, researcher, data scientist)
The capability of next-gen AGI models: for this market, defined as GPT-5 or Claude 4, whichever is released first
The content of my research: see my Google Scholar profile and assume that I'd like to make it even better
Assumptions:
This question assumes that the agent has carried out the full research process by itself, given only an initial research question and any datasets it needs.
Resolution criteria:
An AGI goes through the full research process with empirical experiments, gets results, writes them up in a LaTeX format compatible with the conference, submits the paper with my approval (with purely interaction-functional human assistance), writes and submits reviewer rebuttals including new experiments (with purely interaction-functional human assistance), and gets the paper accepted in my name (or has it rejected, but only because it is AGI-created).
Resolution date:
3 months after the release of the next major version of AGI
Disclaimer: I will disclose the research process to any involved parties.
Ran a few experiments yesterday. Results:
- Claude is great at producing legible LaTeX docs zero-shot (intro + methodology sections, using the AAAI template), but unfortunately it hallucinates content
- With the new architecture, it can adequately incorporate context from papers auto-searched on arXiv into its document writing, leading to a very concrete methods section and no hallucination
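For readers wondering what "context from auto-searched papers" might look like in practice, here is a minimal sketch. It is not the actual architecture used in these experiments, which isn't described here. It assumes the public arXiv query API (Atom feed output) as the search backend; the function names, the prompt wording, and the sample feed are all hypothetical.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"  # public arXiv query endpoint
ATOM = "{http://www.w3.org/2005/Atom}"           # Atom XML namespace prefix


def build_query_url(topic, max_results=3):
    """Build an arXiv API search URL for a free-text topic (hypothetical helper)."""
    params = {"search_query": f"all:{topic}", "start": 0, "max_results": max_results}
    return f"{ARXIV_API}?{urlencode(params)}"


def parse_entries(atom_xml):
    """Extract title and abstract from each <entry> of an Atom feed string."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.iter(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title", default="")
        summary = entry.findtext(f"{ATOM}summary", default="")
        # Collapse the line breaks and repeated spaces that feeds often contain.
        papers.append({"title": " ".join(title.split()),
                       "summary": " ".join(summary.split())})
    return papers


def build_prompt(research_question, papers):
    """Assemble a prompt that grounds the model in retrieved abstracts (assumed wording)."""
    refs = "\n".join(
        f"[{i}] {p['title']}: {p['summary']}" for i, p in enumerate(papers, start=1)
    )
    return (
        f"Research question: {research_question}\n\n"
        f"Retrieved arXiv context:\n{refs}\n\n"
        "Write the methods section in LaTeX. Cite only the numbered papers above."
    )


# Offline usage example with a made-up feed; a real run would fetch build_query_url(...).
sample_feed = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<entry><title>Paper A</title><summary>Does  X.</summary></entry>"
    "</feed>"
)
prompt = build_prompt("How does X affect Y?", parse_entries(sample_feed))
```

Restricting the model to the numbered, retrieved abstracts is one plausible reason grounding like this would reduce hallucinated references, though that is a guess about the mechanism rather than something measured here.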
Overall optimistic about the chances. I expect the biggest challenges to arise in execution, evaluation, and iteration of low-level experiments.