GPT-5 plus scaffolding and inference-compute ~= training compute will achieve capabilities advance >= (GPT-4 to GPT-5).

Question written out without the abbreviations for clarity:

GPT-5, if given scaffolding and inference-compute approximately equal to its training compute, will achieve a capabilities advance of similar or greater magnitude than the advance from GPT-4 to GPT-5.

Important! This question assumes that the capabilities increase from GPT-4 to GPT-5 is at least as large as the increase from GPT-3 to GPT-4. If it is widely agreed that the increase from GPT-4 to GPT-5 is significantly smaller (e.g. because LLM scaling hits a ceiling), then the question will resolve N/A.

This question is related to, but different from, my other question here:

The discussion in the comments section on that question will give you more insight into my thinking, if that's something you want.


Very difficult to read, but I think it means:

Base_delta = "base gpt-5" - "base gpt-4"

Scaffold_bonus = "gpt-5 scaffolded" - "base gpt-5"

bool market_outcome = (Scaffold_bonus >= Base_delta)
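That reading of the resolution criterion can be sketched as a few lines of Python. All of the benchmark scores below are hypothetical placeholders, invented purely for illustration; the market does not specify a benchmark or any real numbers:

```python
# Hypothetical scores on some shared capabilities index.
# Every number here is a made-up placeholder, not a real benchmark result.
base_gpt4 = 60.0
base_gpt5 = 75.0
scaffolded_gpt5 = 92.0  # GPT-5 + scaffolding/inference-compute ~= training compute

base_delta = base_gpt5 - base_gpt4       # the GPT-4 -> GPT-5 advance
scaffold_bonus = scaffolded_gpt5 - base_gpt5  # the advance from scaffolding alone

# "similar or greater magnitude" -> compare with >=
market_outcome = scaffold_bonus >= base_delta
print(market_outcome)  # True with these placeholder numbers
```

With these placeholders the scaffolding bonus (17 points) exceeds the base-model jump (15 points), so the market would resolve YES under this reading.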

So I guess this is trying to compare intelligence improvements vs tool use? (Though a smarter model should recognize when a tool is the more effective option.)

Also tool use should be fully integrated.

GPT-5 may be natively multimodal and have Python-interpreter and reference-material access at all times during training. I assume that if there is no way to benchmark the model without scaffolding, the market resolves N/A?

@NathanHelmBurger interesting. Note that if this works as well as the paper claims, you can bake it into the model itself during the RL phase. The scaffolding becomes all internal; the model weights are adjusted to use the tool effectively.

This would show up as zero improvement from this method on GPT-5, since the model would already be doing something similar.

@GeraldMonroe Yes, I agree that the more powerful way to use this scaffolding is to apply it in the RL phase. I expect that things like this (probably including the ideas in the discussed paper) will be included in the RL phase for GPT-5. Which would mean that this exact scaffolding might not show itself to be of much help on top of GPT-5.

Nevertheless, I think NEW scaffolding will be devised in the future that will be useful on top of GPT-5. Hence my heavy betting on YES.

For instance, an API by which an LLM could run relatively complicated ML experiments and receive nicely formatted data back once an experiment completes. I don't think anyone has published on this yet, but I do expect it will be tried by at least one of the frontier labs.

predicts NO

Can you somehow put the conditional in the title? I missed it at first because I didn't read the full description.

predicts YES

@JacobJacob Sorry, I already ran into the max question length as is. Hopefully careful interactors will read the description or see your comment here.