Inspired by the following comment on LessWrong:
Are autoregressive LLMs a doomed paradigm?
YES = Autoregressive LLMs are a dead end. People will improve them as much as possible, but at some point they will stop getting better, without ever reaching a performance level considered human.
NO = They will continue to improve up to human performance.
Quibbles:
1) If autoregressive LLMs lead smoothly to some other paradigm that behaves overall like a different beast but can be clearly traced back, with continuity, to current LLMs, that counts as NO, e.g., if LLMs are stitched together into agents, as people are trying to do now.
2) The question is specifically about autoregressive LLMs. A "diffusion" LLM (I don't know if that makes sense technically, I'm making it up to gesture at the idea) replacing autoregressive LLMs as the dominant approach because it scales better would count as YES (see the toy sketch after these quibbles).
2a) Quibbles (1) and (2) may conflict if there is an arguably continuous transition to something that is clearly not autoregressive in the end. In that case, (2) has priority, i.e., a continuous progression from autoregressive to non-autoregressive LLMs counts as YES.
3) "Human performance" may be ambiguous. I'd lean towards waiting to resolve the market until there's a general consensus.
4) More specific targets other than "human performance" may make sense, but I'd have to define a host of them, and they could turn out to be moot later. "Human performance" looks like the kind of thing that everyone will eventually be forced to confront and decide on.
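To make quibble (2) concrete, here is a toy sketch of the structural difference I'm gesturing at; the random "model" is just a placeholder of my own, meant to show the shape of the two generation loops rather than any real system. Autoregressive generation commits to one token at a time, each conditioned on the prefix so far; a diffusion-style model would instead refine all positions in parallel over a number of denoising steps.

```python
import random

VOCAB = ["the", "cat", "sat", "on", "a", "mat", "<eos>"]


def autoregressive_generate(max_len=10):
    # Autoregressive: emit one token at a time, left to right, each one
    # conditioned on the prefix generated so far (here the "model" is just
    # a random choice standing in for sampling from p(next | prefix)).
    seq = []
    while len(seq) < max_len:
        tok = random.choice(VOCAB)
        if tok == "<eos>":
            break
        seq.append(tok)
    return seq


def diffusion_style_generate(length=8, steps=4):
    # Diffusion-style (non-autoregressive): start from a fully "noisy"
    # sequence and refine every position in parallel over a fixed number
    # of steps, instead of committing to tokens left to right.
    seq = [random.choice(VOCAB) for _ in range(length)]
    for _ in range(steps):
        seq = [random.choice(VOCAB) for _ in seq]  # placeholder denoising pass
    return seq


print("autoregressive: ", " ".join(autoregressive_generate()))
print("diffusion-style:", " ".join(diffusion_style_generate()))
```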
Why is this such a hard binary with only two possible outcomes? OK, I don't mean to read your market uncharitably here, I'm mostly putting this argument out for the sake of discussion, so please take it with a grain of salt. What I'm seeing is, "Either LLMs will absolutely fail or they will end up being the Singularity."
Why can't there be successes for the current architecture of GPT-based LLMs in some domains, while it just absolutely doesn't work in others? I have definitely heard LeCun talk about this all the time; he is attempting to build out the next AI for the next round of hardware a decade from now... yes, I get it. But why does his research invalidate LLMs for every application? We still use inferior technologies for all sorts of things -- fax machines are still in use, for example.
On the other hand, we (humans) may never get to the Singularity. There is a ton of hype, but it may not even be possible, and in particular not with GPTs, because we just don't know the limits of the hallucination problem yet.
I'm not sure how to bet in this market. Again, I'm not trying to knock you personally; I would just like more clarification. Would this ever resolve to a PROB?
@PatrickDelaney As you can probably infer, my expectation is that LLMs are likely enough to crack human-level performance if pushed full steam ahead, since I created a market with such an extreme resolution criterion, "human performance". I see two main options for you:
1) You believe they can't reach human level: bet YES.
2) You believe they can reach human level: elicit your probability of another paradigm taking over first, as defined in the quibbles, and bet to that level (a rough sketch of the arithmetic follows).
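For concreteness, here is a minimal sketch of how those two cases combine into a single number to bet towards; the specific probabilities below are made up purely for illustration, not a recommendation:

```python
# Toy calculation combining the two cases above into a fair price for this
# market. The numbers are illustrative only.

p_cant_reach_human = 0.3  # credence that autoregressive LLMs plateau short of human level
p_superseded_first = 0.4  # credence that, even if they could get there, a
                          # non-autoregressive paradigm takes over first (quibble 2)

# YES resolves if they plateau, or if they are superseded before getting there.
p_yes = p_cant_reach_human + (1 - p_cant_reach_human) * p_superseded_first
print(f"bet the market towards {p_yes:.0%}")  # -> 58%
```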
@PatrickDelaney The point is to avoid specific benchmarks. I feel confident that at some point it will just be good at everything, and the general consensus will be "yeah, it's just as good as a human, as judged by humans' subjective opinion from dealing with it".