Will OpenAI release a GPT>=3 equivalent language model capable of web browsing in response to user queries before January 1st, 2024?
Resolved YES (Feb 9)

For Resolution:


The model must be capable of browsing in response to user queries, not merely during the training phase.

To qualify as YES for a release, either:

A.) access should be provided via openai.com, or a sub-domain thereof, through a ChatGPT-like interface or by API access.

OR

B.) usage of the model should be clearly announced as licensed to a third party (like the relationship between GitHub Copilot and OpenAI Codex), which in turn provides the model to users by the target date.

To be clear:


A model merely fine-tuned by an API end-user for this purpose does not count.

A publication indicating that OpenAI has developed this capability in-house, but which only shows a handful of selected examples, does not count.

In the event the browsing capability is restricted to a fixed off-line cache of pre-archived webpages, I reserve the right to resolve this as N/A.

A version of this market with a shorter timeline is available here:


Market maker Edward Kmett buys Ṁ2,780 in the YES position right before resolving the market YES

predicted NO

I don't get why this resolved Yes?

Super sketchy market resolution with the market maker profiting handsomely...glad I sold out.

bought Ṁ100 of YES

Does an integration of Bing and GPT tech count as YES?

bought Ṁ2,780 of YES
bought Ṁ10 of NO

I'm doubling down on my feedback on this market and giving the same feedback as on your other market, re: ... I notice a lot of people on Manifold don't like being specific; they would rather keep their market open because it costs money to create a market, which is understandable. So, at the risk of being punished rather than rewarded for harsh feedback, the answer to your question is, "No," because that's not how GPT-3 works and that's not how language models work.

A language model is trained on past data and has no new information. An application or wrapper utilizing a language model can be designed to browse the web, or to make it look like it's browsing the web (using stored past data to give the illusion of browsing), but the language model itself does not "browse." A language model such as GPT-3 is a fixed inference model that you call; all of the browsing/scraping happened on a dataset of crawled data before the model was trained and turned into something you can run inference against.

Now, you could design something like ChatGPT such that if you give it a command, "Open a browser and search for Planes, Trains and Automobiles," the ChatGPT-like app can either lie to you and say that it did that when it actually just looked into its past database, or it can interpret your message and assign a probability that it's a "search" signal using GPT-3 or another language model, then put your query into a search engine and feed the results back to you along with its answer, or use those results as an input to its answer. But the language model itself is not doing the browsing, at all.
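For concreteness, here is a minimal sketch of that wrapper pattern; `call_llm` and `web_search` are hypothetical placeholders for a model call and a search API, not anything OpenAI actually ships:

```python
# Hypothetical sketch of the "wrapper" pattern: the language model never
# touches the network; the surrounding application decides when to search,
# runs the search itself, and feeds the results back in as plain text.

def call_llm(prompt: str) -> str:
    """Placeholder for a call to a frozen, pretrained language model."""
    raise NotImplementedError("plug in a model or API of your choice")

def web_search(query: str) -> list[str]:
    """Placeholder for an ordinary search call made by the app, not the model."""
    raise NotImplementedError("plug in a search API of your choice")

def answer(user_message: str) -> str:
    # 1. Use the LLM only to decide whether this is a "search" signal
    #    and, if so, to extract the query.
    intent = call_llm(
        "Reply with SEARCH:<query> if the user is asking for a web search, "
        f"otherwise reply NONE.\nUser: {user_message}"
    )
    if intent.startswith("SEARCH:"):
        # 2. The application, not the LLM, performs the actual browsing.
        results = web_search(intent.removeprefix("SEARCH:").strip())
        context = "\n".join(results[:3])
        # 3. The LLM answers again, now conditioned on the fetched text.
        return call_llm(
            f"Search results:\n{context}\n\nAnswer the user: {user_message}"
        )
    return call_llm(user_message)
```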

bought Ṁ10 of YES

@PatrickDelaney I'm quite familiar with how language models work having spent a significant fraction of the last few years working on and with them.

There's nothing that particularly stops one from equipping a language model with simple superpowers in response to it emitting a token stream in a given format.

A low-key example:

https://twitter.com/f_j_j_/status/1568453579605954560

A more directly relevant example:

WebGPT exists and is effectively just an LLM hooked up to something that replies to it when it issues one of several commands (Search:, Find in page:, Quote:, ...).

https://openai.com/blog/webgpt/

It works by opening a page and spewing the content of that page back into the LLM as tokens, and then, with enough training data, getting it to package things up in a format that effectively leads to citations off the page.
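A rough sketch of that loop, with made-up command names and placeholder helpers (an illustration of the pattern, not OpenAI's actual WebGPT code):

```python
# The model emits a command as ordinary text ("Search: ...", "Find in page: ...",
# "Quote: ..."); an environment executes it and the result is appended back onto
# the model's context as more tokens, until the model decides to answer.

def call_llm(context: str) -> str:
    """Placeholder for the fine-tuned LLM emitting its next command or answer."""
    raise NotImplementedError

def run_command(command: str) -> str:
    """Placeholder environment that handles Search:/Find in page:/Quote: commands."""
    raise NotImplementedError

def browse_and_answer(question: str, max_steps: int = 10) -> str:
    context = f"Question: {question}\n"
    for _ in range(max_steps):
        action = call_llm(context)               # e.g. "Search: webgpt paper"
        if action.startswith("Answer:"):         # model has gathered enough quotes
            return action.removeprefix("Answer:").strip()
        observation = run_command(action)        # page text, search results, etc.
        context += f"{action}\n{observation}\n"  # spewed back in as tokens
    return call_llm(context + "Answer:")         # force an answer at the step budget
```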

The real question is whether they'll package up this behavior and/or an agent fine-tuned to do this sort of citation in a ChatGPT-like context and expose it to users.

sold Ṁ11 of NO

@EdwardKmett All right well I just sold out in response to your comment, because clearly you are defining a language model as, "a language model strapped to an app." That's fine...a lot of people, in fact probably the majority of people paying attention to ChatGPT are conflating the two concepts, so that might actually be a fairer way to resolve the market based upon majority understanding. However, an LLM is different than an app - the blog you listed notes, "GPT-3" is what they refer to as the LLM. WebGPT is something else.

I put together a YouTube video trying to explain the intricacies. https://www.youtube.com/watch?v=whbNCSZb3c8

I appreciate that you are spending a lot of time on LLMs; that's great! It's a fascinating area.

So if you look closer at what that blog post says:

fine-tuned GPT-3 to more accurately answer open-ended questions using a text-based web browser

So, "fine tuning" is what OpenAI calls, simply put, using text to modify a set of parameters in their overall LLM, GPT-X, to give the language a particular character. So you can have an LLM, and then you can fine-tune it to give different flavors of responses. Training on the other hand, is an incredibly expensive process, in the tens of millions of dollars estimated. Fine tuning is arguably a form of inference.

So what they have done is strap an LLM to a web browser and fine-tune that LLM so it works. This is not the same as releasing a new LLM.

predicted NO

@PatrickDelaney

Fine-tuning is arguably a form of inference.

I've never heard fine-tuning described that way in my life. Fine-tuning is still modifying the parameters of the network. The primary attribute distinguishing it from the "main" training phase (e.g., next-word-prediction pretraining) is the scale on which it's done.
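To make that distinction concrete, here is a minimal PyTorch-style sketch; the model, data loader, and hyperparameters are placeholders, and the point is only that fine-tuning reuses the same gradient-update machinery as pretraining, at a much smaller scale:

```python
import torch

def finetune(model: torch.nn.Module, dataloader, max_steps: int = 1_000):
    """Continue training a pretrained model on a small, task-specific dataset."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # small LR is typical
    model.train()
    for step, (inputs, targets) in enumerate(dataloader):
        if step >= max_steps:
            break
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
        optimizer.zero_grad()
        loss.backward()   # the same kind of parameter update as pretraining,
        optimizer.step()  # just far fewer of them, on far less data
    return model
```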

predicted YES

@PatrickDelaney A public release of WebGPT would not suffice. What would suffice is a model from the GPT-3 or later generation, fine-tuned or trained from scratch by OpenAI, either to do the same sort of task or to interact with the user ChatGPT-style while providing cited references or suggested links that the augmented LLM reads or scrapes live in response to user queries, so long as it met the extra caveat that the fine-tuning or training is done by OpenAI. That caveat explicitly narrows the focus to OpenAI doing the work and/or packaging it up for a partner.

The reason for wanting this to be an OpenAI thing is that otherwise an end-user could just engineer a prompt, fine-tune for a while, provide a limited PoC, and call this done, and that isn't what I'm looking for.

The fact that ChatGPT's prompt seems to internally refer to itself as "Assistant" with "Browsing: disabled" is an indication that they have likely built up the capability in-house.

@jonsimon You're right, I was oversimplifying, what you said is more correct.

bought Ṁ40 of NO

They could, but I don't think they will, since they'll see it as a liability risk.
