Will LLMs' non-language capabilities be used commercially by the end of 2023?

975Ṁ23k

resolved Jan 6

Resolved

YES

Copywriting, translating, conversing, semantic search, data labelling, text summarisation, and code generation are all established uses of LLMs that are being commercialised. I am interested in whether large language models (LLMs) — which are also referred to as foundation models — will have their capabilities extended beyond the natural language domain, in a reliable and commercially sustainable way. Will I believe this has happened by the end of 2023?

This resolves positively if the nature of its use is extended by being given an interface with other software, but there has to be evidence of it being used (sustainably) for real commercial applications. A simple demonstration will not suffice, nor does it count if its usage is for experimental/R&D purposes (as opposed to direct commercial ones).

If there is software that can design architectural models, or perform accounting tasks, and is fundamentally based on an LLM, that counts. The LLM cannot just be an added feature that does some natural language tasks on the side; it must actually drive the software's core functionality.

A lot rests on what I consider to be "notably different" in nature, but if you give me examples of new capabilities, I'll let you know what I think. Since this is a subjective market, I will not bet in it.

Jan 7, 1:54pm: ~~Will I believe that the nature of commercial LLM usage is notably different by the end of 2023?~~ → Will non-experimental, LLM-based commercial software have new, reliable, non-language capabilities by the end of 2023?

Apr 27, 11:39am: ~~Will non-experimental, LLM-based commercial software have new, reliable, non-language capabilities by the end of 2023?~~ → Will LLMs' non-language capabilities be used commercially the end of 2023?

Apr 27, 11:47am: ~~Will LLMs' non-language capabilities be used commercially the end of 2023?~~ → Will LLMs' non-language capabilities be used commercially by the end of 2023?

Technology

ChatGPT

GPT-4 speculation

New Year's Resolutions 2024

55 Comments

42 Holders

240 Trades

I will resolve this positively unless someone can convince me otherwise in the next couple of days; will set a reminder to log in on Friday or Saturday to resolve it.

predictedYES

clearly Copilot qualifies, and then there's Azure AI, sweep.dev etc

@firstuserhere These all look like they're using LLMs' 'language' capabilities (code generation was explicitly stated to be a 'language' capability in the market description). An LLM running its own code interpreter to do 'non-language' tasks would count as a change in its nature, but code generation itself isn't of a different nature. (Plus, it has to be used in a commercially sustainable way.)

I feel like there ought to be some demonstration of ChatGPT being used commercially as a replacement for software that isn't at all a 'language' task. e.g. as part of a video transcription workflow, or being the default tool for generating the data viz for paywalled articles (but consistently, not as a one-off). Will have a look once I get the chance.

predictedYES

@finn Ok sure, what about GPT-4 vision?

is commercial
is LLM based
does non-language tasks

@firstuserhere I'm looking for it to be used commercially (as opposed to the LLM itself being commercial), so not going to resolve based off of the existence of GPT-4 vision itself (but there's a reasonable chance there's evidence of it being used for commercial non-language tasks).

predictedYES

@finn I am referring to ChatGPT being the commercial use of the LLM.

It is an application that is powered by this LLM, and this LLM's non language abilities are being used. People pay $20 a month to use this application.

@firstuserhere from an earlier comment I made:

I'm not going to resolve it yet because I want to see "evidence of it being used (sustainably) for real commercial applications". I realise ChatGPT is a commercial product on its own, but the intention is to resolve this only if the non-language capabilities of LLM-based software are commercially viable.

In my head this was clearer that ChatGPT wouldn't itself count, but in retrospect I didn't clearly say that. And anyway, I do think the bar has been cleared, since plenty of people will have bought a ChatGPT Plus subscription to be able to use its data analysis, and if some other company made that as a standalone software based on an LLM it would be hard to dispute. Will leave a top-level comment in case a NO bettor wants to argue their case before I resolve it.

Control/interact with local apps using ChatGPT interface: ChatPC

predictedYES

@finnhambly shouldn't this be resolved now?

@firstuserhere have you got any new examples that very clearly fit the criteria? There's probably something out there but I haven't gone hunting for anything yet (I was going to bank on a clear-as-day example showing up so that there's very little ambiguity when I do resolve this market).

I'm erring on the side of waiting, because it usually takes time for these tools to be integrated in a way that sustainably generates revenue

@finnhambly If a company implemented GPT-based explanations of alerts in their commercial software (using pre-engineered prompts to select reliable sources and get useful output), would that qualify as "non-language"?

https://risky.biz/RB703/ - Minute 42 for the interview
(note - I have no position in this market)

@JustNo interesting, and cool to hear about how this stuff is getting integrated. It's being used to explain a cyber security problem to a user, which is very much a language task IMO, but thank you for sharing!

doesn't this https://www.bemyeyes.com/ resolve it already?

predictedYES

their business section is clearly commercial

@jacksonpolack thanks, I've realised the title of this question is slightly misleading given the criteria - I've changed the title now (it's easier to phrase now that GPT4 is multimodal and has plugins)

Be My Eyes is indeed commercial, but I'm wanting to know if the tool they're selling is actually used by other companies to generate revenue (sorry that this wasn't clearer). If there's evidence that it's used for commercial tour guides (or something like that) I could see Be My Eyes resolving this positively.

@jacksonpolack sorry, I realise I didn't really reply to your question properly! You're right about the business section, but that's currently all focused on Be My Eyes' current approach of using human volunteers to help people.

I think my last comment was just a bit dumb by moving the goalposts in an unclear way; businesses regularly paying to use Be My Eyes' virtual assistant should definitely count for resolving this market! There just needs to be some evidence of it — I don't know if any of their business partners use the virtual assistant in place of human volunteers yet.

predictedYES

@finnhambly The GPT-4V(ision) system card says that the product was "Be My AI" and was indeed included in Be My Eyes and used commercially. There's an option for calling a human operator to ensure that whatever the model is saying is correct.

@firstuserhere thanks — it talks about the beta testing group, but I can't see anything to say it's been successfully deployed in its commercial products yet, so wouldn't be ready to resolve this positively until then.

Let me know if there's anything else you've spotted. It's the evidence of the step from non-language capability (eg the advanced data analysis mode, or vision) to consistently generating revenue that's proving hardest to find evidence for.

predictedYES

@finnhambly I was replying to your comment above about Be My Eyes using GPT-4 Vision, and not presenting evidence for a resolution (of which I believe there are 100s, but I'd rather not do free labor unless I have to, and I'm happy to take a 7% return till then). Manifold doesn't have a clear way of showing which particular comment in a chain is being replied to.

@firstuserhere thanks, keep these coming. Most of these are examples of code generation, but the Genmo chat one and the browser automation one would count (I think) if used commercially.

@finnhambly would you resolve this YES now, based on so much stuff that's been seen?

@Dreamingpast if you think there's any evidence of commercial usage that would resolve this positively then please do link to it here!

I've not seen any concrete evidence yet, but I've not been searching for it — my comments below discuss what it'll take for ChatGPT to resolve this positively :)

@finnhambly what about the huggingGPT shared in the comment below by @firstuserhere

predictedYES

@Dreamingpast https://huggingface.co/spaces/microsoft/HuggingGPT is the release!!!

predictedYES

@Dreamingpast it was all about when the release would happen! Now it has, and an LLM can steer 400+ models! and learn to do task planning! and use audio models and math models and speech-to-speech models and image-to-speech and video so coool

predictedYES

@Dreamingpast @finnhambly https://huggingface.co/spaces/microsoft/HuggingGPT

@Dreamingpast I just need evidence of a user using such capabilities as part of revenue generating activities (that aren't experimental/temporary).

I think I could have done better in specifying the focus of this market, but it's meant to be on the nature of its actual usage, rather than its demonstrated capabilities.