Will we have a popular LLM fine-tuned on people's personal texts by June 1, 2024?


resolved Aug 31


Popular for this market = 10 million+ users

Personal texts = DMs, private chats, WhatsApp chats, Discord DMs, LW DMs, etc.




I know of no such thing, and Google doesn't turn up anything in a cursory search. Mod-resolving NO.

bought Ṁ320 NO

@metacontrarian GitHub Copilot only has 1.3 million users, so 10 million people seems steep in comparison.

Fine-tuning on personal text is still very cool and doable, but I think the user adoption requirement sinks this.

If you wanted to do this right now, how would you do it?

How does this resolve?

Utility-wise, this has been true for some time already: there are ample examples of davinci/turbo using private data as a knowledge base by means of embeddings, often via libraries like GPT-index and langchain that fit this exact purpose very well.
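To make the embeddings approach concrete, here is a minimal sketch of retrieving the most relevant private messages and placing them in a prompt. It assumes a toy bag-of-words `embed` function as a stand-in for a real embedding model; `embed`, `top_k`, `build_prompt`, and the sample messages are all hypothetical names and data for illustration, not the API of GPT-index or langchain.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, messages, k=2):
    """Return the k messages most similar to the query."""
    q = embed(query)
    ranked = sorted(messages, key=lambda m: cosine(q, embed(m)), reverse=True)
    return ranked[:k]


def build_prompt(query, messages, k=2):
    """Put the retrieved messages into the prompt as context (no fine-tuning)."""
    context = "\n".join(f"- {m}" for m in top_k(query, messages, k))
    return f"Context from private messages:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical sample data.
messages = [
    "Dinner is at 7pm on Friday at the Thai place.",
    "Can you review my pull request before lunch?",
    "The flight to Berlin leaves at 9am on Monday.",
]

print(build_prompt("When is the flight to Berlin?", messages, k=1))
```

The model's weights never change here; the private data only enters through the prompt, which is why this counts as a knowledge base rather than fine-tuning.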

If the exact criterion is fine-tuning, then that has also happened, since OpenAI has exposed a fine-tuning API for their non-chat models for a while, which I would argue is being used for private data in the vast majority of cases.
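As a rough sketch of what preparing private chats for that kind of fine-tuning could look like: the snippet below turns a chat log into prompt/completion pairs and serializes them as JSONL, the format the fine-tuning API for non-chat models accepted. The chat data, the `chat_to_examples` and `to_jsonl` helpers, and the choice to train only on one person's replies are assumptions for illustration.

```python
import json


def chat_to_examples(chat, me):
    """Build one training example per reply that `me` sent to someone else."""
    examples = []
    for (prev_sender, prev_text), (sender, text) in zip(chat, chat[1:]):
        if sender == me and prev_sender != me:
            examples.append({
                # The prompt ends where the model should start writing.
                "prompt": f"{prev_sender}: {prev_text}\n{me}:",
                # Leading space and trailing newline mark the reply boundaries.
                "completion": f" {text}\n",
            })
    return examples


def to_jsonl(examples):
    """Serialize examples as one JSON object per line."""
    return "\n".join(json.dumps(e, ensure_ascii=False) for e in examples)


# Hypothetical chat log as (sender, text) tuples.
chat = [
    ("alice", "Are you coming tonight?"),
    ("me", "Yes, see you at 8."),
    ("alice", "Great!"),
    ("me", "Should I bring anything?"),
]

examples = chat_to_examples(chat, me="me")
print(to_jsonl(examples))
```

Uploading the file and starting a training job are left out, since those steps depend on the provider's current API.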

If the market demands a specific product connecting to APIs for Slack/WhatsApp/Discord/email etc., AND demands fine-tuning, then I think the probability is close to 0%, considering how ineffective that approach is compared to context-length increases combined with embeddings.

@minosu Re: Para 1 -> "fine-tuned on people's personal texts"

Re: Para 2 -> It has to be fine-tuned on people's personal data, and the application doing that has to be popular, which I don't think has happened.