Will I (Peter Wildeford) think that there is an open source LLM as good as GPT3.5 by EOY 2023?
Closes Dec 31 · 80% chance

On Twitter, @StephenLCasper predicted:

"Like what happened with DALLE2 and Stable Diffusion, I predict that within a few months, a ChatGPT copycat model will be open sourced. And then all of OpenAIs work to make their model safe will be negated by the copycats they directly enabled."

I'm personally skeptical of this prediction.


In this question, I will use my subjective judgement to decide if there has been an open source LLM as good as GPT3.5 by the end of this year.

Grading the quality of an LLM is difficult. I'm planning to evaluate this in large part based on whether I can access the LLM and find it to be subjectively about as good, but I will also be interested in appealing to moderately standard benchmarks like MMLU.

Whether something is "open source" is defined liberally here and will also be determined by my subjective judgement, but generally I will deem something open source if (a) anyone can access it and (b) it wasn't the result of an unintentional leak/exfiltration, regardless of the specifics of the license.

I will rely on my subjective judgement to evaluate the credibility of cases. Once this question is ready to resolve, I will allow 48 hours of discussion before resolving it.

I will not personally be trading on this market because it relies on my subjective judgement.

Eli Lifland is predicting YES at 77%

https://arxiv.org/abs/2305.15717

The False Promise of Imitating Proprietary LLMs
An emerging method to cheaply improve a weaker language model is to finetune it on outputs from a stronger model, such as a proprietary system like ChatGPT (e.g., Alpaca, Self-Instruct, and others). This approach looks to cheaply imitate the proprietary model’s capabilities using a weaker open-sourc…
ShadowyZephyr is predicting YES at 80% (edited)

https://twitter.com/Tim_Dettmers/status/1661379354507476994
This uses vicuna benchmark so it's probably actually as good

Edit: Why so many NO bettors? The 65B version significantly outscored ChatGPT.

ShadowyZephyr is predicting YES at 81%

@ShadowyZephyr They even have a game where you can compare the prompts of guanaco-65b to ChatGPT, and they are good! So, I already can’t see the argument for no

Eli Lifland is predicting YES at 56%

https://sambanova.ai/blog/introducing-bloomchat-176b-the-multilingual-chat-based-llm/ claims to be close to GPT-4. https://twitter.com/Tim_Dettmers/status/1661379354507476994 claims to reach 99.3% of the performance level of ChatGPT.

Introducing BLOOMChat 176B - The Multilingual Chat based LLM
SambaNova and Together are excited to announce the public release of BLOOMChat, a 176 Billion parameter multilingual large language model (LLM) optimized for chat applications. This is the largest chat-aligned open source model and the first one that is built on top of a multilingual pre-trained LLM…
ShadowyZephyr is predicting YES at 80%

@EliLifland Testing the demo, it doesn't seem that good

Peter Wildeford (edited)

I just want to warn market participants that there are many possible metrics for measuring the quality of an open source model vs. GPT3.5, but I will be particularly focused on metrics that get at GPT3.5 having advanced capabilities that would be relevant to acting autonomously in the world in harmful ways (e.g., high social reasoning, strong computer programming).

ShadowyZephyr is predicting YES at 58%

@PeterWildeford It says GPT3.5; how is GPT4 relevant?

Peter Wildeford (edited)

@ShadowyZephyr Sorry that was a typo on my part. I meant ChatGPT (GPT3.5) - edited now to fix

ShadowyZephyr bought Ṁ40 of YES
ampdot

Do you mean ChatGPT-3.5?


Related markets

There will be an open source LLM approximately as good or better than GPT4 before 2025 (72%)
Will a LLM considerably more powerful than GPT-4 come out in 2023? (23%)
China will make a LLM approximately as good or better than GPT4 before 2025 (40%)
Will Open Source LLM's Beat Out The Vast Majority of Google and/or OpenAI/Microsofts's Moat by end of June 2024? (35%)
Will Google have a better LLM than OpenAI by 2025? (41%)
Will MPT be the best open LLM? (2023) (27%)
Will the entirety of Quora be incorporated into a LLM like Claude or GPT by the end of 2024? (38%)
Will LLMs' non-language capabilities be used commercially by the end of 2023? (90%)
Will there be an LLM which can do fluent conlang translations by EOY 2024? (30%)
Will a major technology company publicly admit to using a LLM for important decision making before 2025? (23%)
Will someone release a crypto-LLM by 2025? (78%)
Will someone release a crypto-LLM in 2023? (40%)
Will any LLM have roughly GPT-3-level losses with a context window of at least 50,000 tokens before April of 2024? (36%)
Will LLM Detection Get Better By The End of 2023? (47%)
Will an LLM have been reported to earn or gain cryptocurrency by EOY 2023? (70%)
Will I start using a non-LLM AI tool on a daily basis before 2025? (78%)
In 2023, will there be a "trough of disillusionment" regarding LLMs? (13%)
Will LLMs be better than typical white-collar workers on all computer tasks before 2026? (42%)
Will an LLM improve its own ability along some important metric well beyond the best trained LLMs before 2026? (57%)
Will any company form a defensible moat around LLM-based AI before 2025? (22%)