Will I (Peter Wildeford) think that there is an open source LLM as good as GPT3.5 by EOY 2023?
closes Dec 31

On Twitter, @StephenLCasper predicted:

"Like what happened with DALLE2 and Stable Diffusion, I predict that within a few months, a ChatGPT copycat model will be open sourced. And then all of OpenAIs work to make their model safe will be negated by the copycats they directly enabled."

I'm personally skeptical of this prediction.

In this question, I will use my subjective judgement to decide if there has been an open source LLM as good as GPT3.5 by the end of this year.

Grading the quality of an LLM is difficult. I'm planning to evaluate this in large part based on whether I can access this LLM and find it to be subjectively about as good, but I will also be interested in appealing to moderately standard benchmarks like MMLU.

Whether something is "open source" is defined liberally here and will also be determined by my subjective judgement, but generally I will deem something open source if (a) anyone can access it and (b) it wasn't the result of an unintentional leak/exfiltration, regardless of the precise terms of the license.

I will rely on my subjective judgement to evaluate the credibility of cases. If this question is about to resolve, I will allow 48 hours of discussion before resolving.

I will not personally be trading on this market because it relies on my subjective judgement.

Eli Lifland is predicting YES at 77%


ShadowyZephyr is predicting YES at 80% (edited)

This uses the Vicuna benchmark, so it's probably actually as good

Edit: Why so many NO bettors? The 65B version significantly outscored ChatGPT

ShadowyZephyr is predicting YES at 81%

@ShadowyZephyr They even have a game where you can compare the prompts of guanaco-65b to ChatGPT, and they are good! So, I already can’t see the argument for no

Eli Lifland is predicting YES at 56%

https://sambanova.ai/blog/introducing-bloomchat-176b-the-multilingual-chat-based-llm/ claims to be close to GPT-4. https://twitter.com/Tim_Dettmers/status/1661379354507476994 claims to reach 99.3% of the performance level of ChatGPT.

ShadowyZephyr is predicting YES at 80%

@EliLifland Testing the demo, it doesn't seem that good

Peter Wildeford (edited)

I just want to warn market participants that there are many possible metrics for comparing the quality of an open source model with GPT3.5, but I will focus particularly on metrics that get at GPT3.5's advanced capabilities that would be relevant to acting autonomously in the world in harmful ways (e.g., high social reasoning, strong computer programming).

ShadowyZephyr is predicting YES at 58%

@PeterWildeford It says GPT3.5; how is GPT-4 relevant?

Peter Wildeford (edited)

@ShadowyZephyr Sorry, that was a typo on my part. I meant ChatGPT (GPT3.5); edited now to fix.

ShadowyZephyr bought Ṁ40 of YES
