@mods OP's account has been deleted, but AFAICT, a number of open-source models have met this threshold.
Based on this leaderboard, Llama 3.1 405B, Hunyuan Large, and Leeroo all surpass GPT-4 on this benchmark, and Llama 3.1 70B
is closely comparable.
@MugaSofer I followed your link and tried to work out how the criteria compare against the leaderboard. I agree with your explanation. Resolving Yes.
@MugaSofer Actually, hold on, can you show me better evidence that at least one of these is "open-source"?
@Eliza Sure.
Llama 3.1:
Llama 3.1 405B—the first frontier-level open source AI model [...] Until today, open source large language models have mostly trailed behind their closed counterparts when it comes to capabilities and performance. Now, we’re ushering in a new era with open source leading the way. We’re publicly releasing Meta Llama 3.1 405B, which we believe is the world’s largest [...] True to our commitment to open source, starting today, we’re making these models available to the community for download on llama.meta.com and Hugging Face and available for immediate development - Introducing Llama 3.1: Our most capable models to date
Hunyuan Large:
In this paper, we introduce Hunyuan-Large, which is currently the largest open-source Transformer-based mixture of experts model [...] The code and checkpoints of Hunyuan-Large are released to facilitate future innovations and applications. - Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
Leeroo:
Edit:
I hadn't realised this at the time, but I think the version of Leeroo on that leaderboard was making calls to GPT-4(!), so it probably shouldn't qualify. The more fully open source version doesn't seem to pass the specified bar. (That's academic though, since the other two definitely do.) For more info, anyone interested can see The implementation of "Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration" and Orchestration of Experts: The First-Principle Multi-Model System.
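Since the Llama 3.1 announcement quoted above points at Hugging Face for downloads, here's a minimal sketch of what "weights accessible by people outside the organization" looks like in practice. Assumptions on my part: a Hugging Face account that has accepted Meta's gated-access license, and the repo id `meta-llama/Llama-3.1-405B` (check the model card for the exact current name).

```python
# Minimal sketch: pull the released weight shards from Hugging Face.
# Assumes you've accepted the gated-access license on the model page and
# that the repo id below is current -- both are assumptions, not verified here.
from huggingface_hub import login, snapshot_download

login()  # paste a Hugging Face access token when prompted

local_dir = snapshot_download(
    repo_id="meta-llama/Llama-3.1-405B",          # assumed repo id
    allow_patterns=["*.safetensors", "*.json"],   # weight shards + configs only
)
print("weights downloaded to:", local_dir)
```

That the shards come down at all (license click-through notwithstanding) is what matters for this question, not whether the release meets the OSI definition.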
@MugaSofer alright, as long as we aren't going to get "well ackchyually"'d by some users claiming these don't qualify as open source, I'm content to resolve Yes based on the evidence shown in this thread.
I did notice that the creator said, in a comment further down, that this version of "open source" was probably enough, so it seems pretty safe.
Technically this has already been done (through clear data contamination). Should I assume this only resolves Yes if there is no evidence of data contamination? Catch me if you can! How to beat GPT-4 with a 13B model | LMSYS Org
@DanielKokotajlo
I’ll count a model as open source if the model weights are accessible by people outside the organization.
Llama was originally released for researchers, and I would count this as open source for the purposes of this question.
If hackers put it on a torrent, that's open source too.
I realize this deviates from the definition of open source used in OSS communities. The spirit of the question is focused on malicious use and proliferation potential.
@mattt OK, thanks for the clarification. In that case this question is pretty much equivalent to mine, I think: GPT4 or better model available for download by EOY 2024? | Manifold