Will Llama-4 be (open sourced and) as good as GPT-4?

Resolved YES on Apr 5

This will be based on whatever Meta calls Llama-4, whether or not it deserves that name. If Meta renames its next larger LLM so that the name does not include 'llama', I will use my best judgment on whether it counts. If Meta does not release a relevant model by EOY 2025, this resolves to NO. If the model is not open sourced, it does not count.

By default will judge based on the leaderboard here: https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard

Clarification: This will compare to GPT-4 versions that existed at market creation. At this point, this is 99% a market on whether Llama-4 will exist and be an open model; I would be super surprised if it wasn't good enough on Arena.

I will resolve once the model has been on the leaderboard for 7 days (if it is close, to allow ratings to settle), or whenever the resolution is obvious in either direction for any reason. If I feel the leaderboard is clearly wrong, or it is not available at the time and the answer is non-obvious, I will consult experts and/or use a Twitter poll.

OpenAI

Technical AI Timelines

GPT-4 speculation

People are also trading

When will an open-source LLM be released with a better performance than GPT-4?

Will OpenAI's autonomous agent be based on GPT-4?

bought Ṁ3,000 YES

they've openweighted llama 4 maverick and it seems clearly better than gpt4 at market creation, but they probably won't openweight llama 4 behemoth (the 2T model) which will likely be way better than gpt4 at market creation. I'm personally guessing that the fact that maverick is better and open weights counts for a YES resolution but i'm not sure

@Bayesian We still have to wait 7 days, but free money at this point

Currently, there are multiple GPT-4s being ranked by Elo in the arena. Which are we comparing to Llama 4?

• ChatGPT-4o-latest (2024-09-03)
• GPT-4o-2024-05-13
• GPT-4o-mini-2024-07-18

[1] Are future GPT-4 models included in the comparison or just one of the existing ones being ranked?

[2] Will you compare the highest GPT-4 Elo against the highest Llama Elo, or lowest against lowest, or the lowest GPT-4 against the highest Llama 4?

[3] Please specify. Also, is there a tie-breaker in the rare case that the models are tied in Elo?

Thank you, and please add these instructions to the market to clear any confusion.

@nixtoshi I have made this very clear now.

(And not that it is going to happen, but 'as good' means it only has to tie the Elo number)
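
The resolution criteria stated above can be sketched as a small decision function. This is only an illustration under stated assumptions, not how the market was actually resolved: the `Candidate` record, its field names, and the `resolve` function are hypothetical, and the 7-day settling period and the expert/poll fallback are left out. It assumes it is evaluated at resolution time, so a missing release means no relevant model arrived by the deadline.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Deadline from the market description (EOY 2025).
DEADLINE = date(2025, 12, 31)


@dataclass
class Candidate:
    """Hypothetical record for the model Meta releases; names are illustrative."""
    release_date: Optional[date]   # None if no relevant model was released
    weights_available: bool        # "If I can get the weights it counts"
    arena_elo: Optional[float]     # None if not yet on the leaderboard


def resolve(candidate: Candidate, gpt4_elo_at_creation: float) -> str:
    """Return "YES", "NO", or "PENDING" when there is no Arena rating yet."""
    # No relevant release by the deadline resolves NO.
    if candidate.release_date is None or candidate.release_date > DEADLINE:
        return "NO"
    # A model without obtainable weights does not count.
    if not candidate.weights_available:
        return "NO"
    # Without a leaderboard rating, this sketch cannot decide.
    if candidate.arena_elo is None:
        return "PENDING"
    # 'As good' means a tie in Elo is enough, hence >= rather than >.
    return "YES" if candidate.arena_elo >= gpt4_elo_at_creation else "NO"
```

For example, with made-up Elo numbers, `resolve(Candidate(date(2025, 4, 5), True, 1250.0), 1250.0)` returns `"YES"` because a tie counts, while the same call with `weights_available=False` returns `"NO"` regardless of rating.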

How is this only 55%? Llama 3 405B should be GPT-4 level, so Llama 4 should obviously be much better.

Because it has to be better AND open-source.

Which GPT-4? The GPT-4 that's serving now is miles ahead on all of the benchmarks compared to what was originally released.

@jonsimon See note below.

Note the dispute in the Llama-3 market. I will apply whatever is decided there to this market as well. Which means that this is now effectively 'Will Llama-4 be open sourced?'

Do you consider the current Llama 2 to be "open sourced" even though it contains a non-commercial clause?

@AdamTreat https://github.com/facebookresearch/llama/blob/main/LICENSE

"2. Additional Commercial Terms. If, on the Llama 2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee's affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights."

@AdamTreat Yes. If I can get the weights it counts.

@ZviMowshowitz ok, then the question is do you have (or have any expectation of having) greater than 700 million monthly active users ;)
