Resolved NO on May 3

This market resolves NO if Meta releases a language model more powerful than Llama 2 (specifically, llama2-70b-chat) for public download during the year of 2024, similar to how Llama 2 is currently available for public download at https://ai.meta.com/llama/. It resolves YES otherwise. A release before Jan 1 2024 does not trigger a NO resolution.

See also https://www.governance.ai/research-paper/open-sourcing-highly-capable-foundation-models and metaprotest.org.

bought Ṁ3,000 NO

This can resolve

@RemNi Weird, I somehow was not pinged on this.

From the title, I first thought that a scenario like "Meta is legally forced to, or voluntarily and publicly commits to, stop publishing at some point in 2024, even though it did release at least one such >llama2-70b-chat model earlier in 2024" would resolve YES. But from the details, it seems the question could be phrased more unambiguously as "Will Meta share the weights for at least one >llama2-70b-chat LLM in 2024?"

Suppose they build Llama 3 in various sizes but only share the 7b and 13b versions, which are better than llama2 7b/13b but not 70b. Is that then considered more powerful than Llama 2?

@HanchiSun I would suggest you specify llama2-70b-chat in your description instead of "more powerful than Llama 2"

@HanchiSun Also, you might want to specify what "more powerful" means. If Code Llama 3 13b is better than llama2-70b-chat at coding, does it count? Or, if llama3-13b-chat is comparable to llama2-70b-chat, what standard will you judge by, such as alpaca-eval or some other benchmark?

predicted NO

@HanchiSun

> Suppose they build Llama 3 in various sizes but only share the 7b and 13b versions, which are better than llama2 7b/13b but not 70b. Is that then considered more powerful than Llama 2?

Only if those models are more powerful than Llama 2.

> I would suggest you specify llama2-70b-chat in your description instead of "more powerful than Llama 2"

Done.

> Also, you might want to specify what "more powerful" means. If Code Llama 3 13b is better than llama2-70b-chat at coding, does it count? Or, if llama3-13b-chat is comparable to llama2-70b-chat, what standard will you judge by, such as alpaca-eval or some other benchmark?

I mean 'more powerful' across a range of tasks, not just a single type of task. I'll use a reasonable-seeming benchmark or combination thereof.