To be judged as an LLM made by a Chinese organization that ties or surpasses the leading LLM by OpenAI, Anthropic, or Google on the leaderboard here.
Additional Information
China is investing heavily in the development of large language models (LLMs) and is home to many AI-oriented companies and LLM applications. Chinese tech giants like Baidu, Alibaba, Tencent, and SenseTime have already released their generative AI (GAI) products, and the Chinese government aims to be an AI leader by the 2030s. While there isn't direct information indicating whether China will surpass or match OpenAI, Anthropic, and Google by the end of 2024, given the substantial investments and efforts, it's plausible that China will remain competitive in the LLM race.
Concurrently, OpenAI, Anthropic, and Google have been progressing rapidly in the LLM race. OpenAI led the initial LLM boom with its GPT-3 model, Anthropic is focusing on making LLMs more transparent, safe, and beneficial, and Google's Pathways Language Model (PaLM) has surpassed GPT-3 in parameter count. This suggests that the race in LLM development continues to be highly competitive.
yet another lesson on why to bet small unless you really understand the scoring method - thankfully I only bet 10 on this
there's that bad gpt-4 release that every model beats (for contests about whether X has a model that surpasses an OpenAI model)
and here - there's somehow a world in which the Yi Lightning model beats the GOAT claude-3.5-sonnet
pretty much everyone is willing to pay a premium for claude-3.5-sonnet lol
I'm confused. "To be judged as an LLM made by a Chinese organization that ties or surpasses the leading LLM by OpenAI, Anthropic, or Google on the leaderboard here."
In the leaderboard, the top 4 are all by OpenAI or Google. Why was this resolved as yes?
@LeonLang If the task was just to surpass the best model of any of those three companies, I really think the resolution criteria should have been stated MUCH clearer.
@jacksonpolack I think even based on the description and the author's comments (before his most recent one), the fact that he intended this resolution was unclear, so I don't know if changing the title would have fixed people being unhappy with this resolution.
Not upset by the outcome; however, China is not competitive against those 3. Few if any businesses or developers are switching from one of those 3 providers. While Chinese models may have comparable performance, they are not competitive. In the same way, xAI is also not competitive: simply building a performant model is not enough.
God fucking dammit i gotta stop betting on these markets.
Title: Will <entity> have the absolute best LLM??
Description: <entity> must have an llm that beats GPT 2 running on a potato
The creator has stated "it would have to be better than the best performing LLM from any of the above mentioned, but not all"
Resolves yes. Yi is now above the best Anthropic model.
@nikki I didn't see the "as soon as" condition. It's possible the next Claude release by end of year will surpass 01ai again. How should we resolve that case? Going by the first paragraph of the description, shouldn't it still be a NO?
oh i see @Ledger commented before. Now I'm confused because I invested based on the description and didn't read old comments...
I also ended up mistakenly delaying selling my NO shares since I didn't read through the comments to find the extra 'as soon as' condition that's not in the description and since I was thus trying to condition on P[Claude 3.5 Opus release by EOY] (because Yi-Lightning debuted above* Claude 3.5 Sonnet on lmarena.ai). Oh well.
* Edit: Note that this is not the case for the lmarena.ai Style Control-weighted ranking (but this is underspecified in any resolution criteria information).
Yep, looks like this resolves to YES.
FWIW to those that interpreted the resolution criteria as ON 12/31 rather than BY 12/31, sorry. It says BY end of year in the title, and there is further confirmation in the comments. But, of course the mods can resolve this however they see fit.
@nikki FWIW I saw both those comments but interpreted them both differently.
With regard to the second statement, the plain meaning of the "first appearance" is the one at the top of the list. Does Yi-Lightning show up above the "first appearance" of a model by OpenAI, Google, and Anthropic? Resolving now only makes sense if you interpret "first appearance" in a very specific way: "looking from the bottom, but ignoring models that are not the best model from that company".
The first statement is more ambiguous, but I think the plain meaning also goes against the resolution. The "best performing model from any of the above mentioned [OpenAI, Google, Anthropic]" is o1-preview, which Yi-Lightning is not better than. What the author should have said was "the best performing model from each of the above mentioned" or "the best performing models from any of the above mentioned". I interpreted the "not all" part of the comment to be referring to LLMs not from those three companies.
I thought there was some chance I was wrong, which is why I didn't bet below 33%. I wish the creator had been more clear; Chris Billington pointed out the ambiguity and asked specific questions below, which he ignored. But I knew the risk; the main purpose of this comment is to say I think interpretation impacted this market.
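For anyone trying to pin down the two readings argued over in this thread, here is a rough Python sketch. Everything in it is an assumption for illustration: the scores are made-up placeholders chosen only to reproduce the ordering people describe here (o1-preview above Yi-Lightning above Claude 3.5 Sonnet), the Google entry is a hypothetical stand-in name, and the function names are mine, not anything from the leaderboard or the market creator.

```python
from typing import Dict, Tuple

RIVALS = ("OpenAI", "Anthropic", "Google")

# model name -> (organization, score)
# Placeholder scores, NOT real leaderboard numbers; only the ordering matters.
LEADERBOARD: Dict[str, Tuple[str, float]] = {
    "o1-preview": ("OpenAI", 1300.0),
    "google-top-model": ("Google", 1290.0),  # hypothetical stand-in name
    "yi-lightning": ("01.AI", 1280.0),
    "claude-3.5-sonnet": ("Anthropic", 1270.0),
}


def best_score_per_org(board: Dict[str, Tuple[str, float]]) -> Dict[str, float]:
    """Return the highest score held by each of the three rival organizations."""
    best: Dict[str, float] = {}
    for _model, (org, score) in board.items():
        if org in RIVALS:
            best[org] = max(score, best.get(org, float("-inf")))
    return best


def ties_or_beats_any_rival_best(board: Dict[str, Tuple[str, float]], challenger: str) -> bool:
    """Reading A: tie or surpass the best model of at least one rival."""
    challenger_score = board[challenger][1]
    return any(challenger_score >= s for s in best_score_per_org(board).values())


def ties_or_beats_overall_leader(board: Dict[str, Tuple[str, float]], challenger: str) -> bool:
    """Reading B: tie or surpass the single best model across all three rivals."""
    challenger_score = board[challenger][1]
    return challenger_score >= max(best_score_per_org(board).values())


if __name__ == "__main__":
    print("Reading A:", ties_or_beats_any_rival_best(LEADERBOARD, "yi-lightning"))
    print("Reading B:", ties_or_beats_overall_leader(LEADERBOARD, "yi-lightning"))
```

With an ordering like the one described in the comments, reading A comes out true (Yi-Lightning is above the best Anthropic model) while reading B comes out false (it is below o1-preview), which is exactly the gap between the YES resolution and the NO interpretation.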
When I invested in NO, I was thinking that since companies in China have no access to the latest GPUs, it's physically impossible to grow a domestic model. But this 01AI company is well known for distilling and fine-tuning Meta's open-source model, so maybe it's easier to stay in the race as long as you are a country with a large web-service market, I guess.
1. China doesn't have the data that the US has accumulated. The Chinese web is not interconnected; it is severely divided among large companies.
2. Chip supply is cut off. Even large companies like Baidu and Alibaba don't have medium-sized GPU clusters.
3. 01AI had a thousand-GPU cluster. Their work is well known for being built upon Meta's open-source model.
So no other country can survive outside the US ecosystem. 01AI's score is possible because Llama is already there.
So technically 01AI is an exception. Lack of GPUs and lack of data are fatal.
https://manifold.markets/JoshYou/when-will-claude-35-opus-be-release
Once 3.5 Opus is out, Anthropic will be ahead of Yi-01.AI.
This market is overpriced.
@greenglass Agree. If Opus surpasses 01ai, then based on the description of this market it should still be a NO, since no one would tie or surpass the leading models of the three companies.
@greenglass so now it's narrowed down to whether Claude will come out ahead of 01ai by end of year. That's most likely a YES.