[SOFT CRITERIA, READ DESC] Will all "gpt2-chatbot" models in LMSYS prove to be new, improved models from OpenAI?

Question

IMPORTANT: Read the full criteria. This market uses soft criteria. For stricter criteria, see this market: [HARD CRITERIA, READ DESC] Will all "gpt2-chatbot" models in LMSYS prove to be new, improved models from OpenAI?

BE AWARE: For the purpose of this market, "gpt2-chatbot" means every model that gives reason to believe it is based on, or derived from, the original gpt2-chatbot (for example, by having the string "gpt2-chatbot" in its name). Such reasons might include any statement from Sam Altman, OpenAI, or other reliable sources. If Sam Altman or OpenAI explicitly states that a certain model is not "gpt2-chatbot" or is a much improved version (like a gpt2-2-chatbot that is more akin to GPT-5 than to the GPT-4/4.5 level of current gpt2-chatbot models), I will regard that model as not a "gpt2-chatbot" and will not consider it for this market.

YES and NO criteria apply to all "gpt2-chatbot" models at the same time, as if they were a single model. Thus, if N models are regarded as "gpt2-chatbot", all N must meet the criteria for the market to resolve either YES or NO (otherwise it simply resolves NO past the deadline). For a YES resolution, all of them need to rank in the top 10, be confirmed by OpenAI, and have a higher ELO than gpt-4-turbo-2024-04-09. For a NO resolution, all of them need to be denied as an OpenAI model or claimed, with evidence, by another organization.
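The all-or-nothing rule above can be sketched in a few lines of Python. This is only an illustration of the logic as described, not an official resolution tool: the `ModelStatus` record, its field names, the `resolve` function, and the "OPEN" state for an unresolved market are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModelStatus:
    """Hypothetical record for one model regarded as "gpt2-chatbot"."""
    name: str
    rank: int                  # position on the overall leaderboard (1 = best)
    elo: float                 # overall-category rating
    confirmed_by_openai: bool  # OpenAI confirmed the model as theirs
    disowned: bool = False     # denied by OpenAI, or claimed with evidence by someone else


def resolve(models: List[ModelStatus], baseline_elo: float, past_deadline: bool) -> str:
    """Return "YES", "NO", or "OPEN" under the all-models rule (assumed reading)."""
    # YES requires every tracked model to satisfy all three conditions.
    if models and all(
        m.rank <= 10 and m.confirmed_by_openai and m.elo > baseline_elo
        for m in models
    ):
        return "YES"
    # NO before the deadline requires every tracked model to be disowned.
    if models and all(m.disowned for m in models):
        return "NO"
    # Otherwise the market stays open until the deadline, then resolves NO.
    return "NO" if past_deadline else "OPEN"
```

Here `baseline_elo` stands for the rating of gpt-4-turbo-2024-04-09 at resolution time. Note that a mixed outcome (some models confirmed, others disowned) satisfies neither branch, so under this reading the market would stay open until the deadline.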

(UTC) April 28: The original gpt2-chatbot, introduced just days earlier, is noticed by the community and gains attention.

(UTC) May 1 Update: gpt2-chatbot was removed from LMSYS.

(UTC) May 7 Update: There are two new "gpt2-chatbot" models in LMSYS (battle mode): im-a-good-gpt2-chatbot and im-also-a-good-gpt2-chatbot.

(UTC) May 16 Update: gpt-4o-2024-05-13 is now on the leaderboard with a higher ELO than gpt-4-turbo-2024-04-09. gpt-4o-2024-05-13 is confirmed to be gpt2-chatbot.

Current models regarded as "gpt2-chatbot" by this market: gpt-4o-2024-05-13

"gpt2-chatbot" models are now available at https://chat.lmsys.org and are reportedly at a SOTA quality level. There is speculation that they might be a shadow drop of new OpenAI models, meant to test their performance prior to release.

More info from 4chan: https://rentry.org/GPT2

Resolves as YES:

If gpt2-chatbot is confirmed by OpenAI as a new model that improves upon GPT-4 or another version such as GPT-4.5/5 or similar. ✔️

If gpt2-chatbot is a finetuned version of an older GPT-4 model or even an earlier model by OpenAI, provided it is better than the last version of GPT-4 (achieving a higher ELO in the overall category of the Chatbot arena leaderboard than gpt-4-turbo-2024-04-09). ✔️

Confirmation from OpenAI means either they have explicitly stated it, or they have announced a new model that has been proven to be gpt2-chatbot or a later iteration. ✔️

It counts even if gpt2-chatbot is renamed or removed from the Chatbot arena and reintroduced officially. ✔️

Resolves as NO:

If September 2024 ends without meeting the YES criteria.

If OpenAI denies that gpt2-chatbot is an OpenAI model.

If https://chat.lmsys.org states that it is not a model from OpenAI.

If another person or organization claims (with evidence) that gpt2-chatbot is from them.

OP Trading: Given the objective nature of this market’s resolution, I reserve the right to place bets. However, I will do so only after at least 5 trades or trade orders from different traders have been made, to avoid any unfair advantage.

RESOLVED as YES (May 16, 2024):

[image][tweet][tweet]

Manifold Markets · Accepted Answer

Yes — resolved on May 16, 2024 by Manifold Markets prediction market.

#	Trader	Total profit
1		Ṁ73
2		Ṁ60
3		Ṁ42
4		Ṁ40
5		Ṁ25

🏅 Top traders
