🐕 Open Source LLMs: Will Any Open Source LLM on the HuggingFace OpenLLM Leaderboard Significantly Gain in Avg Score?

resolved Jan 10

Resolved

NO

Preface:

Please read the preface for this type of market and other similar third-party validated AI markets here.

Third-Party Validated, Predictive Markets: AI Theme

Market Description

Open LLM Leaderboard

As of the time of authoring this, HuggingFace recently released an OpenLLM Leaderboard, which ranks models by benchmark measurements of different kinds of LLM performance. Here's a snapshot from July 2023.

https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

The average score is currently an average of ARC, HellaSwag, MMLU, TruthfulQA.

Resolution Criteria

Average Score will be calculated as the average of ARC, HellaSwag, MMLU, TruthfulQA and no other metrics, regardless of whether those metrics are removed or other ones are added to the above linked page.
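As a rough illustration of this rule, the sketch below averages only the four named benchmarks and ignores anything else reported for an entry. The function name, the dictionary layout, and the sample scores are hypothetical and are not taken from the leaderboard itself.

```python
# Metrics that count toward the Average Score under the resolution criteria.
RESOLUTION_METRICS = ("ARC", "HellaSwag", "MMLU", "TruthfulQA")


def average_score(scores: dict[str, float]) -> float:
    """Mean of the four resolution metrics; any other keys are ignored."""
    missing = [m for m in RESOLUTION_METRICS if m not in scores]
    if missing:
        raise KeyError(f"missing metrics: {missing}")
    return sum(scores[m] for m in RESOLUTION_METRICS) / len(RESOLUTION_METRICS)


# Hypothetical entry: "ExtraBenchmark" stands in for a metric added to the
# page later, which must not change the result.
example_entry = {
    "ARC": 70.0,
    "HellaSwag": 80.0,
    "MMLU": 60.0,
    "TruthfulQA": 50.0,
    "ExtraBenchmark": 99.0,
}
print(average_score(example_entry))  # (70 + 80 + 60 + 50) / 4 = 65.0
```

Fixing the metric list in one constant keeps the calculation stable even if the leaderboard page changes which columns it shows.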

Will any entry on the HuggingFace OpenLLM Leaderboard have an Average score of at least 1.1*(Current Average Top Score) by the end of the year?

As of the time of writing, the current Average Top Score, A = 71.4
A*1.1 = 78.54

This market resolves as YES if the top Average score on the leaderboard is >= 78.54 (that is, A*1.1) at the time of market closing.
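The threshold arithmetic can be checked with a short sketch. It assumes the intended reading is that the top Average score at close is compared against the fixed value 1.1 * 71.4; the constant and function names are hypothetical, and the 76.66 figure is the score reported in the resolution comment on this market.

```python
BASELINE_TOP_AVERAGE = 71.4  # A, the top Average score when the market was written
GAIN_FACTOR = 1.1            # the required 10% gain

# Round to two decimals to avoid floating-point noise in the product.
THRESHOLD = round(BASELINE_TOP_AVERAGE * GAIN_FACTOR, 2)


def resolves_yes(top_average_at_close: float) -> bool:
    """True if the top Average score at close meets or exceeds the threshold."""
    return top_average_at_close >= THRESHOLD


print(THRESHOLD)            # 78.54
print(resolves_yes(76.66))  # False: below the threshold, so the market resolves NO
```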

Technical AI Timelines

Third Party Validated, Predictive Markets: AI

New Year's Resolutions 2024

Third Party Validated, Predictive Markets

New market on this same topic: https://manifold.markets/PatrickDelaney/-openllms-will-any-open-source-llm

We're at 76.66, below threshold, resolves NO.

Trying to resolve market but space has been unavailable.

People are also trading

🐕 OpenLLMs: Will Any Open Source LLM on the HuggingFace OpenLLM Leaderboard Significantly Gain in Avg Score by YE 2024?

EOY 2025: Will open LLMs perform at least as well as 50 Elo below closed-source LLMs on coding?

Will an open-source LLM under 10B parameters surpass Claude 3.5 Haiku by EOY 2025?

What organization will top the LLM leaderboards on LMArena at end of 2025? 🤖📊

What will be true of OpenAI's best LLM by EOY 2025?

Which AI companies will release a SoTA LLM on AidanBench in 2025?

Who will have the best LLM at the end of 2025 (as decided by ChatBot Arena)?

Will an LLM improve its own ability along some important metric well beyond the best trained LLMs before 2026?

Will China have the best open LLM at EOY?

When will a non-Transformer model become the top open source LLM?
