e.g. WinoGrande >= 87.5%
GPT4 but 98% cheaper: "Our experiments show that FrugalGPT can match the performance of the best individual LLM (e.g. GPT-4) with up to 98% cost reduction or improve the accuracy over GPT-4 by 4% with the same cost." https://arxiv.org/abs/2305.05176
@b9cd It has nothing to do with the question. It's not even an LLM! It's almost like saying that I can access GPT-4 API from my phone.
@qumeric An LLM cascade is still an LLM. An LLM with prompt adaptation is an LLM. An LLM that uses some amount of stored answers from another LLM, or one fine-tuned on the outputs of another LLM, is still an LLM. You can use these tricks to improve the quality of your model while using fewer parameters, right? I'm not saying this is a direct solution to the question, but it definitely seems like related research.
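For what it's worth, the cascade idea from the FrugalGPT paper is simple enough to sketch: query cheap models first and only fall back to an expensive one when a scoring function judges the cheap answer unreliable. Everything below (model names, costs, the scorer) is a hypothetical stand-in, not a real API:

```python
# Illustrative LLM cascade sketch: cheapest model first, escalate only
# when a score function deems the answer unreliable. All generate/score
# functions here are toy stand-ins for real model calls.

def cascade(prompt, models, score, threshold=0.8):
    """models: list of (name, cost_per_call, generate_fn), cheapest first."""
    total_cost = 0.0
    for name, cost, generate in models:
        answer = generate(prompt)
        total_cost += cost
        if score(prompt, answer) >= threshold:
            return answer, name, total_cost
    # Nothing passed the threshold; return the last (strongest) answer.
    return answer, name, total_cost

# Toy stand-ins: a weak cheap model and a strong expensive one.
def cheap_model(prompt):
    return "maybe"

def strong_model(prompt):
    return "42"

def toy_scorer(prompt, answer):
    # A real cascade would use a learned quality/confidence scorer.
    return 0.95 if answer == "42" else 0.3

answer, used, cost = cascade(
    "What is 6 * 7?",
    [("cheap", 0.001, cheap_model), ("strong", 0.06, strong_model)],
    toy_scorer,
)
print(used, answer, round(cost, 3))  # → strong 42 0.061
```

The cost savings come from the (assumed) fact that most prompts are answered acceptably by the cheap tier, so the expensive model is only billed for the hard residue.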
@b9cd I mean it's not a new LLM, it's a technique which uses existing LLMs.
I don't see how the cost savings (which are the most impressive part) are very relevant here; they don't really change anything regarding models running on an RTX 3090.
Things like CoT or even an LLM cascade may be relevant, but I'm not sure. Would this question resolve as yes if we find some way to e.g. augment prompts that makes LLaMA-30B as capable as GPT-4 without prompt augmentation?
I agree it's somewhat related research, but seems like weak evidence to me. Interesting paper nonetheless.
@WieDan GPT-3 and GPT-4 are 2.5 years apart. Also, even gpt-3.5-turbo is 30 times cheaper per token than GPT-4.
@qumeric This progress was made in the space of 5 weeks. We're at the start of a Cambrian explosion here.
@WieDan Alphabet had GPT-3 level models in early 2021, possibly even 2020. Alphabet still has no GPT-4 level model and is probably not going to have one this year, especially prior to the resolution date. Today is Google I/O though, so there is a chance I will be surprised, but it's thin, and even if it does happen, it would still only increase my odds here from 2% to perhaps 5%.
Nobody except OpenAI has GPT-4 level models, closed or open source, whether run on a 3090 or on a huge cluster.
What progress do you mean, really? The main improvement is llama.cpp and similar stuff, which is just a bunch of optimization tricks that have already mostly reached their limits. LLaMA itself is not even that great a model; it's a bit worse than gpt-3.5-turbo IMO.
It is likely that GPT-4 level models in the next few years are simply not going to fit into 24GB even with 4bit quantization. The absolute limit is around 40B parameters, maybe 50B.
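The ~40-50B figure follows from simple arithmetic; here is a back-of-envelope sketch, where the overhead allowance for KV cache, activations, and dequantization buffers is my own rough assumption:

```python
# Rough upper bound on parameter count that fits in a given VRAM budget
# at a given quantization bit-width. The overhead_gb figure (KV cache,
# activations, buffers) is an assumed ballpark, not a measured value.
def max_params_billions(vram_gb, bits_per_param, overhead_gb=4.0):
    usable_bytes = (vram_gb - overhead_gb) * 1024**3
    bytes_per_param = bits_per_param / 8
    return usable_bytes / bytes_per_param / 1e9

# RTX 3090: 24 GB VRAM, 4-bit quantization.
print(round(max_params_billions(24, 4), 1))  # → 42.9
```

With a smaller overhead allowance the bound creeps toward 50B; with fp16 instead of 4-bit it drops to roughly 10B, which is why quantization matters so much for this question.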
@qumeric Regarding LLaMA quality, you're probably right. As a demonstration of that, I also saw this small set of logic-related questions https://docs.google.com/spreadsheets/d/1NgHDxbVWJFolq8bLvLkuPWKC7i_R6I6W/edit#gid=719051075 where all popular open source LLMs fail compared to ChatGPT.
@cherrvak I assume "run on a 3090" means "run on a consumer PC with a single 3090 and a CPU, primarily using the GPU".