an LLM as capable as GPT-4 will run on a 3090 before Sept 14th
Basic
30
Ṁ17k
resolved Sep 30
Resolved
NO

e.g. WinoGrande >= 87.5%

predicted NO

The Llama 2 paper reports ~80% on WinoGrande vs. 76% for the previous model, so it's still possible improvements will happen in the next week

GPT4 but 98% cheaper: "Our experiments show that FrugalGPT can match the performance of the best individual LLM (e.g. GPT-4) with up to 98% cost reduction or improve the accuracy over GPT-4 by 4% with the same cost." https://arxiv.org/abs/2305.05176
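For context, the cascade idea behind FrugalGPT can be sketched roughly like this: query a cheap model first, and only escalate to an expensive one when a scorer judges the cheap answer unreliable. The models and the confidence scorer below are placeholder stubs I made up for illustration, not the paper's actual components.

```python
# Minimal sketch of an LLM cascade in the FrugalGPT spirit.
# Both "models" and their confidence scores are stand-ins, not real APIs.

def cheap_model(prompt):
    # Placeholder for a small, inexpensive LLM returning (answer, confidence).
    return "maybe-answer", 0.4

def expensive_model(prompt):
    # Placeholder for a large, costly LLM (e.g. GPT-4 class).
    return "good-answer", 0.95

def cascade(prompt, threshold=0.8):
    """Return the cheap answer if it looks reliable enough, else escalate."""
    answer, score = cheap_model(prompt)
    if score >= threshold:
        return answer, "cheap"
    answer, _ = expensive_model(prompt)
    return answer, "expensive"

answer, tier = cascade("Is WinoGrande >= 87.5% achievable on a 3090?")
```

The cost savings come from how often the cheap tier suffices; the reported accuracy depends entirely on how good the scorer is, which the sketch above does not model.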

@b9cd It has nothing to do with the question. It's not even an LLM! It's almost like saying I can access the GPT-4 API from my phone.

predicted YES

@qumeric An LLM cascade is an LLM. An LLM with prompt adaptation is an LLM. An LLM that uses some stored answers from another LLM, or one fine-tuned on another LLM's outputs, is still an LLM. You can use these tricks to adjust the quality of your model while having fewer parameters, right? I'm not saying this is a direct solution to the question, but it definitely seems like related research.

predicted NO

@b9cd I mean it's not a new LLM; it's a technique that uses existing LLMs.

I don't see how the cost savings (which are the most impressive part) are relevant here; they don't really change anything regarding models run on an RTX 3090.

Things like CoT or even an LLM cascade may be relevant, but I am not sure. Would this question resolve YES if we find some way to, e.g., augment prompts that makes LLaMA-30B as capable as GPT-4 is without prompt augmentation?

I agree it's somewhat related research, but seems like weak evidence to me. Interesting paper nonetheless.

A GPT-3 equivalent can run on a Raspberry Pi now

@WieDan The gap between GPT-3 and GPT-4 is 2.5 years. Also, even gpt-3.5-turbo is 30 times cheaper per token than GPT-4.

predicted YES

@qumeric This progress was made in the space of 5 weeks. We're at the start of a Cambrian explosion here.

predicted NO

@WieDan Alphabet had GPT-3-level models in early 2021, possibly even in 2020. Alphabet still has no GPT-4-level model and probably won't have one this year, especially prior to the resolution date. Today is Google I/O though, so there is a chance I will be surprised, but it's slim, and even if it happens, it would only increase my odds here from 2% to perhaps 5%.

Nobody except OpenAI has GPT-4-level models, closed or open source, whether run on a 3090 or on a huge cluster.

What progress do you mean, really? The main improvement is llama.cpp and similar projects, which are just a bunch of optimization tricks that have already mostly reached their limits. LLaMA itself is not even that great a model; it's a bit worse than gpt-3.5-turbo, IMO.

It is likely that GPT-4-level models in the next few years are simply not going to fit into 24 GB even with 4-bit quantization. The absolute limit is around 40B parameters, maybe 50B.
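The arithmetic behind that ~40B limit can be sketched as back-of-envelope VRAM math. The numbers below are rough assumptions (0.5 bytes per parameter at 4-bit, a few GB reserved for activations and KV cache), not measurements:

```python
# Back-of-envelope VRAM estimate for 4-bit quantized weights on a 24 GB card.
# overhead_gb is an assumed allowance for activations / KV cache.

def vram_gb(params_b, bits_per_param=4, overhead_gb=3.0):
    """Approximate VRAM (GB) for a params_b-billion-parameter model."""
    weight_gb = params_b * 1e9 * (bits_per_param / 8) / 1e9
    return weight_gb + overhead_gb

for n in (30, 40, 50, 65):
    verdict = "fits" if vram_gb(n) <= 24 else "does not fit"
    print(f"{n}B params: ~{vram_gb(n):.0f} GB -> {verdict} in 24 GB")
```

Under these assumptions a 40B model squeezes in at roughly 23 GB while 50B does not, which matches the claimed limit; a smaller KV-cache allowance or sub-4-bit quantization would shift the boundary.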

predicted YES

@qumeric Regarding LLaMA quality you're probably right. I also saw this small set of logic-related questions https://docs.google.com/spreadsheets/d/1NgHDxbVWJFolq8bLvLkuPWKC7i_R6I6W/edit#gid=719051075 where all popular open-source LLMs fail compared to ChatGPT, as a demonstration of that.

Would offloading part of the model to the CPU satisfy the resolution criteria?

predicted YES

@cherrvak I assume "run on a 3090" means "run on a consumer PC with a single 3090 and a CPU, primarily using the GPU"
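Partial offloading like this usually means keeping as many transformer layers as fit in VRAM on the GPU and running the rest on the CPU. A toy calculation, with purely illustrative sizes (per-layer weight size and reserved VRAM are assumptions, not measurements of any real model):

```python
# Sketch: split transformer layers between GPU and CPU given a VRAM budget.
# layer_gb and reserve_gb are illustrative numbers, not real measurements.

def split_layers(n_layers, layer_gb, vram_gb, reserve_gb=2.0):
    """Return (gpu_layers, cpu_layers) for a given VRAM budget in GB."""
    budget = vram_gb - reserve_gb  # leave headroom for activations/KV cache
    gpu_layers = min(n_layers, int(budget // layer_gb))
    return gpu_layers, n_layers - gpu_layers

# e.g. a hypothetical 60-layer model at 0.5 GB/layer on a 24 GB 3090:
gpu, cpu = split_layers(60, 0.5, 24)
print(f"{gpu} layers on GPU, {cpu} on CPU")
```

The more layers end up on the CPU side, the slower inference gets, which is why "primarily using the GPU" matters for the resolution criteria.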
