How will people run LLaMa 3 405B locally by 2025?
Closes Jan 1
91%  Gaming GPUs + heavy quantization (e.g. 6x4090 @ Q2_K)
65%  Unified memory (e.g. Apple M4 Ultra)
60%  Tensor GPUs + modest quantization (e.g. 4xA100 2U rackmount)
60%  Distributed across clustered machines (e.g. Petals)
41%  Server CPU (e.g. AMD EPYC with 512GB DDR5)
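
For scale, here is a back-of-the-envelope sketch of the weight-memory arithmetic behind these options. The bits-per-weight figures are rough assumptions rather than exact quant-format sizes, and KV cache and runtime overhead are ignored:

```python
# Approximate weight storage for a 405B-parameter model at a given
# average bits per weight. Weights only; KV cache and overhead ignored.
PARAMS = 405e9  # LLaMa 3 405B

def weights_gb(bits_per_weight: float) -> float:
    return PARAMS * bits_per_weight / 8 / 1e9

# Illustrative bits-per-weight values (assumptions, not exact quant sizes):
budgets = {
    "FP16 baseline": 16.0,  # ~810 GB: out of reach for local hardware
    "Q8 (~8 bpw)":    8.0,  # ~405 GB: exceeds 4xA100 80GB (320 GB total)
    "Q4 (~4.5 bpw)":  4.5,  # ~228 GB: fits a large-RAM EPYC server
    "Q2 (~2.6 bpw)":  2.6,  # ~132 GB: fits 6x4090 (144 GB total VRAM)
}
for name, bpw in budgets.items():
    print(f"{name}: ~{weights_gb(bpw):.0f} GB")
```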

"Cloud" is a boring answer. User base of interest is somewhere between hobbyists with a budget and companies with a couple of self-hosted racks.
