Resolution Criteria:
This market resolves YES if Kearm successfully runs DeepSeek-V3 locally on his home GPU rig by 11:59 PM PT on January 5, 2025. Successful execution requires meeting the following benchmark criteria:
The GPU rig must achieve an MMLU-Pro score, via the leaderboard task set, that falls within an inclusive ±3.5% (a 7% total range) of the official DeepSeek-V3 MMLU-Pro (EM, exact match) benchmark score of 75.9.
The MMLU-Pro benchmark must be conducted using the EleutherAI LM Evaluation Harness (a sketch of one possible invocation appears after these criteria), as specified here: https://github.com/EleutherAI/lm-evaluation-harness/tree/main/lm_eval/tasks/leaderboard
The official DeepSeek MMLU-Pro (EM) Exact Match benchmark score and methodology are detailed in the DeepSeek-V3 technical report: https://github.com/deepseek-ai/DeepSeek-V3/blob/main/DeepSeek_V3.pdf
If Kearm fails to achieve this benchmark by the deadline, the market resolves NO, regardless of progress or pending tests.
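For illustration only, here is a minimal sketch of how such a harness run might be invoked through the harness's Python API. The `local-completions` backend, the endpoint URL, and the model name are assumptions; only the `leaderboard_mmlu_pro` task name comes from the linked tasks directory, and extra arguments may be needed depending on the harness version.

```python
# Hypothetical invocation sketch, not the creator's actual command. Assumes an
# OpenAI-compatible llama-server endpoint at the URL below; depending on the
# harness version, additional model_args (e.g. a tokenizer path) may be required.
import lm_eval

results = lm_eval.simple_evaluate(
    model="local-completions",  # assumed backend for a local OpenAI-compatible server
    model_args=(
        "base_url=http://127.0.0.1:8080/v1/completions,"  # assumed llama-server URL
        "model=deepseek-v3"                                # assumed model name
    ),
    tasks=["leaderboard_mmlu_pro"],  # task name taken from the linked directory
)
print(results["results"]["leaderboard_mmlu_pro"])
```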
Original Description:
I am attempting to run a top-3 OVERALL LLM (after more testing, arguably the best model in the world right now), specifically DeepSeek-V3, LOCALLY on my home GPU rig, and to get comparable results (within 7% on MMLU-Pro) against either the official API or OpenRouter by the 5th of January 2025. The current predicted market cost to run this model "locally" (as in, you have a server and 240V in your house) would be 8x H200, so ~$256,000 USD in GPUs ONLY. Now add in everything else: CPU, motherboard, DRAM, and cooling/server rack. Livestreaming this whole shebang at https://x.com/i/broadcasts/1MnxnDkgaMyGO
These are the machine's specs: driver 550.120 (I've found it to be the most stable).
I may drop down to Jammy/22.04 LTS if needed.
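As an aside (not from the original post), the reported driver build can be confirmed programmatically; here is a minimal sketch using the nvidia-ml-py (pynvml) bindings, assuming they are installed:

```python
# Sketch only: confirm the installed NVIDIA driver matches the 550.120 build
# mentioned above. Requires the nvidia-ml-py package (pynvml bindings).
from pynvml import nvmlInit, nvmlShutdown, nvmlSystemGetDriverVersion

nvmlInit()
version = nvmlSystemGetDriverVersion()
if isinstance(version, bytes):  # older bindings return bytes instead of str
    version = version.decode()
print(f"NVIDIA driver: {version}")  # expected: 550.120
nvmlShutdown()
```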
Update 2025-04-01 (PST) (AI summary of creator comment): - Created a working Q8_0.gguf file.
- Benchmarking has not yet been run for MMLU-Pro or ANY OTHER BENCHMARK, not even a simple perplexity test (a generic sketch of that check follows this update).
- Will use the EleutherAI lm-evaluation-harness at commit 888ac292c5ef041bcae084e7141e50e154e1108a for the MMLU-Pro benchmark unless a bug is found.
- Livestreaming continues and screen recording has started.
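As a point of reference (not from the original update), perplexity is just the exponential of the negative mean per-token log-probability, so a minimal check can be run against whatever log-probs the backend reports:

```python
# Generic perplexity check: exp of the negative mean per-token log-probability.
# The log-prob values below are placeholders, not measurements of the Q8_0 GGUF.
import math

def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity from natural-log per-token probabilities."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

sample_logprobs = [-1.2, -0.4, -2.1, -0.7]  # placeholder values
print(f"perplexity = {perplexity(sample_logprobs):.2f}")
```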
To be UTTERLY CLEAR, this ~70-hour work session is a HORRIBLE IDEA for everyone. DO NOT ATTEMPT THIS unless, like me, you have all of the following:
A Team of Medical Professionals Monitoring You:
A psychiatrist who has known you for 20+ years and understands your mental health inside and out.
A neurologist who regularly checks your brain health with MRIs and other tests.
A therapist/psychologist who works with you consistently to ensure your emotional and cognitive well-being.
Rigorous Health Monitoring:
Epigenetic testing to understand how your body responds to stress.
Regular blood work to monitor hormones, immune function, and overall health.
Brain scans (MRIs) every two years to ensure no damage is occurring.
A Healthy Lifestyle When Not Working:
A strict diet and exercise routine to maintain physical and mental resilience.
A disciplined sleep schedule 99% of the time, with no all-nighters or sleep deprivation outside of these rare, highly controlled sessions.
Years of Preparation and Self-Awareness:
A deep understanding of your own body and mind, built over years of experimentation and monitoring.
The ability to recognize when something is off and stop immediately, even in the middle of a session.
A Rare, Unique Physiology:
Genetics and physiology that allow you to handle extreme stress in ways most people cannot. Even with all the above, this is not guaranteed to be safe for you.
@traders Since @Kearm20 seems to be out for now, I've created a market on whether he'll resolve in the next few hours: https://manifold.markets/ChaosIsALadder/will-kearm-resolve-his-market-today
@summer_of_bliss Not really a bug I can squash. I did manage to fit in 3 hours of sleep. It has completed 6 MMLU-Pro prompts.
@jim It's dead, Jim. I don't see how they can get the network to run fast enough to finish in less than 16 hours.
Running the entire MMLU-Pro benchmark requires processing 15,852,596 prompt tokens and producing around 600k+ output tokens. Even if we assume a miracle and @Kearm20 somehow manages to get DeepSeek v3 running with the same efficiency as Vicuna 13B quantized to 4 bits (note: 4-bit quantization would likely fail the MMLU-Pro test), a single RTX 3090 could achieve less than 15 output tokens per second (35 tok/s × 13B / 37B ≈ 12.3 tok/s, rounded up to 15 to be generous).
Being extremely generous, if we assume perfect scaling across four RTX 3090 GPUs, the performance would increase to 60 output tokens per second. This is highly unlikely, as the GPUs lack sufficient memory and are almost certainly bottlenecked by system memory bandwidth.
Let's further assume that prompt tokens can be processed 4x faster than output tokens. Under this assumption, we could reach a hypothetical speed of 240 prompt tokens per second - an improvement by a factor of 1250 compared to 16 hours ago.
Despite all this, the total time required would still exceed 21 hours: (15,852,596 / 240 + 600,000 / 60) / 3600 ≈ 21.1 hours (a short reproduction of this arithmetic follows the sources below).
I'm genuinely excited to see just how fast the model can run on @Kearm20 's machine, but I don't see even the slightest chance of it meeting the extremely tight deadline. I also highly recommend getting at least some sleep.
Sources:
https://github.com/ggerganov/llama.cpp/discussions/8450
"One run had 15852596 prompt tokens and 651281 completion tokens"
https://klu.ai/blog/open-source-llm-models
"Vicuna 13B v1.5 16k has been evaluated using standard benchmarks, human preference, and LLM-as-a-judge, showing promising results. Users have reported good performance with the model, with one user achieving 33-35 tokens per second on an RTX 3090 using the 4-bit quantized model."
https://api-docs.deepseek.com/news/news1226
$0.27/M input tokens vs $1.10/M output tokens implies 4x more resources used per output token than per input / prompt token.
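The estimate above can be reproduced in a few lines of Python; every figure is an assumption stated in the comment (token counts from the llama.cpp discussion, the Vicuna-13B scaling, perfect 4-GPU scaling, and the 4x prompt-vs-output ratio), not a measurement of the actual rig:

```python
# Reproduces the back-of-the-envelope estimate above. Every number is an
# assumption from the comment, not a measurement of the actual rig.
PROMPT_TOKENS = 15_852_596   # from the llama.cpp MMLU-Pro discussion linked above
OUTPUT_TOKENS = 600_000      # ~651k completion tokens in that run, rounded down

single_gpu_tps = 35 * 13 / 37  # Vicuna-13B at 35 tok/s scaled to ~37B active params (~12.3)
output_tps = 4 * 15            # rounded up to 15 tok/s, perfect scaling over four 3090s
prompt_tps = 4 * output_tps    # assume prompt processing is 4x faster than generation

total_hours = (PROMPT_TOKENS / prompt_tps + OUTPUT_TOKENS / output_tps) / 3600
print(f"~{total_hours:.1f} hours")  # ~21.1 hours, past the deadline
```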
@ChaosIsALadder The main problem at the moment is that the eval harness or llama-server (one or both) is bugged. If I make the GGUF file monolithic, it fully loads the CPU and GPU, so we are good until I attempt to connect with the eval harness; then it no longer loads the system correctly. There is also some variance we are seeing in TPS, which is very odd and likely something to do with the implementation, as there is a crazy amount of weirdness in this model.
The only main point I would disagree with is that the imatrix 4bpw GGUF is almost lossless from an MMLU-Pro standpoint, as shown below. The reason is that the native FP8 training and the sparse nature of this MoE make it VERY, VERY good even quantized, which is exciting for the future of AI in general.
Q4_K_M: 77.32
API: 78.05
I will let it run, but the current output is absurdly slow; either a sudden hero implementation of MTP for speculative decoding to 1.8x the speed and/or a fix for what looks like a bugged eval harness would be needed.
I am looking at verbose output and, interestingly, so far it is getting all the questions correct. I will keep working on more aggressive optimizations, attempt to fix the way the eval harness seems to call llama-server, and let the current run continue. There is also some crazy concurrency I could attempt: the eval only needs 2048 context, so 16x concurrency would even be feasible in system RAM, but right now it crashes llama-server after a few moments of running.
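For context on the "almost lossless" claim, the relative gap between the two scores quoted above works out to under 1%:

```python
# Relative MMLU-Pro drop from the quoted Q4_K_M score to the quoted API score.
q4_k_m, api = 77.32, 78.05
rel_drop = (api - q4_k_m) / api * 100
print(f"relative drop: {rel_drop:.2f}%")  # ~0.94%, far below the 7% tolerance
```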
@Kearm20 You ever stay up and hear the birds chirping in the early morning? Those are tough mornings. Just gliding through the day, slightly irritable, slightly unperturbed by anything. I feel you.
@Bayesian This isn't too far off from the SF hackathons I do pretty regularly. It's rather hard when there is something this fascinating to try things on but this is just more public than my other endurance tests haha.
@Kearm20 Then you're sacrificing your long-term cognitive capacities, which isn't something I personally understand being willing to do.