Will there exist a service for full-parameter fine-tuning of Llama 3.1 405B?
2026 · 80% chance

Resolves YES if a service or API exists before 2026 which does the following:

  1. You upload text document(s) for fine-tuning

  2. The service does full parameter fine-tuning on Llama 3.1 405B, without you having to rent GPUs

  3. You can download the fine-tuned model
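To make requirements 1–3 concrete, here is a minimal sketch of what a qualifying workflow might look like. Everything here is hypothetical: the base URL, endpoint paths, and field names are invented for illustration and don't belong to any real provider.

```python
import requests

BASE = "https://finetune.example.com/v1"  # hypothetical provider

# 1. Upload the training corpus (>= 10B tokens in total).
with open("corpus.jsonl", "rb") as f:
    upload = requests.post(f"{BASE}/files", files={"file": f}).json()

# 2. Launch a full-parameter fine-tune of the 405B base model;
#    the provider supplies the GPUs, you never rent hardware yourself.
job = requests.post(f"{BASE}/fine-tunes", json={
    "model": "llama-3.1-405b",
    "training_file": upload["id"],
    "method": "full",  # i.e. not LoRA/QLoRA
}).json()

# 3. Once the job finishes, download the fine-tuned weights.
weights = requests.get(f"{BASE}/fine-tunes/{job['id']}/weights")
with open("llama-3.1-405b-finetuned.bin", "wb") as out:
    out.write(weights.content)
```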

Extra Details:

  • I’ll accept a service which fine-tunes a model other than Llama 3.1 405B, so long as it is more capable (as judged by e.g. the LMSYS leaderboard) and has at least 400B parameters. E.g. if Meta releases a Llama 4 405B, that counts.

  • The service must accept a large corpus of documents, >= 10B tokens in total

  • By full-parameter fine-tuning, I’m excluding methods like LoRA/QLoRA which do low-rank updates. Similarly, I’m excluding memory-efficient optimizers (e.g. Adafactor). The service should use AdamW or a similar optimizer without excessive quantization; all optimizer states, gradients, parameters, buffers, etc. should be in at least 8-bit precision. (A rough memory estimate follows this list.)

  • It should be generally available, e.g. like the OpenAI fine-tuning API. You shouldn’t have to consult with people for your specific fine-tuning job before using it.
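The rough memory estimate referenced above (my own back-of-the-envelope arithmetic, not part of the resolution criteria): AdamW keeps two moment estimates per parameter, so even at the 8-bit floor allowed here, a full fine-tune of a 405B model needs terabytes of accelerator memory before counting activations, which is why "without renting GPUs" is doing real work in this question.

```python
params = 405e9  # Llama 3.1 405B

# Bytes per parameter: weights + gradients + AdamW first/second moments.
floor_8bit = 1 + 1 + 1 + 1         # everything at the 8-bit minimum
typical_mixed = 2 + 2 + 4 + 4 + 4  # bf16 weights/grads, fp32 master copy and moments

print(f"8-bit floor:   {params * floor_8bit / 1e12:.1f} TB")     # ~1.6 TB
print(f"typical setup: {params * typical_mixed / 1e12:.1f} TB")  # ~6.5 TB
```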

If you think such a service exists, please post in the comments below. I’ll make a reasonable effort to confirm that it meets the requirements and resolve if it does.

bought Ṁ30 YES

This is a great question

Possibly, looking into it now

Hm, possibly they aren't offering fine-tuning quite yet, or at least not outside of a preview? Other pages mention only Llama 3.1 70B and smaller, and they don't give pricing for fine-tuning that model as distinct from inference.

I think that’s right. According to this source, fine-tuning isn’t available yet:

“Can I fine-tune the Llama 3.1 405B model? What about other models?

Not yet for 405B Instruct – stay tuned!

Models available to fine-tune today:

  • Deployment as serverless API (MaaS): 8B Instruct and 70B Instruct.

  • Deployment as managed compute: 8B Instruct, 70B Instruct, 8B, 70B.”

But if they do offer a full fine-tuning service sometime in the future, and you could download the resulting model, that would be sufficient to resolve YES.

Does what they offer for smaller models now qualify as full fine-tuning?

Good question, I don’t think so. From the Azure documentation, “We use LoRA, or low rank approximation, to fine-tune models in a way that reduces their complexity without significantly affecting their performance.”

https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/introducing-llama-2-on-azure/ba-p/3881233

This indicates that, at least for Llama 2, you could turn LoRA off for fine-tuning (the animation of them going through the process included a LoRA checkbox in the settings), though I suppose that might not be the case for OpenAI models. Also, if they provide LoRA but let you choose its parameters, you could maybe set the rank equal to the actual model dimension, which would make it equivalent to no-LoRA fine-tuning, if I understand it correctly.

That’s a good point, I didn’t notice that. The Llama 2 example certainly counts as full fine-tuning since you can disable LoRA. And after rereading the LoRA paper, I’m inclined to say that full-rank LoRA counts as well.
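A quick sanity check of the full-rank claim, as a toy PyTorch sketch (the dimension is made up; real weight matrices are thousands wide): when the LoRA rank equals the weight dimension, the factored update B @ A can represent any dense update exactly, so it is as expressive as unconstrained fine-tuning, though the optimizer trajectory can still differ since gradients flow through B and A separately.

```python
import torch

d = 8                       # toy hidden size
target = torch.randn(d, d)  # some arbitrary full-parameter update delta_W

# Full-rank LoRA: rank r = d. One exact factorization: B = delta_W, A = I.
B, A = target.clone(), torch.eye(d)
assert torch.allclose(B @ A, target)  # B @ A reproduces any d x d update
```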

@mr_mino I think Databricks now offers continued pre-training of Llama 405B (which I think is the same as full-parameter fine-tuning?), with an option to run it on serverless compute. However, the documentation is a bit confusing, and it's not offered as pay-per-token yet, just pay-per-"DBU", which I gather is related to how much compute you use.

@Fay42 I think there’s also a data requirement which would disqualify this one:

  • For continuous pre-training, workloads are limited to 60-256MB files.

  • Large datasets (10B+ tokens) are not supported due to compute availability

Contra the requirement:

  • The service must accept a large corpus of documents, >= 10B tokens in total
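Rough arithmetic on the size gap (my own numbers; ~4 bytes per token is a common rule of thumb for English text, not a figure from either document):

```python
tokens = 10e9        # the market's minimum corpus size
bytes_per_token = 4  # rough average for English text (assumption)

corpus_gb = tokens * bytes_per_token / 1e9
print(f"~{corpus_gb:.0f} GB of raw text")                      # ~40 GB
print(f"~{corpus_gb * 1e3 / 256:.0f} files at the 256MB cap")  # ~156 files
```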

@mr_mino whoops, didn't notice that limit - sorry

bought Ṁ50 NO

Do you think this makes sense? You can use the LoRA technique and get 90% of the results at 3% of the cost.

There are some domains (e.g. healthcare, finance, automating various high-paying jobs) where the extra 10% of performance/reliability is worth the cost. But I’d be interested in an unconditional market as well.

What do you mean, "without renting GPUs"? You don't mean free, do you?

I guess it means you pay them a one-time fee and they use GPUs that they own or rent themselves.

I mean that you shouldn't have to rent or buy hardware in order to use the fine-tuning service; it does so on your behalf, e.g. like the OpenAI/Mistral APIs. I expect it to cost money.