DeepMind's DiLoCo - how big of a blow to compute governance?
26% chance · closes 2040

Subjective for now as an MVP, might operationalize further/involve others in resolving as/if significant investment comes in - let me know your ideas!

Resolves Yes if compute governance turns out to be negligibly useful for reducing catastrophic AI risk because of DiLoCo style distributed training (or subsequent methods that build on it).
Resolves No if DiLoCo style distributed training turns out to have negligible impact on the success of compute governance.
Resolves N/A if compute governance turns out to be useless for reducing catastrophic AI risk regardless of DiLoCo.
But more likely than not, this will resolve to X%: my eventual estimate of how much less effective compute governance is given that DiLoCo style distributed training is possible, relative to a world where it wasn't.

My current thinking:
- There's a small chance that distributed training ultimately makes compute governance unviable. There's a reasonable chance that it makes it moderately less effective, by reducing the concentrations of compute you need to target by, say, an order of magnitude.
- From my read, and from chatting with a few compute governance folks, the DiLoCo paper seems moderately overhyped: it seems you still want workers colocated (which will become a bigger deal if models continue to scale), and some of the empirical results aren't apples-to-apples comparisons (e.g. they change optimizers between baselines).

I'll start exiting my position if this hits 20 traders, to avoid COI in resolving.

---

Here's some background (thanks Bing):

Compute governance is the subfield of AI governance concerned with controlling and governing access to computational resources. One of the main challenges of compute governance is to prevent the misuse of powerful AI models by malicious actors, such as hostile states or terrorists. To do so, compute governance advocates for restricting access to the most advanced chips and supercomputers that are needed to train large foundation models.

DiLoCo is a distributed optimization algorithm that enables training of language models on islands of devices that are poorly connected. DiLoCo reduces communication between devices by up to 500 times compared to standard data-parallel training, while achieving similar performance. DiLoCo also allows for dynamic and heterogeneous resources, meaning that devices can join or leave the training process at any time.
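To make the communication-reduction claim concrete, here is a minimal toy sketch of DiLoCo-style training on a least-squares problem: each worker runs many local optimizer steps with no communication, and only the resulting parameter deltas are averaged once per outer round. This is an illustration of the general local-steps-plus-outer-sync pattern, not the paper's exact recipe (DiLoCo uses AdamW as the inner optimizer and Nesterov momentum as the outer one); all hyperparameters below are made up for the toy problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares objective sharded across K poorly connected workers.
# Illustrative only: real DiLoCo uses AdamW inner steps and a Nesterov
# outer optimizer on transformer language models.
K, H, T = 4, 50, 20            # workers, inner steps per round, outer rounds
d = 8
A = [rng.normal(size=(100, d)) for _ in range(K)]
x_true = rng.normal(size=d)
b = [a @ x_true for a in A]

theta = np.zeros(d)            # shared (outer) parameters
momentum = np.zeros(d)
inner_lr, outer_lr, beta = 0.01, 0.7, 0.9

for t in range(T):
    deltas = []
    for k in range(K):         # runs in parallel in a real deployment
        local = theta.copy()
        for _ in range(H):     # H local gradient steps, zero communication
            grad = A[k].T @ (A[k] @ local - b[k]) / len(b[k])
            local -= inner_lr * grad
        deltas.append(theta - local)          # "outer gradient" (pseudo-gradient)
    outer_grad = np.mean(deltas, axis=0)      # one all-reduce per outer round
    momentum = beta * momentum + outer_grad   # momentum-style outer step
    theta -= outer_lr * (outer_grad + beta * momentum)

loss = np.mean([np.mean((a @ theta - bi) ** 2) for a, bi in zip(A, b)])
```

The communication saving falls out of the structure: with H = 50 local steps per round, workers exchange parameters 50× less often than fully synchronous data parallelism (the paper reports up to ~500× with larger H), at the cost of the averaged deltas being a stale, approximate gradient.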

DiLoCo might be a blow to compute governance because it makes it easier for anyone to train LLMs without relying on centralized and regulated supercomputers. This could undermine the efforts of compute governance to ensure the safe and ethical use of AI models. DiLoCo could also weaken the export controls and sanctions that countries rely on to limit their adversaries' access to advanced chips and supercomputers.


Working for bigger models now: https://arxiv.org/abs/2506.21263
