A pure binary LLM will exist by end of 2024
71% chance

A pure binary neural net is a neural network represented as pure combinational logic. Naively unrolling multi-bit floating-point or integer multiplication to binary does not count; the weights and activations themselves must be binary. I will arbitrarily declare that integer weights of 3 bits or fewer are permitted to be unrolled. But note that the whole model, end to end, must be reduced to logic gates.

For example, [Unrolling Ternary Neural Networks](https://arxiv.org/abs/1909.04509) almost satisfies this definition but uses patches and hence does not quite count. (Also, I am interested in language models, not image models.)
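
To make the intent concrete, here is a minimal sketch of mine (not part of the market's criteria) of how a single neuron with ±1 weights and activations reduces to XNOR, popcount, and a threshold compare, all of which are combinational logic. The fan-in and bit packing are illustrative.

```python
# Minimal sketch (illustrative, not from the market description): a binary
# neuron with {-1, +1} weights and activations reduces to XNOR + popcount +
# compare, i.e. pure combinational logic.

N = 8  # fan-in of the neuron (illustrative)

def binary_neuron(x_bits: int, w_bits: int, threshold: int) -> int:
    """x_bits/w_bits pack N inputs/weights as bits (1 = +1, 0 = -1).

    The +1/-1 dot product equals 2*popcount(XNOR(x, w)) - N, so the
    activation is just a popcount compared against a fixed threshold.
    """
    mask = (1 << N) - 1
    matches = (~(x_bits ^ w_bits)) & mask   # XNOR: bit set where signs agree
    dot = 2 * bin(matches).count("1") - N   # signed dot product
    return 1 if dot >= threshold else 0     # sign/threshold activation

# Example: alternating inputs against all-(+1) weights, fires if dot >= 0.
print(binary_neuron(0b10101010, 0b11111111, 0))
```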

It does not matter how the model was trained, only that it has adequate accuracy when in binarized form.

Resolves YES if a pure binary language model exists with bits-per-byte (BPB) on The Pile better than or equal to GPT-2's (1.225 BPB). It does not need to be publicly accessible as long as it is reported by a credible source (DeepMind, OpenAI, EleutherAI, etc.).
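
For reference, here is a rough sketch, in my own words, of how a bits-per-byte figure is typically computed from a model's per-token cross-entropy; the sample numbers below are hypothetical, and only the 1.225 threshold comes from the description above.

```python
import math

# Sketch of the resolution metric: convert mean per-token cross-entropy
# (in nats) to bits per byte of the evaluation text, then compare against
# the GPT-2 figure quoted in the market description.

GPT2_BPB = 1.225  # threshold from the market description

def bits_per_byte(nats_per_token: float, tokens: int, bytes_: int) -> float:
    """BPB = total nats / ln(2) / total UTF-8 bytes of the evaluation text."""
    total_bits = nats_per_token * tokens / math.log(2)
    return total_bits / bytes_

# Hypothetical numbers: 3.2 nats/token over a 1,000-token, 4,200-byte sample.
bpb = bits_per_byte(3.2, 1_000, 4_200)
print(f"{bpb:.3f} BPB -> {'meets' if bpb <= GPT2_BPB else 'does not meet'} the bar")
```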

Resolves NO if there is no credible report of such a model.

We have less than a year left. I have sold my stake in this market and will not bet further on it in case it ends up being subjective.

My personal efforts in this space have not been as successful as I had hoped. Personally, I think the market is ~well priced? I will be interested to see how this resolves.

Trit weights https://arxiv.org/abs/2402.17764

Still 8-bit activations, so it does not qualify, but a sparse weight matrix should compact down much more nicely.
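
A quick illustration (mine, not from the paper) of why trit weights compact down well: with weights in {-1, 0, +1}, every "multiply" against an 8-bit activation is an add, a subtract, or nothing at all, and zero-weight rows can be dropped from the netlist entirely.

```python
# Illustrative ternary dot product: no multipliers, only adds and subtracts.

def ternary_dot(activations: list[int], weights: list[int]) -> int:
    acc = 0
    for a, w in zip(activations, weights):
        if w == 1:
            acc += a        # +1 weight: add
        elif w == -1:
            acc -= a        # -1 weight: subtract
        # w == 0: contributes nothing, can be pruned from the logic
    return acc

print(ternary_dot([12, -3, 40, 7], [1, 0, -1, 1]))  # 12 - 40 + 7 = -21
```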

predicts YES

Looks like we will be getting 3-bit quantized LLaMA soon:
- https://arxiv.org/abs/2210.17323
- https://news.ycombinator.com/item?id=35107058

Now all that remains to resolve this market is to somehow quantize the softmaxes, and then unroll the whole thing to combinational logic.
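
One conceivable route for the softmax step, sketched below as my own guess rather than anything from the papers linked above: replace exp with a small fixed lookup table over quantized logits, so the whole operation is integer-only and could in principle be flattened into LUTs and adders. The table size and fixed-point format are made up for illustration.

```python
# Integer-only softmax sketch: exp via a small fixed lookup table, so every
# step is a table lookup, add, or integer divide (each synthesizable as
# combinational logic). Widths are illustrative.

FRAC_BITS = 8                       # Q8 fixed point (illustrative)
EXP_LUT = [round((2 ** FRAC_BITS) * (2 ** (-d))) for d in range(16)]
# EXP_LUT[d] ~ 2^(-d) in Q8, indexed by (max_logit - logit), clamped to 15.

def lut_softmax(logits_q: list[int]) -> list[int]:
    """Softmax over already-quantized integer logits; returns Q8 probabilities."""
    m = max(logits_q)
    nums = [EXP_LUT[min(m - z, 15)] for z in logits_q]   # table lookup per logit
    denom = sum(nums)
    return [(n << FRAC_BITS) // denom for n in nums]     # integer divide (could also be a LUT)

print(lut_softmax([5, 3, 3, 0]))   # roughly [167, 41, 41, 5] out of 256
```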

predicts YES

Ultra-low Precision Multiplication-free Training for Deep Neural Networks: https://arxiv.org/abs/2302.14458

1 sign bit, 4 exponent bits. It looks like it works on transformer language models. I am unclear, however, on how they handle the softmaxes. To resolve this market, the softmaxes would need to be fully transformed into combinational logic.
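
As I understand the general idea of a (sign, exponent) format, a multiply collapses into a sign XOR and a small exponent add; the sketch below uses illustrative field handling and is not taken from the paper.

```python
# Multiplication-free multiply in a log-domain (sign, exponent) format.
# Field widths and biasing are illustrative, not the paper's.

def make(sign: int, exp: int) -> tuple[int, int]:
    return (sign, exp)                 # value = (-1)**sign * 2**exp

def mulfree_mul(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Multiply two log-domain numbers: XOR the signs, add the exponents."""
    sign = a[0] ^ b[0]                 # sign logic is a single XOR gate
    exp = a[1] + b[1]                  # exponent add is a small integer adder
    return (sign, exp)

def to_float(x: tuple[int, int]) -> float:
    return (-1) ** x[0] * 2.0 ** x[1]

a, b = make(0, 3), make(1, -2)         # +8 and -0.25
print(to_float(mulfree_mul(a, b)))     # -2.0: no multiplier needed
```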

predicts YES

https://arxiv.org/abs/2212.09720

We are beginning to get down to 4-bit weights.

However, note that even if the weights were 3-bit, the model would need to be fully reduced to combinational logic, including any softmaxes, etc., to resolve YES.
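
To spell out what that reduction would look like for one layer, here is an illustrative sketch (mine, not from the linked paper): once small integer weights are frozen, each product w*x becomes a fixed set of shifts and adds, and the whole dot product becomes a multiplier-free adder tree.

```python
# Illustrative unrolling of a dot product with frozen 4-bit unsigned weights
# into shifts and adds only (no multipliers, no memory).

def shift_add_product(x: int, w: int) -> int:
    """Multiply activation x by a constant 4-bit weight using only shifts/adds."""
    acc = 0
    for bit in range(4):             # at most four partial products
        if (w >> bit) & 1:
            acc += x << bit          # each set weight bit is one wired shift
    return acc

def unrolled_dot(xs: list[int], ws: list[int]) -> int:
    # In hardware this sum would be a balanced adder tree, not a loop.
    return sum(shift_add_product(x, w) for x, w in zip(xs, ws))

print(unrolled_dot([3, 1, 7], [5, 2, 1]))   # 3*5 + 1*2 + 7*1 = 24
```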