Optimal DEFLATE compressor by end of 2024
17
370Ṁ715
resolved Apr 23
Resolved as
10%

By the end of 2024, will there be open source software that I can run on my computer that will find the optimal DEFLATE encoding for any piece of data up to 1000000 bytes that I test it with in under an hour? (i.e. no naive brute force scripts)

https://www.w3.org/Graphics/PNG/RFC-1951

Get
Ṁ1,000
to start trading!

🏅 Top traders

#NameTotal profit
1Ṁ33
2Ṁ12
3Ṁ9
4Ṁ6
5Ṁ5
Sort by:

In light of Manifold's "pivot", I don't feel comfortable using this site, so I'm leaving and resolving all my open markets N/A. Sorry everyone.

@ThisProfileDoesntExist Update: It won't let me resolve N/A, so I'm resolving to current probability (10%) instead.

predictedNO

There's past efforts to find better compression

https://developers.googleblog.com/2013/02/compress-data-more-densely-with-zopfli.html

or reason about optimality

https://www.sciencedirect.com/science/article/pii/S1570866713000257

but sticking to DEFLATE compatibility is fundamentally limiting.

On a Hutter prize benchmark (http://mattmahoney.net/dc/text.html) the closest DEFLATE-compatible (pigz in `-11` Zopfli mode) is 2x-3x larger than state of the art. It's ~30% larger than compatibility-breaking LZ77 implementations, or other algorithms with very low cost decompression.

It thus seems like any cloud provider motivated to chase the size-speed tradeoff is more likely to bolt new compression standards onto HTTP2 or play with self-extracting WASM formats than chase diminishing returns (size-bounded, but still) on DEFLATE optimality.

Define “optimal”

You are aware that Kolmogorov complexity is uncomputable? It is not possible to find THE optimal compressor by any algorithm

@AdamTreat This is about a specific compression format, not Kolmogorov complexity. The optimal representation is the one with the fewest number of bytes among all possible representations for a given data string in the DEFLATE format.

@ThisProfileDoesntExist I don’t know enough about the DEFLATE format but if it is capable of encoding any arbitrary piece of data just like a bit string and you are asking for an algorithm on a Turing machine to find the optimal representation that sounds an awful lot like you are asking for an algorithm to procure the optimal Kolmogorov complexity of a bit string. Again, I don’t know enough about the DEFLATE so maybe I’m missing something.

@AdamTreat The difference is that DEFLATE is not Turing complete. There's a limited and well defined number of ways that any given piece of data can be encoded. You can trivially write a script to brute force this, it just won't run in a reasonable amount of time.

@AdamTreat For comparison, consider a compression format that is just "use a single byte-by-byte binary prefix code to encode all the data". There are well known algorithms to solve this optimally (https://en.wikipedia.org/wiki/Huffman_coding). DEFLATE is more complicated than that, but it's still simple enough that it could plausibly by solved.

© Manifold Markets, Inc.Terms + Mana-only TermsPrivacyRules