Arithmetic encoding is one of those obscure tools that is rarely used, but every now and then is the only thing that can do the job. It’s the oddly sized Allen-key of network data compression.
Arithmetic encoding is cool because not many people have heard of it, and at first glance it doesn’t seem like it ought to work at all. A great way to impress girls at parties!
Consider this data:
(yes, we’re making a game that will finally answer the eternal question of young males the world over, “who would win in a fight between a cougar and a crocodile?”)
There are 11 possible species, so the Species enum requires 4 bits. Along with the boolean whoWon value, this is too much to be bitpacked into a single byte.
But hang on a moment…
With 11 possible values for animalA, and 11 possible values for animalB, and 2 possible values for whoWon, that only gives 11 * 11 * 2 possible permutations, which is 242. Seems like this ought to fit into a byte, neh?
The problem with bitpacking this data is the wastefulness of using 4 bits for each species, which gives us 16 possible values when we really only needed 11. Wouldn’t it be nice if we could use a fractional number of bits per field?
If we replace the bit shifting operations from my previous post with multiplication, division, and modulus calculations, we can do exactly that.
Pack the data like so:
void ArithmeticEncode(ref int encoded, int valueRange, int value)
encoded *= valueRange;
encoded += value;
int encoded = 0;
ArithmeticEncode(ref encoded, 11, (int)animalA);
ArithmeticEncode(ref encoded, 11, (int)animalB);
ArithmeticEncode(ref encoded, 2, whoWon ? 1 : 0);
And to read it back:
int ArithmeticDecode(ref int bitfield, int valueRange)
int value = bitfield % valueRange;
bitfield /= valueRange;
int encoded = packetReader.ReadByte();
whoWon = ArithmeticDecode(ref encoded, 2) != 0;
animalA = (Species)ArithmeticDecode(ref encoded, 11);
animalB = (Species)ArithmeticDecode(ref encoded, 11);