Universal Codes

Lloyd Allison, Arun S. Konagurthu & Daniel F. Schmidt, 'On Universal Codes for Integers: Wallace Tree, Elias Omega and Beyond', Data Compression Conference (DCC), pp.313-322, doi:10.1109/DCC50243.2021.00039, 22-26 March 2021.

Above: CW:=enc(N) then check that dec(CW)=N, for each of several codes – Fibonacci, Elias's omega, our omega² and omega^*, and Wallace Tree Code – all for integers ≥0 or ≥1. Note, treat the timings with great caution as who knows when javascript's garbage collector kicks in.

Above: Code-word lengths for powers of 10.

N	fib1	omega1	WTC1
1	2	1*	1*
2	3*	3*	3*
3	4	3*	5
4	4*	6	5
13	7*	7*	9
16	7*	11	9
610	15*	17	15*
627	15*	17	17
1597	17*	18	17*
2057	17*	19	19
4181	19*	20	19*
6765	20	20	19*
6919	20*	20*	21
8192	20*	21	21
10946	21*	21*	21*
16384	21*	22	21*
17711	22	22	21*
23715	22*	22*	23
28657	23	22*	23
32768	23*	23*	23*
46368	24	23*	23*
65536	24	28	23*
82501	25*	28	25*
121393	26	28	25*
290513	27*	30	27*
317811	28	30	27*

Above: Code-word lengths, early points of change of the lead.

w	fib1	omega1	WTC1
1	0	0.5	0.5
2	0.25	0.5	0.5
3	0.375	0.75	0.625
4	0.5	0.75	0.625
10	0.859	0.875	0.754
100	0.999...	0.947	0.920
1000	-"-	0.957	0.975
10000	-"-	0.963	0.992
100000	-"-	0.9688	0.997
1000000	-"-	0.9692	0.9992

∑_{N s.t. |codeword(N)|≤w} p(N)

Above: Cummulative probabilities upto code-word length w (bits).

Is there an ultimate code?

Is there an ultimate code, i.e., a code that gives the shortest code-words, shorter than any other code gives, to sufficiently big integers?

Above: probabilities Pr(.) under code C

Any code C can be "stretched" to make another code C': If code C gives an integer n≥1 the code-word CW(n) then code-words for C' are defined by: CW'(2n-1) = CW(n) ++ '0', and
CW'(2n) = CW(n) ++ '1'.
and if Pr(m) is the probability of m under C and Pr'(m) the probability of m under C' then: Pr'(2n-1) = Pr'(2n) = Pr(n)/2.

If C is a prefix code so is C': The end of CW(n) can be detected and we expect one more bit for CW'(2n-1) and CW'(2n).

Above: probabilities Pr'(.) under code C'

If C is efficient the area under the curve D+E+F=1 and C' is also efficient, area G+H=1. Now the region marked G is region D shrunk by 1/2 vertically and stretched ×2 horizontally, so area D=G. Consequently H=E+F and area H>F. Put another way, C' assigns more probability (eventually gives shorter code-words) than C to big enough integers. And of course C' can itself be stretched, "improved".

So in that sense, no, there is no ultimate code.

Robustness

Above: Experiment with robustness; Ns→code-words→code-string→mutate(CW[0])→decode.
Key: || correct number of Ns, # wrong number of Ns, = late numbers correct, non-trivial remnant ⇒ v. bad (error msg shows why).

References

Lloyd Allison, Arun S. Konagurthu & Daniel F. Schmidt, 'On Universal Codes for Integers: Wallace Tree, Elias Omega and Beyond', Data Compression Conference (DCC), pp.313-322, doi:10.1109/DCC50243.2021.00039, 22-26 March 2021
Lloyd Allison, Arun Konagurthu and Daniel Schmidt, 'On Universal Codes for Integers: Wallace Tree, Elias Omega and Variations', arXiv:1906.05004, June 2019.