A Numerical Method for the Evaluation of Kolmogorov Complexity

                                Hector Zenil
                                Amphithéâtre Alan M. Turing
                       Laboratoire d'Informatique Fondamentale de Lille
                                      (UMR CNRS 8022)




Foundational Axis


As pointed out by Greg Chaitin (in his report on H. Zenil's thesis):

    The theory of algorithmic complexity is of course now widely
    accepted, but was initially rejected by many because of the fact
    that algorithmic complexity is on the one hand uncomputable
    and on the other hand dependent on the choice of universal
    Turing machine.


This last drawback is especially restrictive for real-world applications,
because the dependency is especially acute for short strings; a solution
to this problem is at the core of this work.




Foundational Axis (cont.)




The foundational point of departure of the thesis is an apparent
contradiction, pointed out by Greg Chaitin (same thesis report):

    ... the fact that algorithmic complexity is extremely, dare I say
    violently, uncomputable, but nevertheless often irresistible to
    apply ...




Algorithmic Complexity


Foundational Notion
                   A string is random if it is hard to describe.
                  A string is not random if it is easy to describe.



Main Idea
   The theory of computation replaces descriptions with programs. It
         constitutes the framework of algorithmic complexity:
                          description ⇐⇒ computer program




Algorithmic Complexity (cont.)


Definition
[Kolmogorov(1965), Chaitin(1966)]

                          K(s) = min{|p| : U(p) = s}

The algorithmic complexity K(s) of a string s is the length of the shortest
program p that produces s running on a universal Turing machine U.
The formula conveys the following idea: a string with low algorithmic
complexity is highly compressible, as the information that it contains can
be encoded in a program much shorter in length than the length of the
string itself.




Algorithmic Randomness

Example
The string 010101010101010101 has low algorithmic complexity because it
can be described as 9 times 01, and no matter how long it grows in
length, if the pattern repeats, the description (k times 01) increases only
by about log(k), remaining much shorter than the length of the string.
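
To make the log(k) growth concrete, here is a minimal sketch (Python; illustrative, not from the slides) comparing the length of the repeated string with the bits needed to encode the repetition count k:

from math import log2

for k in (9, 900, 90000):
    s = "01" * k
    # bits needed to write k in binary: the only part of the
    # description "k times 01" that grows with the string
    k_bits = int(log2(k)) + 1
    print(len(s), k_bits)  # 18 vs 4, 1800 vs 10, 180000 vs 17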



Example
The string 010010110110001010 has high algorithmic complexity because
it doesn’t seem to allow a (much) shorter description other than the string
itself, so a shorter description may not exist.



Example of an evaluation of K
The string 010101...01 can be produced by the following program:

Program A:
1: n:= 0
2: Print n
3: n:= n+1 mod 2
4: Goto 2

The length of A (in bits) is an upper bound of K(010101...01).

Connections to predictability: program A trivially allows a shortcut to
the value of an arbitrary digit through the following function f(n):
                   if n = 2m (i.e. n even) then f(n) = 1; otherwise f(n) = 0.
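
As a small sketch (Python; a hypothetical rendering, not part of the slides), the shortcut function and the string it predicts:

def f(n):
    # digit at (1-indexed) position n of 0101...: 1 when n is even, else 0
    return 1 if n % 2 == 0 else 0

print("".join(str(f(n)) for n in range(1, 19)))  # prints 010101010101010101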

Predictability characterization (Schnorr) [Downey(2010)]
simple ⇐⇒ predictable
random ⇐⇒ unpredictable
Noncomputability of K

The main drawback of K is that it is not computable and thus can only be
approximated in practice.


Important
No algorithm can tell whether a program p generating s is the shortest
(due to the undecidability of the halting problem of Turing machines).


No absolute notion of randomness
It is impossible to prove that a program p generating s is the shortest
possible; this also implies that if a program is about the length of the
original string, one cannot tell whether a shorter program producing s exists.
Hence, there is no way to declare a string truly algorithmically random.


Structure vs. randomness



Formal notion of structure
One can, however, exhibit a program generating s that is (much) shorter
than s itself. So even though one cannot tell whether a string is random,
one can declare s not random if a program generating s is (much) shorter
than the length of s.


As a result, one can only find upper bounds of K(s): s cannot be more
complex than the length of the shortest known program producing it.
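
In practice, such an upper bound is often estimated with a lossless compressor (a standard proxy, not part of this talk's method; a minimal sketch in Python):

import zlib

s = b"01" * 500
# the size of the compressed data (plus a fixed-size decompressor)
# upper-bounds the complexity of s
print(len(s), len(zlib.compress(s)))  # e.g. 1000 vs. ~20 bytes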




Most strings have maximal algorithmic complexity

Even if one cannot tell when a string is truly random, it is known that most
strings cannot have much shorter generating programs, by a simple
combinatorial argument:
    There are exactly 2^n bit strings of length n,
    but there are only 2^0 + 2^1 + 2^2 + . . . + 2^(n−1) = 2^n − 1 bit strings of
    fewer bits (in fact, there is at least one string that cannot be compressed
    even by a single bit).
    Hence, there are considerably fewer short programs than long programs.


Basic notion
One can't pair up all n-length strings with programs of much shorter length
(there simply aren't enough short programs to encode all the longer strings).
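
A two-line check of the counting argument (illustrative Python):

n = 20
shorter = sum(2**i for i in range(n))  # number of bit strings of length < n
print(2**n, shorter)                   # 1048576 vs 1048575: one string too few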


The choice of U matters
A major criticism brought forward against K is its dependence on the
universal Turing machine U. From the definition:

                          K(s) = min{|p| : U(p) = s}

It may turn out that:

     K_U1(s) ≠ K_U2(s) when evaluated using U1 and U2 respectively.


Basic notion
This dependency is particularly troubling for short strings, for example
strings shorter than the length of the universal Turing machine on which
K of the string is evaluated (typically on the order of hundreds of bits,
as originally suggested by Kolmogorov himself).


The Invariance theorem
A theorem guarantees that, in the long term, different evaluations of
algorithmic complexity converge to the same values (up to an additive
constant) as the length of the strings grows.

Theorem (Invariance theorem)
If U1 and U2 are two (universal) Turing machines, and K_U1(s) and K_U2(s)
are the algorithmic complexities of a binary string s when U1 or U2 are
used respectively, then there exists a constant c such that for all binary
strings s:

                             |K_U1(s) − K_U2(s)| < c
           (think of a compiler between 2 programming languages)

Yet, the additive constant can be arbitrarily large, making it unstable (if
not impossible) to evaluate K(s) for short strings.

Theoretical holes


  1   Finding a stable framework for calculating the complexity of short
      strings (one wants short strings like 000...0 to always be among the
      least algorithmically random, regardless of the choice of machine).
  2   Pathological cases: the theory says that a single bit has maximal
      complexity because the greatest possible compression is evidently the
      bit itself (paradoxically, it is the only finite string for which one can be
      sure it cannot be compressed further), yet one would intuitively say
      that a single bit is among the simplest strings.

We try to fill these holes by introducing the concept of algorithmic
probability as an alternative tool for evaluating K(s).




Algorithmic Probability


There is a measure that describes the expected output of a random
program running on a universal Turing machine.


Definition
[Levin(1977)]
m(s) = Σ_{p:U(p)=s} 1/2^|p|, i.e. the sum over all programs p for which U (a
prefix-free universal Turing machine) outputs the string s and halts.


m is traditionally called Levin's semi-measure, Solomonoff-Levin's
semi-measure, or the Universal Distribution [Kirchherr and Li(1997)].
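
A toy numeric illustration of how the sum behaves (made-up program lengths, purely hypothetical):

lengths = [3, 5, 8]                      # hypothetical |p| of halting programs for s
m_lower = sum(2.0**-l for l in lengths)  # each program contributes 1/2^|p|
print(m_lower)                           # 0.16015625, a lower bound on m(s)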




The motivation for Solomonoff-Levin’s m(s)


Borel's typewriting monkey metaphor [1] is useful to explain the intuition
behind m(s):

If you were going to produce the digits of a mathematical constant like π
by throwing digits at random, you would have to produce every digit of its
infinite irrational decimal expansion.

If you place a monkey at a typewriter (with, say, 50 keys), the
probability of the monkey typing an initial segment of 2400 digits of π by
chance is 1/50^2400.




   [1] Émile Borel (1913), "Mécanique Statistique et Irréversibilité", and
(1914), "Le hasard".
The motivation for Solomonoff-Levin’s m(s) (cont.)

But if instead the monkey is placed at a computer, the chances of
producing a program generating the digits of π are only 1/50^158,
because it would take the monkey only 158 characters to produce the first
2400 digits of π using, for example, this C language code:

     int a = 10000, b, c = 8400, d, e, f[8401], g; main(){for(; b-c; )
f[b++] = a/5; for(; d = 0, g = c*2; c -= 14, printf("%.4d", e + d/a),
  e = d%a) for(b = c; d += f[b]*a, f[b] = d%--g, d /= g--, --b; d *= b);}

Implementations, in any programming language, of any of the many known
formulae for π are shorter than the expansion of π itself, and therefore
have a greater chance of being produced by chance than the digits of π
typed one by one.
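
Working the slide's numbers out (illustrative Python; both probabilities underflow ordinary floats, so the comparison is done in log scale):

from math import log10

digits, program = 2400, 158
advantage = (digits - program) * log10(50)
print(round(advantage))  # ~3809: the program is ~10^3809 times more probable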


More formally said

Randomly picking a binary string s of length k among all (uniformly
distributed) strings of the same length has probability 1/2^k.
But the probability of finding a binary program p producing s (upon halting),
among binary programs running on a Turing machine U, is at least 1/2^|p|,
where U(p) = s (we know that such a program exists because U is a
universal Turing machine).
Because |p| ≤ k (e.g. the example for π described before), a string s with
a short generating program has a greater chance of having been produced
by p than by writing down all k bits of s one by one.
The less random a string, the more likely it is to be produced by a short
program.



Towards a semi-measure
However, there is an infinite number of programs producing s, so the
probability of picking a program producing s among all possible programs
is Σ_{p:U(p)=s} 1/2^|p|, the sum over all programs producing s running on the
universal Turing machine U.
Nevertheless, for a measure to be a probability measure, the sum over all
possible events should add up to 1. So Σ_{p:U(p)=s} 1/2^|p| cannot be a
probability measure as it stands, given that there is an infinite number of
programs contributing to the overall sum. For example, the following two
programs 1 and 2 both produce the string 0.
1: Print 0
and:
1: Print 0
2: Print 1
3: Erase the previous 1
and there are (countably) infinitely many more.
Towards a semi-measure (cont.)

So for m(s) to be a probability measure, the universal Turing machine U
has to be a prefix-free Turing machine, that is, a machine that does not
accept as a valid program one that begins with another valid program;
e.g. program 2 starts with program 1, so if program 1 is a valid
program then program 2 cannot be a valid one.

The set of valid programs is said to form a prefix-free set, that is, no
element is a prefix of any other, a property necessary to keep
0 < m(s) < 1. For more details see Kraft's inequality [Calude(2002)].
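
A small sketch of the prefix-free condition and the Kraft sum (illustrative Python, not from the slides):

def is_prefix_free(programs):
    # no program is an initial segment of another
    return not any(p != q and q.startswith(p) for p in programs for q in programs)

progs = ["0", "10", "110", "111"]         # a prefix-free set of code words
print(is_prefix_free(progs))              # True
print(sum(2.0**-len(p) for p in progs))   # Kraft sum of 2^-|p| = 1.0, hence <= 1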

However, some programs halt and some others don't (actually, most do not
halt), so one can only run U and watch which programs produce s,
contributing to the sum. It is said, then, that m(s) is semi-computable
from below, and it is therefore considered a probability semi-measure (as
opposed to a full measure).


Some properties of m(s)



Solomonoff and Levin proved that, in the absence of any other information,
m(s) dominates any other semi-measure and is therefore optimal in this
sense (hence also the adjective "universal").

On the other hand, the greatest contributor to the sum
Σ_{p:U(p)=s} 1/2^|p| is the shortest program p, given that this is where the
denominator 2^|p| reaches its smallest value, and therefore 1/2^|p| its
greatest. The length of the shortest program p producing s is nothing but
K(s), the algorithmic complexity of s.




The coding theorem

As noted on the previous slide, the greatest contributor to the sum
Σ_{p:U(p)=s} 1/2^|p| is the shortest program p producing s, whose length is
K(s). The coding theorem [Levin(1977), Calude(2002)] describes this
connection between m(s) and K(s):

Theorem
                           K(s) = −log2(m(s)) + c

Notice that the coding theorem reintroduces an additive constant! One
may not get rid of it, but the choices involved in m(s) are much less
arbitrary than picking a universal Turing machine directly for K(s).
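
This is how the method turns an output distribution into complexity values. A sketch (Python) using a few frequencies from the D(2) table shown later, with the additive constant c dropped:

import math

D2 = {"0": 0.328, "1": 0.328, "00": 0.0834, "000": 0.00065}  # from the D(2) table

def K_hat(s, dist):
    # coding-theorem estimate: K(s) ~ -log2 m(s), up to the constant c
    return -math.log2(dist[s])

print(round(K_hat("0", D2), 2))    # 1.61 bits
print(round(K_hat("000", D2), 2))  # 10.59 bits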



An additive constant in exchange for a massive
computation
The trade-off, however, is that the calculation of m(s) requires an
extraordinary amount of computation.

As pointed out by J.-P. Delahaye concerning our method (Pour La
Science, No. 405 July 2011 issue):

     Like very small durations or lengths, low complexities are delicate
     to evaluate. Paradoxically, the methods of evaluation demand
     colossal computations.

The first description of our approach was published in Greg Chaitin's
festschrift volume for his 60th birthday: J-P. Delahaye & H. Zenil,
"On the Kolmogorov-Chaitin complexity for short sequences," Randomness and
Complexity: From Leibniz to Chaitin, edited by C.S. Calude, World Scientific,
2007.

Calculating an experimental m
Main idea
To evaluate K(s) one can calculate m(s). m(s) is more stable than K(s)
because one makes fewer arbitrary choices about the Turing machine U.

Definition
D(n) = the function that assigns to every finite binary string s the
quotient:
(# of times that a machine in (n,2) produces s) / (# of machines in (n,2)).

D(n) is the probability distribution of the strings produced by all n-state
2-symbol Turing machines (denoted by (n,2)).
Examples for n = 1, n = 2 (normalized by the # of machines that
halt)
                            D(1) = 0 → 0.5; 1 → 0.5
                   D(2) = 0 → 0.328; 1 → 0.328; 00 → 0.0834 . . .
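
As a minimal sketch of how such a D(n) can be computed by brute force (illustrative Python, not the code used for the reported results; the machine formalism, the output convention, which takes the tape segment the head worked on upon halting, and the step bound are all assumptions):

from itertools import product
from collections import Counter

def machines(n):
    # all (n,2) machines: one instruction (write, move, next_state) per
    # (state, symbol) pair; next_state == 0 is taken here to mean "halt"
    instr = [(w, m, q) for w in (0, 1) for m in (-1, 1) for q in range(n + 1)]
    keys = [(s, b) for s in range(1, n + 1) for b in (0, 1)]
    for rules in product(instr, repeat=len(keys)):
        yield dict(zip(keys, rules))

def output(tm, max_steps):
    # run on a blank (all-0) tape; on halting, return the visited segment
    tape, pos, state, lo, hi = {}, 0, 1, 0, 0
    for _ in range(max_steps):
        lo, hi = min(lo, pos), max(hi, pos)
        w, m, state = tm[(state, tape.get(pos, 0))]
        tape[pos] = w
        pos += m
        if state == 0:
            return "".join(str(tape.get(i, 0)) for i in range(lo, hi + 1))
    return None  # did not halt within the bound

def D(n, max_steps):
    counts = Counter()
    for tm in machines(n):
        s = output(tm, max_steps)
        if s is not None:
            counts[s] += 1
    total = sum(counts.values())  # normalize over halting machines only
    return {s: c / total for s, c in counts.items()}

print(D(1, 2))  # with this convention: {'0': 0.5, '1': 0.5}, as above

In practice the step bound is taken from the known Busy Beaver values S(n), so that any machine exceeding it is certified never to halt.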
Calculating an experimental m (cont.)

Definition
[T. Radó (1962)]
A busy beaver is an n-state, 2-symbol Turing machine which writes a
maximum number of 1s, or performs a maximum number of steps, before
halting, when started on an initially blank tape.

Given that the Busy Beaver function values are known for n-state 2-symbol
Turing machines for n = 2, 3, 4, we could compute D(n) for n = 2, 3, 4.
We ran all 22 039 921 152 two-way tape Turing machines, starting with a
tape filled with 0s and 1s, in order to calculate D(4). [2]

Theorem
D(n) is noncomputable (by reduction to Radó's Busy Beaver problem).

  [2] A 9-day calculation on a single 2.26 GHz Intel Core Duo CPU.
Complexity Tables

Table: The 22 bit-strings in D(2) from 6 088 (2,2)-Turing machines that halt.
[Delahaye and Zenil(2011)]
                           0 → .328              010 → .00065
                           1 → .328              101 → .00065
                           00 → .0834            111 → .00065
                           01 → .0834            0000 → .00032
                           10 → .0834            0010 → .00032
                           11 → .0834            0100 → .00032
                           001 → .00098          0110 → .00032
                           011 → .00098          1001 → .00032
                           100 → .00098          1011 → .00032
                           110 → .00098          1101 → .00032
                           000 → .00065          1111 → .00032


Solving degenerate cases
“0” is the simplest string (together with “1”) according to D.
Partial D(4) (top strings)

[Table of the top-ranked strings in D(4) not reproduced here.]
From a Prior to an Empirical Distribution
We see algorithmic complexity emerging:
  1   The classification accords with our intuition of what complexity
      should be.
  2   Strings are almost always classified by length, except in cases where
      intuition justifies they should not be. For example, even though 0101010
      is of length 7, it ranked better than some strings of length shorter
      than 7. One sees the low algorithmic complexity of 010101... emerge:
      it behaves as a simple string.


From m to D
Unlike m, D is an empirical distribution and no longer a prior. D
experimentally confirms the intuition behind Solomonoff and Levin’s
measure.


Full tables are available online: www.algorithmicnature.org
Miscellaneous facts from D(3) and D(4)
   There are 5 970 768 960 machines that halt among the 22 039 921 152
   in (4,2); that is, a fraction of 0.27 halt.
   Among the least random looking strings from D(4) there are:
   0, 00, 000..., 01, 010, 0101, etc.
   Among the most random looking strings one can find:
   1101010101010101, 1101010100010101, 1010101010101011 and
   1010100010101011, each with frequency 5.4447 × 10^−10.
   As in D(3), where we reported that one string group (0101010 and its
   reversal) climbed positions, in D(4) 399 strings climbed to the top
   and were not sorted among their length groups.
   In D(4) string length was no longer a classification determinant. For
   example, between positions 780 and 790, the string lengths are: 11, 10,
   10, 11, 9, 10, 9, 9, 9, 10 and 9 bits.
   D(4) preserves the string order of D(3) except in 17 places out of the
   128 strings of D(3), ordered from highest to lowest string frequency.
Connecting D back to m



To get m we replaced a uniform distribution over the bits composing strings
with a uniform distribution over the bits composing programs. Imagine that
your (Turing-complete) programming language allows a monkey to produce
rules of Turing machines at random; every time the monkey types a
valid program, it is executed.


At the limit, the monkey (which is just a random source of programs) will
end up covering a sample of the space of all possible Turing machine rules.




Connecting D back to m


On the other hand, D(n) for a fixed n is the result of running all n-state
2-symbol Turing machines according to an enumeration.

An enumeration is just a thorough sample of the space of all n-state
2-symbol Turing machines each with fixed probability
1/(# of Turing machines in (n,2)) (by definition of enumeration).

D(n) is therefore a legitimate programmer-monkey experiment. The
additional advantage of performing a thorough sample of Turing machines
by following an enumeration is that the order in which the machines are
traversed is irrelevant, as long as one covers all the elements of an
(n,2) space.




Connecting D back to m (cont.)


One may ask why shorter programs are favored.

The answer, in analogy to the monkey experiment, is based on the uniform
random distribution of keystrokes: programs cannot be very long without
eventually containing the program-ending keystroke. One can still imagine
imposing a different distribution on the program instructions, for example
by changing the keyboard distribution, repeating certain keys.

Choices other than the uniform distribution are more arbitrary than simply
assuming no additional information (a keyboard with two or more "a" keys
rather than the usual single one seems more arbitrary than having one key
per letter).




Connecting D back to m (cont.)

Every D(n) is a sample of D(n + 1), because (n + 1, 2) contains all
machines in (n, 2). We have empirically verified that strings sorted by
frequency in D(4) preserve the order of D(3), which preserves the order of
D(2), meaning that longer programs do not produce completely different
classifications. One can think of the sequence D(1), D(2), D(3), D(4), . . .
as samples whose values are approximations of m.
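
The order-preservation check can be sketched as follows (toy frequencies; the real comparison used the full D(3) and D(4) tables):

def ordering(dist):
    # strings sorted by decreasing frequency
    return sorted(dist, key=dist.get, reverse=True)

d3 = {"0": 0.30, "1": 0.30, "00": 0.08, "01": 0.07}               # made up
d4 = {"0": 0.29, "1": 0.29, "00": 0.09, "01": 0.06, "000": 0.01}  # made up
shared = [s for s in ordering(d4) if s in d3]
print(shared == ordering(d3))  # True when D(4) preserves the D(3) order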


One may also ask how we can know whether a monkey provided with a
different programming language would produce a completely different D,
and therefore yet another experimental version of m. That may be the
case, but we have also shown that reasonable programming languages
(e.g. based on cellular automata and Post tag systems) produce reasonably
similar (correlated) distributions.


Connecting D back to m (cont.)

[Figure not reproduced here.]
m(s) provides a formalization for Occam’s razor




The immediate consequence of algorithmic probability is simple but
powerful (and surprising):

Basic notion
                    Type-writing monkeys (Borel): garbage in → garbage out
                    Programmer monkeys (Bennett, Chaitin): garbage in → structure out



What m(s) may tell us about the physical world


Basic notion
m(s) tells us that it is unlikely that a Rube Goldberg machine produces a
string if the string can be produced by a much simpler process.


Physical hypothesis
m(s) would tell us that, if processes in the world are computer-like, it is
unlikely that structures are the result of the computation of a Rube
Goldberg machine. Instead, they would rather be the result of the shortest
programs producing those structures, and patterns would follow the
distribution suggested by m(s).




On the algorithmic nature of the world

Could it be that m(s) tells us how structure in the world has come to be
and how it is distributed all around? Could m(s) reveal the machinery
behind it?
What happens in the world is often the result of an ongoing (mechanical)
process (e.g. the Sun rising due to the mechanical celestial dynamics of
the solar system).
Can m(s) tell us something about the distribution of patterns in the world?
We decided to find out, so we took some empirical datasets from the physical
world and compared them against data produced by pure computation,
which by definition should follow m(s).

The results were published in H. Zenil & J-P. Delahaye, “On the
Algorithmic Nature of the World”, in G. Dodig-Crnkovic and M. Burgin (eds),
Information and Computation, World Scientific, 2010.


On the algorithmic nature of the world

[Figure not reproduced here.]
Conclusions


Our method aimed to show that reasonable choices of formalisms for
evaluating the complexity of short strings through m(s) give consistent
measures of algorithmic complexity.

    [Greg Chaitin (w.r.t our method)] ...the dreaded theoretical hole
    in the foundations of algorithmic complexity turns out, in
    practice, not to be as serious as was previously assumed.


Our method also seems notable in that it is an experimental approach that
comes to the rescue where the theory leaves apparent holes.




Bibliography
   C.S. Calude, Information and Randomness: An Algorithmic
   Perspective (Texts in Theoretical Computer Science. An EATCS
   Series), Springer, 2nd. edition, 2002.
   G. J. Chaitin. On the length of programs for computing finite binary
   sequences. Journal of the ACM, 13(4):547–569, 1966.
   G. Chaitin, Meta Math!, Pantheon, 2005.
   R.G. Downey and D. Hirschfeldt, Algorithmic Randomness and
   Complexity, Springer Verlag, to appear, 2010.
   J.P. Delahaye and H. Zenil, On the Kolmogorov-Chaitin complexity for
   short sequences, in Cristian Calude (eds) Complexity and Randomness:
   From Leibniz to Chaitin. World Scientific, 2007.
   J.P. Delahaye and H. Zenil, Numerical Evaluation of Algorithmic
   Complexity for Short Strings: A Glance into the Innermost Structure
   of Randomness, arXiv:1101.4795v4 [cs.IT].
C.S. Calude, M.A. Stay, Most Programs Stop Quickly or Never Halt,
2007.
W. Kirchherr and M. Li, The miraculous universal distribution,
Mathematical Intelligencer , 1997.
A. N. Kolmogorov. Three approaches to the quantitative definition of
information. Problems of Information and Transmission, 1(1):1–7,
1965.
P. Martin-Löf. The definition of random sequences. Information and
Control, 9:602–619, 1966.
L. Levin, On a concrete method of Assigning Complexity Measures,
Doklady Akademii nauk SSSR, vol.18(3), pp. 727-731, 1977.
L. Levin, Universal Search Problems, 9(3):265-266, 1973.
(submitted: 1972, reported in talks: 1971). English translation in:
B.A.Trakhtenbrot. A Survey of Russian Approaches to Perebor
(Brute-force Search) Algorithms. Annals of the History of Computing
6(4):384-400, 1984.

M. Li, P. Vitányi, An Introduction to Kolmogorov Complexity and Its
Applications, Springer, 3rd revised edition, 2008.
S. Lloyd, Programming the Universe: A Quantum Computer Scientist
Takes On the Cosmos, Knopf Publishing Group, 2006.
T. Radó, On non-computable functions, Bell System Technical
Journal, Vol. 41, No. 3, 1962.
R. J. Solomonoff. A formal theory of inductive inference: Parts 1 and
2. Information and Control, 7:1–22 and 224–254, 1964.
H. Zenil and J.P. Delahaye, On the Algorithmic Nature of the World,
in G. Dodig-Crnkovic and M. Burgin (eds), Information and
Computation, World Scientific, 2010.
S. Wolfram, A New Kind of Science, Wolfram Media, 2002.




