Algorithmic information theory

From Encyclopedia of Mathematics
The branch of mathematical logic which gives a more exact formulation of the basic concepts of the theory of information, based on the concepts of an [[Algorithm|algorithm]] and a [[Computable function|computable function]]. Algorithmic information theory attempts to give a base to these concepts without recourse to probability theory, so that the concepts of entropy and quantity of information might be applicable to individual objects. It also gives rise to its own problems, which are related to the study of the entropy of specific individual objects.
 
The key concept of algorithmic information theory is that of the entropy of an individual object, also called the (Kolmogorov) complexity of the object. The intuitive meaning of this concept is the minimum amount of information needed to reconstruct a given object. An exact definition of the concept of complexity of an individual object, and (on the basis of this concept) of the concept of the quantity of information in an individual object $y$ relative to an individual object $x$, was given by A.N. Kolmogorov in 1962–1965, after which the development of algorithmic information theory began.

Without loss of generality, only binary words (linear strings of 0's and 1's) are considered. The complexity of a word $x$ is understood to mean the length $l(p)$ of the shortest binary word $p$ which contains all the information required to reconstruct $x$ with the aid of some definite decoding procedure. A precise definition of this concept follows.

Let $F(s)$ be an arbitrary universal [[Partial recursive function|partial recursive function]]. Then the complexity of the word $x$ with respect to $F$ will be

$$
K_F(x) = \left\{
\begin{array}{ll}
\min l(p) & \textrm{if } F(p) = x, \\
\infty & \textrm{if there is no } p \textrm{ such that } F(p) = x. \\
\end{array}
\right.
$$
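The definition above can be made concrete by brute force, at least for toy cases: enumerate binary words $p$ in order of increasing length and return the length of the first one that decodes to $x$. This is a minimal illustrative sketch; the decoder `F` below is a hypothetical run-length scheme chosen only so that the search is feasible, not the universal function of the definition.

```python
from itertools import product

def F(p: str):
    """Toy partial decoding function (hypothetical, not universal):
    a program b + '1'*k decodes to the bit b repeated 2**k times."""
    if len(p) >= 1 and set(p[1:]) <= {"1"}:
        return p[0] * 2 ** (len(p) - 1)
    return None  # F(p) undefined elsewhere: F is a partial function

def K(F, x: str, max_len: int = 16):
    """K_F(x): length l(p) of the shortest binary word p with F(p) = x,
    found by exhaustive search (feasible only for toy decoders)."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            if F("".join(bits)) == x:
                return n
    return float("inf")  # stands in for 'there is no such p'
```

For instance, `K(F, "0" * 8)` is 4 here (via the program `"0111"`), while `K(F, "010")` is infinite, because this particular $F$ only ever produces constant words.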
  
The word $p$ such that $F(p) = x$ is known as the code, or the program, by which $F$ reconstructs the word $x$. The following theorem is valid: There exists a partial recursive function $F_0$ (an optimal function), such that for any partial recursive function $F$ the following inequality holds:
  
$$
K_{F_0}(x) \leq K_F(x) + C_F,
$$
  
where $C_F$ is a constant which is independent of $x$. This theorem yields an invariant (up to an additive constant) definition of complexity: the complexity $K(x)$ of the word $x$ is the complexity $K_{F_0}(x)$ with respect to some partial recursive function $F_0$, which has been fixed once and for all. Clearly, $K(x) \leq l(x) + C$, where $C$ is a constant independent of $x$.
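The role of the additive constant $C_F$ can be illustrated with a small hedged sketch: a combined decoder spends one selector bit to choose between two fixed sub-decoders, so its complexity exceeds neither sub-decoder's by more than 1. All names here are illustrative choices; a genuine optimal $F_0$ simulates every partial recursive function, not just two.

```python
from itertools import product

def identity(p: str):
    """Sub-decoder 0: outputs its program verbatim, giving K(x) <= l(x) + C."""
    return p

def repeat(p: str):
    """Sub-decoder 1 (toy): b + '1'*k decodes to b repeated 2**k times."""
    if len(p) >= 1 and set(p[1:]) <= {"1"}:
        return p[0] * 2 ** (len(p) - 1)
    return None

def F0(p: str):
    """Combined decoder: the first bit of p selects the sub-decoder, so
    K_{F0}(x) <= K_{Fi}(x) + 1 for both i -- the constant C_F is 1 here."""
    if p[:1] == "0":
        return identity(p[1:])
    if p[:1] == "1":
        return repeat(p[1:])
    return None  # undefined on the empty word

def K(F, x: str, max_len: int = 16):
    """Brute-force K_F(x) over programs in order of increasing length."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            if F("".join(bits)) == x:
                return n
    return float("inf")
```

Here `K(F0, "0" * 8)` is 5 (one selector bit plus a four-bit run-length program), and for every word $x$ the identity branch guarantees `K(F0, x) <= len(x) + 1`.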
  
Using $K(x)$ one may also determine the complexity $K(x, y)$ of pairs of words $(x, y)$; to do this, the pairs $(x, y)$ are effectively coded as one word $c(x, y)$, and the complexity $K(x, y)$ is understood to mean $K(c(x, y))$.
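One concrete effective pair coding (a standard self-delimiting construction, sketched here with hypothetical helper names, not the particular $c$ of the article) doubles each bit of $x$ and uses the non-doubled pair `01` as a separator, so the pair is recovered unambiguously:

```python
def c(x: str, y: str) -> str:
    """Effectively code the pair (x, y) as one word: each bit of x is
    written twice, '01' marks the end of x, and y follows verbatim."""
    return "".join(b + b for b in x) + "01" + y

def split(w: str):
    """Recover (x, y) from c(x, y): read doubled bits until the first
    non-doubled pair, which must be the '01' separator."""
    i, x = 0, []
    while w[i] == w[i + 1]:
        x.append(w[i])
        i += 2
    assert w[i:i + 2] == "01", "not a codeword of c"
    return "".join(x), w[i + 2:]
```

Note that $l(c(x, y)) = 2l(x) + l(y) + 2$; a similar bit-doubling trick is one way such $2l(x)$ terms arise in bounds relating the different complexity measures.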
  
P. Martin-Löf studied the behaviour of the complexity of the initial segments of infinite binary sequences. Let $\omega_n$ denote the initial segment of a sequence $\omega$, consisting of the $n$ initial characters. Then almost all (according to the Lebesgue measure) infinite binary sequences $\omega$ have the following properties: a) for infinitely many natural numbers $n$ the inequality $K(\omega_n) \geq n - c$, where $c$ is some constant, is valid; and b) for all natural numbers $n$ larger than some value $n_\epsilon$, $K(\omega_n) \geq n - (1 + \epsilon) \log_2 n$, where $\epsilon$ is an arbitrary positive number (the Martin-Löf theorem). Conversely, for any sequence $\omega$ there exist infinitely many $n$ such that $K(\omega_n) \leq n - \log_2 n$.
  
The concept of complexity is also employed in the study of algorithmic problems. Here, a more natural concept is the so-called complexity of solution of the word $x$. The intuitive meaning of this concept is the minimum quantity of information required to find, for each number $i \leq l(x)$, the $i$-th character of the word $x$ (the length of the word $x$ need not necessarily be known). More exactly, the complexity of solution of the word $x = x_1 \dots x_{l(x)}$ by a partial recursive function $G(s, t)$ means that
  
$$
KR_G(x) = \left\{
\begin{array}{ll}
\min \{ l(p) : \forall i \leq l(x), \ G(p, i) = x_i \}, \\
\infty & \textrm{if there is no such } p. \\
\end{array}
\right.
$$
  
The following theorem is valid: There exists a partial recursive function $G_0$ (an optimal function) such that any other partial recursive function $G$ satisfies the inequality $KR_{G_0}(x) \leq KR_G(x) + C_G$, where $C_G$ is a constant independent of $x$. The complexity of solution, $KR(x)$, of the word $x$ is the complexity $KR_{G_0}(x)$ with respect to some optimal partial recursive function $G_0$ which has been fixed once and for all. Clearly, $KR(x) \leq K(x) + C$ and $K(x) \leq KR(x) + 2l(x) + C$. It is possible, by using $KR(x)$, to determine, for any set $M$ of natural numbers, the complexity $K(M, n)$ for the $n$-segment of the set $M$: $K(M, n) = KR(\omega_n)$, where $\omega = x_1 \dots x_i \dots$ is the characteristic sequence of the set $M$ ($x_i = 1$ if $i \in M$, and $x_i = 0$ if $i \notin M$).
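The passage from a set $M$ to the word $\omega_n$ used in $K(M, n) = KR(\omega_n)$ is straightforward to make explicit; a small illustrative helper (indices starting at 1, as in the article):

```python
def omega_n(M, n: int) -> str:
    """Initial n-segment of the characteristic sequence of M:
    the i-th character is '1' iff i belongs to M (i = 1, ..., n)."""
    return "".join("1" if i in M else "0" for i in range(1, n + 1))
```

For a recursive set such as the even numbers, $\omega_n$ is computable uniformly in $n$, and indeed $K(M, n)$ stays bounded, in line with the recursiveness criterion below; for non-recursive $M$ it must grow.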
  
Algorithmic problems are usually represented as problems of belonging to some recursively enumerable set $M$. If some value $n$ is given, and the problem of belonging to $M$ is restricted to the first $n$ natural numbers only, then one obtains a bounded algorithmic problem. The magnitude $K(M, n)$ may be intuitively understood as expressing the quantity of information required to solve this bounded problem. In particular, the set $M$ is recursive if and only if $K(M, n)$ is bounded from above by some constant.

It follows from the Martin-Löf theorem that there exist sets for which $K(M, n) \sim n$. It was also found that sets with maximum degree of complexity exist even among sets defined by arithmetic predicates with two quantifiers. However, the following theorem applies to recursively enumerable sets: 1) for any recursively enumerable set $M$ and any value $n$, the inequality $K(M, n) \leq \log_2 n + C$, where $C$ does not depend on $n$, is valid; and 2) there exists a recursively enumerable set $M$ such that for any $n$ the inequality $K(M, n) > \log_2 n$ holds. At the same time there exists a recursively enumerable set $M$ such that if the time $t$ of computation of an arbitrary general recursive function is bounded, the estimate $K^t(M, n) \geq n / c_t$, where $c_t$ does not depend on $n$, is valid.

Sets which are universal with respect to a certain type of reducibility may also be defined in these terms (cf. [[Universal set|Universal set]]): A set $M$ is weakly tabularly universal if and only if there exists an unbounded general recursive function $f(n)$ such that the inequality $K(M, n) \geq f(n)$ is valid for all values of $n$.
  
 
In studying the complexity of bounded algorithmic problems other measures of complexity may also be employed, e.g. the minimum of the length of the coding of the [[Normal algorithm|normal algorithm]] which solves the bounded problem in question. It was found, however, that asymptotically exact relations (cf. [[Algorithm, complexity of description of an|Algorithm, complexity of description of an]]) exist between the complexities introduced above and the analogues of these complexities expressed as the minimum length of the coding of an algorithm (or as the minimum number of internal states of a Turing machine).
 
  
Another concept which is introduced in constructing algorithmic information theory is that of conditional complexity of a word $x$, for a known $y$, using a partial recursive function $G(s, t)$:
  
$$
K_G(x \mid y) = \left\{
\begin{array}{ll}
\min \{ l(p) : G(p, y) = x \}, \\
\infty & \textrm{if there is no } p \textrm{ such that } G(p, y) = x. \\
\end{array}
\right.
$$
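As with $K_F$, this conditional quantity can be illustrated by brute force over programs for a toy conditional decoder. Everything about `G` below is a hypothetical choice made only so the search terminates: this particular decoder may copy $y$ or append a suffix to it, so words extending $y$ get small conditional complexity.

```python
from itertools import product

def G(p: str, y: str):
    """Toy conditional partial decoder: program '0' outputs y itself,
    '1' + s outputs y with the suffix s appended; undefined otherwise."""
    if p == "0":
        return y
    if p[:1] == "1":
        return y + p[1:]
    return None

def K_cond(G, x: str, y: str, max_len: int = 12):
    """Brute-force K_G(x | y): length of the shortest p with G(p, y) = x."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            if G("".join(bits), y) == x:
                return n
    return float("inf")
```

Here `K_cond(G, "011010", "0110")` is 3, far below the cost of describing the word from scratch, matching the intuition that $K(x \mid y)$ measures only the information that must be added to $y$.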
  
This concept is related to the theorem on the existence of an optimal function $G_0$, and the conditional complexity $K(x \mid y)$ of a word $x$, if $y$ is known, is defined as the complexity $K_{G_0}(x \mid y)$ using an optimal function $G_0$ fixed once and for all. The intuitive meaning of $K(x \mid y)$ is the minimum quantity of information which must be added to the information contained in the word $y$ for the reconstitution of the word $x$ to be possible. Clearly, $K(x \mid y) \leq K(x) + C$.

The next central concept of algorithmic information theory is the concept of the amount of information in an individual object $y$ relative to an individual object $x$:

$$
I(y : x) = K(x) - K(x \mid y).
$$

The quantity $I(y : x)$ is called the algorithmic amount of information in $y$ about $x$. The quantities $K(x)$ and $K(x \mid y)$ are sometimes called the algorithmic entropy of $x$ and the algorithmic conditional entropy of $x$ for a given $y$. The formulas for the decomposition of the entropy $K(x, y)$ and of the commutation of information are valid in the algorithmic concept up to terms of order $O(\log_2 K(x, y))$:

$$
K(x, y) = K(x) + K(y \mid x), \qquad I(y : x) = I(x : y).
$$
+
$$
 +
I ( y : x )  = K ( x ) - K ( x \mid  y ) .
 +
$$
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840133.png" /></td> </tr></table>
+
The quantity  $  I (y:x) $
 +
is called the algorithmic amount of information in  $  y $
 +
about  $  x $.
 +
The quantities  $  K(x) $
 +
and  $  K( x \mid  y ) $
 +
are sometimes called the algorithmic entropy of  $  x $
 +
and the algorithmic conditional entropy of  $  x $
 +
for a given  $  y $.
 +
The formulas for the decomposition of the entropy  $  H(X, Y) = H(X) + H(Y \mid  X) $
 +
and of the commutation of information  $  I(X:Y) = I (Y:X) $
 +
are valid in the algorithmic concept up to terms of order  $  O (  \mathop{\rm log}  H (X, Y) ) $:
  
There exists a certain relation (due to Kolmogorov) between the algorithmic and the classical definitions of the amount of information (or, more exactly, between the complexity of a word according to Kolmogorov and the entropy of the Shannon distribution of frequencies): Let a number <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840134.png" /> be given, let the word <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840135.png" /> of length <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840136.png" /> consist of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840137.png" /> words of length <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840138.png" />, and let the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840139.png" />-th word of length <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840140.png" /> appear in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840141.png" /> with a frequency <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840142.png" /> (<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840143.png" />); one then has
+
$$
 +
| K ( x , y ) - [ K ( x ) + K ( y \mid  x ) ] |  \leq  O (  \mathop{\rm log}  K ( x , y ) ) ,
 +
$$
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840144.png" /></td> </tr></table>
+
$$
 +
| I ( x : y ) - I ( y : x ) |  \leq  O (  \mathop{\rm log}  K ( x , y ) ) .
 +
$$
 +
 
 +
There exists a certain relation (due to Kolmogorov) between the algorithmic and the classical definitions of the amount of information (or, more exactly, between the complexity of a word according to Kolmogorov and the entropy of the Shannon distribution of frequencies): Let a number  $  r $
 +
be given, let the word  $  x $
 +
of length  $  i \cdot r $
 +
consist of  $  i $
 +
words of length  $  r $,
 +
and let the  $  k $-
 +
th word of length  $  r $
 +
appear in  $  x $
 +
with a frequency  $  q _ {k} $(
 +
$  k = 1 \dots 2  ^ {r} $);
 +
one then has
 +
 
 +
$$
 +
 
 +
\frac{K ( x ) }{i}
 +
  \leq  H + \alpha ( i ) ,
 +
$$
  
 
where
 
where
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840145.png" /></td> </tr></table>
+
$$
 +
= - \sum _ {k = 1 } ^ { {2 }  ^ {r} }
 +
q _ {k}  \mathop{\rm log} _ {2}  q _ {k} ,
 +
$$
  
and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840146.png" /> as <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840147.png" />. In the general case no closer connection can be established between entropy and complexity. This is because entropy is intended to study sequences subject to frequency laws only. Consequently, one may establish the following relation for random sequences <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840148.png" /> in measure between the magnitudes under study, which corresponds to independent trials:
+
and $  \alpha (i) \rightarrow 0 $
 +
as $  i \rightarrow \infty $.  
 +
In the general case no closer connection can be established between entropy and complexity. This is because entropy is intended to study sequences subject to frequency laws only. Consequently, one may establish the following relation for random sequences $  \omega $
 +
in measure between the magnitudes under study, which corresponds to independent trials:
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840149.png" /></td> </tr></table>
+
$$
 +
\lim\limits _ {i \rightarrow \infty } 
 +
\frac{K ( \omega _ {ir} ) }{i}
 +
  = H .
 +
$$
  
 
Similar considerations apply to an arbitrary ergodic stationary stochastic process.
 
Similar considerations apply to an arbitrary ergodic stationary stochastic process.
  
 
====References====
 
====References====
<table><TR><TD valign="top">[1]</TD> <TD valign="top">  A.N. Kolmogorov,  "Three approaches to the quantitative definition of information"  ''Problems in Information Transmission'' , '''1'''  (1965)  pp. 1–7  ''Problemy Peredachi Informatsii'' , '''1''' :  1  (1965)  pp. 3–11</TD></TR><TR><TD valign="top">[2]</TD> <TD valign="top">  A.N. Kolmogorov,  "On the logical foundations of information theory and probability theory"  ''Problems Inform. Transmission'' , '''5''' :  3  (1969)  pp. 1–4  ''Problemy Peredachi Informatsii'' , '''5''' :  3  (1969)  pp. 3–7</TD></TR><TR><TD valign="top">[3]</TD> <TD valign="top">  A.K. Zvonkin,  L.A. Levin,  "The complexity of finite objects and the developments of the concepts of information and randomness by means of the theory of algorithms"  ''Russian Math. Surveys'' , '''25''' :  6  (1970)  pp. 82–124  ''Uspekhi Mat. Nauk'' , '''25''' :  6  (1970)  pp. 85–127</TD></TR><TR><TD valign="top">[4]</TD> <TD valign="top">  Ya.M. Bardzin',  "Complexity of programs to determine whether natural numbers not greater than <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/a/a011/a011840/a011840150.png" /> belong to a recursively enumerable set"  ''Soviet Math. Dokl.'' , '''9''' :  5  (1968)  pp. 1251–1254  ''Dokl. Akad. Nauk SSSR'' , '''182''' :  6  (1968)  pp. 1249–1252</TD></TR><TR><TD valign="top">[5]</TD> <TD valign="top">  M.I. Kanovich,  "Complexity of resolution of a recursively enumerable set as a criterion of its universality"  ''Soviet Math. Dokl.'' , '''11''' :  5  (1970)  pp. 1224–1228  ''Dokl. Akad. Nauk SSSR'' , '''194''' :  3  (1970)  pp. 500–503</TD></TR></table>
+
<table><TR><TD valign="top">[1]</TD> <TD valign="top">  A.N. Kolmogorov,  "Three approaches to the quantitative definition of information"  ''Problems in Information Transmission'' , '''1'''  (1965)  pp. 1–7  ''Problemy Peredachi Informatsii'' , '''1''' :  1  (1965)  pp. 3–11</TD></TR><TR><TD valign="top">[2]</TD> <TD valign="top">  A.N. Kolmogorov,  "On the logical foundations of information theory and probability theory"  ''Problems Inform. Transmission'' , '''5''' :  3  (1969)  pp. 1–4  ''Problemy Peredachi Informatsii'' , '''5''' :  3  (1969)  pp. 3–7</TD></TR><TR><TD valign="top">[3]</TD> <TD valign="top">  A.K. Zvonkin,  L.A. Levin,  "The complexity of finite objects and the developments of the concepts of information and randomness by means of the theory of algorithms"  ''Russian Math. Surveys'' , '''25''' :  6  (1970)  pp. 82–124  ''Uspekhi Mat. Nauk'' , '''25''' :  6  (1970)  pp. 85–127</TD></TR><TR><TD valign="top">[4]</TD> <TD valign="top">  Ya.M. Bardzin',  "Complexity of programs to determine whether natural numbers not greater than $n$ belong to a recursively enumerable set"  ''Soviet Math. Dokl.'' , '''9''' :  5  (1968)  pp. 1251–1254  ''Dokl. Akad. Nauk SSSR'' , '''182''' :  6  (1968)  pp. 1249–1252</TD></TR><TR><TD valign="top">[5]</TD> <TD valign="top">  M.I. Kanovich,  "Complexity of resolution of a recursively enumerable set as a criterion of its universality"  ''Soviet Math. Dokl.'' , '''11''' :  5  (1970)  pp. 1224–1228  ''Dokl. Akad. Nauk SSSR'' , '''194''' :  3  (1970)  pp. 500–503</TD></TR></table>
 
 
 
 
  
 
====Comments====
 
====Comments====

Latest revision as of 06:44, 26 March 2023


The branch of mathematical logic which gives a more exact formulation of the basic concepts of the theory of information, based on the concepts of an algorithm and a computable function. Algorithmic information theory attempts to give a base to these concepts without recourse to probability theory, so that the concepts of entropy and quantity of information might be applicable to individual objects. It also gives rise to its own problems, which are related to the study of the entropy of specific individual objects.

The key concept of algorithmic information theory is that of the entropy of an individual object, also called the (Kolmogorov) complexity of the object. The intuitive meaning of this concept is the minimum amount of information needed to reconstruct a given object. An exact definition of the concept of complexity of an individual object, and (on the basis of this concept) of the concept of the quantity of information in an individual object $ y $ relative to an individual object $ x $, was given by A.N. Kolmogorov in 1962–1965, after which the development of algorithmic information theory began.

Without loss of generality, only binary words (linear strings of 0's and 1's) are considered. The complexity of a word $ x $ is understood to mean the length $ l(p) $ of the shortest binary word $ p $ which contains all the information required to reconstruct $ x $ with the aid of some definite decoding procedure. A precise definition of this concept follows.

Let $ F(s) $ be an arbitrary partial recursive function. The complexity of the word $ x $ with respect to $ F $ is then defined as

$$ K _ {F} ( x ) = \ \left \{ \begin{array}{ll} \mathop{\rm min} \{ l ( p ) : F ( p ) = x \} & \textrm{if such a } p \textrm{ exists}, \\ \infty & \textrm{if there is no } p \textrm{ such that } F ( p ) = x . \\ \end{array} \right .$$

The word $ p $ such that $ F(p) = x $ is known as the code, or the program, by which $ F $ reconstructs the word $ x $. The following theorem is valid: There exists a partial recursive function $ F _ {0} $ (an optimal function) such that for any partial recursive function $ F $ the following inequality holds:

$$ K _ {F _ {0} } ( x ) \leq K _ {F} ( x ) + C _ {F} , $$

where $ C _ {F} $ is a constant which is independent of $ x $. This theorem yields an invariant (up to an additive constant) definition of complexity: The complexity $ K(x) $ of the word $ x $ is the complexity $ K _ {F _ {0} } (x) $ with respect to some optimal function $ F _ {0} $, fixed once and for all. Clearly, $ K (x) \leq l(x) + C $, where $ C $ is a constant independent of $ x $.
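For a toy decoding function the minimization in the definition of $ K _ {F} $ can actually be carried out by exhaustive search. The sketch below is an illustrative invention, not part of the article: the decoder $ F $, the search bound and all names are hypothetical, and the true complexity, taken with respect to an optimal universal function, is not computable in this way.

```python
from itertools import product

def F(p):
    """Toy partial recursive 'decoding' function: reads p two bits at
    a time; a pair (c, b) emits the bit b twice if c == '1' and once
    if c == '0'.  Undefined (None) on words of odd length."""
    if len(p) % 2 != 0:
        return None  # F is partial: no value here
    out = []
    for i in range(0, len(p), 2):
        reps = 2 if p[i] == '1' else 1
        out.append(p[i + 1] * reps)
    return ''.join(out)

def K_F(x, max_len=14):
    """Complexity of x with respect to F: length of the shortest
    program p with F(p) = x, found by enumerating programs in order
    of increasing length (infinity if none exists up to the bound)."""
    for n in range(max_len + 1):
        for bits in product('01', repeat=n):
            if F(''.join(bits)) == x:
                return n
    return float('inf')
```

For instance, $ K _ {F} $ of the word '0000' is 4 here (program '1010'), whereas under an optimal universal function such a terminating search does not exist, which is why $ K(x) $ is only semi-computable from above.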

Using $ K(x) $ one may also determine the complexity $ K(x, y) $ of pairs of words $ ( x, y ) $; to do this, the pairs $ (x, y) $ are effectively coded as one word $ c (x, y) $, and the complexity $ K(x, y) $ is understood to mean $ K (c (x, y)) $.
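One concrete choice of the effective coding $ c(x, y) $, sketched here with hypothetical helper names, doubles every bit of $ x $ and uses the pattern '01' as an unambiguous separator:

```python
def pair(x, y):
    """One effective coding c(x, y): double every bit of x, then the
    separator '01' (impossible inside a run of doubled bits), then y."""
    return ''.join(b + b for b in x) + '01' + y

def unpair(z):
    """Inverse of pair: read doubled pairs ('00' or '11') until the
    separator '01' is seen; the remainder is y."""
    i, x = 0, []
    while z[i:i + 2] != '01':
        x.append(z[i])
        i += 2
    return ''.join(x), z[i + 2:]
```

Note that $ l(c(x, y)) = 2 l(x) + l(y) + 2 $: the same kind of $ 2 l(x) $ overhead appears later in the inequality $ K(x) \leq KR (x) + 2 l (x) + C $.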

P. Martin-Löf studied the behaviour of the complexity of the initial segments of infinite binary sequences. Let $ \omega _ {n} $ denote the initial segment of a sequence $ \omega $, consisting of the $ n $ initial characters. Then almost all (according to the Lebesgue measure) infinite binary sequences $ \omega $ have the following properties: a) for infinitely many natural numbers $ n $ the inequality $ K ( \omega _ {n} ) \geq n - c $, where $ c $ is some constant, is valid; and b) for all natural numbers $ n $ larger than some value $ n _ \epsilon $, $ K ( \omega _ {n} ) \geq n - (1 + \epsilon ) \mathop{\rm log} _ {2} n $, where $ \epsilon $ is an arbitrary positive number (the Martin-Löf theorem). Conversely, for any sequence $ \omega $ there exist infinitely many $ n $ such that $ K ( \omega _ {n} ) \leq n - \mathop{\rm log} _ {2} n $.
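Property a) can be made plausible by a standard counting argument (added here for completeness, not part of the original article): there are at most $ 2 ^ {n - c} - 1 $ binary programs of length less than $ n - c $, and each program describes at most one word, so

$$ \# \{ x : l ( x ) = n , \ K ( x ) < n - c \} \leq 2 ^ {n - c} - 1 , $$

i.e. fewer than a fraction $ 2 ^ {-c} $ of all words of length $ n $ can be compressed by more than $ c $ bits.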

The concept of complexity is also employed in the study of algorithmic problems. Here, a more natural concept is the so-called complexity of solution of the word $ x $. The intuitive meaning of this concept is the minimum quantity of information required to find, for each number $ i \leq l(x) $, the $ i $-th character of the word $ x $ (the length of the word $ x $ need not be known). More exactly, the complexity of solution of the word $ x = x _ {1} \dots x _ {l(x)} $ with respect to a partial recursive function $ G (s, t) $ is defined as

$$ K R _ {G} ( x ) = \ \left \{ \begin{array}{ll} \mathop{\rm min} \{ l ( p ) : \forall i \leq l(x) , \ G ( p , i ) = x _ {i} \} & \textrm{if such a } p \textrm{ exists}, \\ \infty & \textrm{if there is no such } p . \\ \end{array} \right .$$

The following theorem is valid: There exists a partial recursive function $ G _ {0} $ (an optimal function) such that any other partial recursive function $ G $ satisfies the inequality $ KR _ {G _ {0} } (x) \leq KR _ {G} ( x ) + C _ {G} $, where $ C _ {G} $ is a constant independent of $ x $. The complexity of solution, $ KR (x) $, of the word $ x $ is the complexity $ KR _ {G _ {0} } (x) $ with respect to some optimal partial recursive function $ G _ {0} $ which has been fixed once and for all. Clearly, $ KR (x) \leq K (x) + C $ and $ K(x) \leq KR (x) + 2 l (x) + C $. It is possible, by using $ KR (x) $, to determine, for any set $ M $ of natural numbers, the complexity $ K (M, n) $ of the $ n $-segment of the set $ M $: $ K (M, n) = KR ( \omega _ {n} ) $, where $ \omega = x _ {1} \dots x _ {i} \dots $ is the characteristic sequence of the set $ M $ ($ x _ {i} = 1 $ if $ i \in M $, and $ x _ {i} = 0 $ if $ i \notin M $).
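The characteristic sequence is easy to make concrete; the sketch below (the helper name is a hypothetical choice) produces the word $ x _ {1} \dots x _ {n} $ for a given set:

```python
def char_sequence(M, n):
    """Characteristic word x_1 ... x_n of a set M of natural numbers:
    the i-th character is '1' exactly when i belongs to M."""
    return ''.join('1' if i in M else '0' for i in range(1, n + 1))
```

For a recursive set such as the even numbers, a single fixed program decides membership of every $ i $, which is why $ K (M, n) $ is then bounded by a constant.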

Algorithmic problems are usually represented as problems of belonging to some recursively enumerable set $ M $. If some value $ n $ is given, and the problem of belonging to $ M $ is restricted to the first $ n $ natural numbers only, then one obtains a bounded algorithmic problem. The magnitude $ K (M, n) $ may be intuitively understood as expressing the quantity of information required to solve this bounded problem. In particular, the set $ M $ is recursive if and only if $ K(M, n) $ is bounded from above by some constant.

It follows from the Martin-Löf theorem that there exist sets for which $ K (M, n) \sim n $. It was also found that sets with maximum degree of complexity exist even among sets defined by arithmetic predicates with two quantifiers. However, the following theorem applies to recursively enumerable sets: 1) for any recursively enumerable set $ M $ and any value $ n $, the inequality $ K ( M, n) \leq \mathop{\rm log} _ {2} n + C $, where $ C $ does not depend on $ n $, is valid; and 2) there exists a recursively enumerable set $ M $ such that for any $ n $ the inequality $ K (M,n) > \mathop{\rm log} _ {2} n $ holds. At the same time there exists a recursively enumerable set $ M $ such that, if the computation time is bounded by an arbitrary general recursive function $ t $, the time-bounded complexity satisfies $ K ^ {t} (M, n) \geq n/ c _ {t} $, where $ c _ {t} $ does not depend on $ n $.

Sets which are universal with respect to a certain type of reducibility may also be defined in these terms (cf. Universal set): A set $ M $ is weakly tabularly universal if and only if there exists an unbounded general recursive function $ f(n) $ such that the inequality $ K (M, n) \geq f(n) $ is valid for all values of $ n $.

In studying the complexity of bounded algorithmic problems other measures of complexity may also be employed, e.g. the minimum of the length of the coding of the normal algorithm which solves the bounded problem in question. It was found, however, that asymptotically exact relations (cf. Algorithm, complexity of description of an) exist between the complexities introduced above and the analogues of these complexities expressed as the minimum length of the coding of an algorithm (or as the minimum number of internal states of a Turing machine).

Another concept which is introduced in constructing algorithmic information theory is that of conditional complexity of a word $ x $, for a known $ y $, using a partial recursive function $ G (s, t) $:

$$ K _ {G} ( x \mid y ) = \ \left \{ \begin{array}{ll} \mathop{\rm min} \{ l ( p ) : G ( p , y ) = x \} & \textrm{if such a } p \textrm{ exists}, \\ \infty & \textrm{if there is no } p \textrm{ such that } G (p, y) = x . \\ \end{array} \right .$$

A theorem on the existence of an optimal function $ G _ {0} $ holds for this concept as well, and the conditional complexity $ K (x \mid y) $ of a word $ x $, when $ y $ is known, is defined as the complexity $ K _ {G _ {0} } (x \mid y) $ with respect to an optimal function $ G _ {0} $ fixed once and for all. The intuitive meaning of $ K(x \mid y) $ is the minimum quantity of information which must be added to the information contained in the word $ y $ for the reconstitution of the word $ x $ to be possible. Clearly, $ K (x \mid y) \leq K (x) + C $.

The next central concept of algorithmic information theory is the concept of the amount of information in an individual object $ y $ relative to an individual object $ x $:

$$ I ( y : x ) = K ( x ) - K ( x \mid y ) . $$

The quantity $ I (y:x) $ is called the algorithmic amount of information in $ y $ about $ x $. The quantities $ K(x) $ and $ K( x \mid y ) $ are sometimes called the algorithmic entropy of $ x $ and the algorithmic conditional entropy of $ x $ for a given $ y $. The formulas for the decomposition of the entropy, $ H(X, Y) = H(X) + H(Y \mid X) $, and for the commutativity (symmetry) of information, $ I(X:Y) = I (Y:X) $, remain valid in the algorithmic setting up to terms of order $ O ( \mathop{\rm log} K (x, y) ) $:

$$ | K ( x , y ) - [ K ( x ) + K ( y \mid x ) ] | \leq O ( \mathop{\rm log} K ( x , y ) ) , $$

$$ | I ( x : y ) - I ( y : x ) | \leq O ( \mathop{\rm log} K ( x , y ) ) . $$
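Since $ K $ is not computable, practical experiments replace it by the output length of a real compressor. The sketch below is such a stand-in, not part of the article: the helper names are hypothetical, zlib plays the role of the decompressor's inverse, and the conditional term $ K (x \mid y) $ is approximated by $ C(yx) - C(y) $.

```python
import zlib

def C(s):
    """Compressed length in bytes: a crude, computable upper-bound
    stand-in for the (uncomputable) complexity K."""
    return len(zlib.compress(s.encode(), 9))

def info(y, x):
    """Estimate of I(y : x) = K(x) - K(x | y), approximating the
    conditional complexity by C(y + x) - C(y)."""
    return C(x) - (C(y + x) - C(y))
```

On this estimate a word carries a large amount of information about itself, while an unrelated word carries almost none, mirroring the behaviour of the exact quantities.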

There exists a certain relation (due to Kolmogorov) between the algorithmic and the classical definitions of the amount of information (or, more exactly, between the complexity of a word according to Kolmogorov and the entropy of the Shannon distribution of frequencies): Let a number $ r $ be given, let the word $ x $ of length $ i \cdot r $ consist of $ i $ words of length $ r $, and let the $ k $-th word of length $ r $ appear in $ x $ with a frequency $ q _ {k} $ ($ k = 1 \dots 2 ^ {r} $); one then has

$$ \frac{K ( x ) }{i} \leq H + \alpha ( i ) , $$

where

$$ H = - \sum _ {k = 1 } ^ { 2 ^ {r} } q _ {k} \mathop{\rm log} _ {2} q _ {k} , $$

and $ \alpha (i) \rightarrow 0 $ as $ i \rightarrow \infty $. In the general case no closer connection between entropy and complexity can be established, because entropy is intended for the study of sequences subject to frequency laws only. Accordingly, for sequences $ \omega $ that are random in the sense of the measure corresponding to independent trials, one may establish the following relation between the quantities under study:
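The entropy side of this relation is easy to compute for concrete words; the following sketch (the helper name is a hypothetical choice) tallies the frequencies $ q _ {k} $ of the length-$ r $ blocks of $ x $ and returns $ H $:

```python
import math
from collections import Counter

def block_entropy(x, r):
    """Shannon entropy H = -sum q_k log2 q_k of the empirical
    distribution of the length-r blocks of the word x
    (len(x) is assumed to be a multiple of r)."""
    blocks = [x[j:j + r] for j in range(0, len(x), r)]
    i = len(blocks)
    return -sum((c / i) * math.log2(c / i) for c in Counter(blocks).values())
```

For example, the periodic word '01' repeated 100 times has $ H = 0 $ at $ r = 2 $, so the bound forces $ K(x)/i $ to be small, while '00011011' uses each of the four blocks of length 2 once and attains the maximum $ H = r = 2 $.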

$$ \lim\limits _ {i \rightarrow \infty } \frac{K ( \omega _ {ir} ) }{i} = H . $$

Similar considerations apply to an arbitrary ergodic stationary stochastic process.

References

[1] A.N. Kolmogorov, "Three approaches to the quantitative definition of information" Problems in Information Transmission , 1 (1965) pp. 1–7 Problemy Peredachi Informatsii , 1 : 1 (1965) pp. 3–11
[2] A.N. Kolmogorov, "On the logical foundations of information theory and probability theory" Problems Inform. Transmission , 5 : 3 (1969) pp. 1–4 Problemy Peredachi Informatsii , 5 : 3 (1969) pp. 3–7
[3] A.K. Zvonkin, L.A. Levin, "The complexity of finite objects and the developments of the concepts of information and randomness by means of the theory of algorithms" Russian Math. Surveys , 25 : 6 (1970) pp. 82–124 Uspekhi Mat. Nauk , 25 : 6 (1970) pp. 85–127
[4] Ya.M. Bardzin', "Complexity of programs to determine whether natural numbers not greater than $n$ belong to a recursively enumerable set" Soviet Math. Dokl. , 9 : 5 (1968) pp. 1251–1254 Dokl. Akad. Nauk SSSR , 182 : 6 (1968) pp. 1249–1252
[5] M.I. Kanovich, "Complexity of resolution of a recursively enumerable set as a criterion of its universality" Soviet Math. Dokl. , 11 : 5 (1970) pp. 1224–1228 Dokl. Akad. Nauk SSSR , 194 : 3 (1970) pp. 500–503

Comments

The theory of Kolmogorov complexity, or "algorithmic information theory" , originated in the independent work of R.J. Solomonoff, Kolmogorov, L. Levin, Martin-Löf and G.J. Chaitin. Recently, J. Hartmanis has refined the theory by also considering the amount of work (time complexity) involved in reconstructing the original data from its description.

References

[a1] J. Hartmanis, "Generalized Kolmogorov complexity and the structure of feasible computations" , Proc. 24th IEEE Symposium on Foundations of Computer Science (1983) pp. 439–445
[a2] R.J. Solomonoff, "A formal theory of inductive inference, Part I and Part II" Information and Control , 7 (1964) pp. 1–22
[a3] P. Martin-Löf, "The definition of random sequences" Information and Control , 9 (1966) pp. 602–619
[a4] G.J. Chaitin, "Randomness and mathematical proof" Scientific American , May (1975) pp. 47–52
[a5] G.J. Chaitin, "Algorithmic information theory" IBM J. Res. Dev. , 21 (1977) pp. 350–359
How to Cite This Entry:
Algorithmic information theory. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Algorithmic_information_theory&oldid=19163
This article was adapted from an original article by Ya.M. Barzdin' (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article