Namespaces
Variants
Actions

Complexity function

From Encyclopedia of Mathematics
Jump to: navigation, search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

2020 Mathematics Subject Classification: Primary: 68R15 [MSN][ZBL]

of a word $w$

The complexity function of a word $w$ (finite, infinite or bi-infinite) over a finite alphabet $A$ is the function $p_w(n)$ that counts the number of distinct factors (substrings of consecutive symbols) of length $n$ in $w$. More generally, the complexity function of a language, a set of finite words over an alphabet, counts the number of distinct words of given length.

For a string $u$ of length at least $n$ over an alphabet of size $k$ we clearly have $$ 1 \le p_u(n) \le k^n \ , $$ the bounds being achieved by the constant word and a disjunctive word (for example, the Champernowne word) respectively. For infinite words $u$, we have $p_u(n)$ bounded if $u$ is ultimately periodic (a finite, possibly empty, sequence followed by a finite cycle). Conversely, if $p_u(n) \le n$ for some $n$, then $u$ is ultimately periodic.

An aperiodic sequence is one which is not ultimately periodic. An aperiodic sequence has strictly increasing complexity function (this is the Morse–Hedlund theorem), so $p_u(n) \ge n+1$.

A set $S$ of finite binary words is balanced if for each $n$ the subset $S_n$ of words of length $n$ has the property that the Hamming weight of the words in $S_n$ takes at most two distinct values. A balanced sequence is one for which the set of factors is balanced. A balanced sequence has complexity function $p(n) \le n+1$.

A Sturmian word over a binary alphabet is one with complexity function $p(n) = n+1$ A sequence is Sturmian if and only if it is balanced and aperiodic. An example is the Fibonacci word. More generally, a Sturmian word over an alphabet of size $k$ is one with complexity $p(n) = n+k-1$. An Arnoux-Rauzy word over a ternary alphabet has complexity $2n+1$: an example is the Tribonacci word.

Let $L$ be a language over an alphabet and define the function $P_L(n)$ of a positive integer $n$ to be the number of different words of length $n$ in $L$ The complexity function of a word is thus the complexity function of the language consisting of the factors of that word.

The complexity function of a language is less constrained than that of a word. For example, it may be bounded but not eventually constant: the complexity function of the regular language $a(bb)^*a$ takes values 0 and 1 on odd and even $n \ge 2$ respectively. There is an analogue of the Morse–Hedlund theorem: if the complexity of $L$ satisfies $p_L(n) \le n$ for some $n$, then $p_L$ is bounded and there is a finite language $F$ such that $$ L \subseteq \{ x y^k z : x,y,z \in F,\ k \in \mathbb{N} \} \ . $$ A polynomial or sparse language is one for which the complexity function $p(n)$ is bounded by a fixed power of $n$. An exponential language is one for which there exists $k>1$ such that there are infinitely many $n$ for which $p(n) > k^n$. Words exist with complexity functions having growth intermediate between polynomial and exponential; however, a regular language is either polynomial or exponential.

The topological entropy of an infinite sequence $u$ is defined by $$ H_{\mathrm{top}}(u) = \lim_{n \rightarrow \infty} \frac{\log p_u(n)}{n \log k} \ . $$

The limit exists as the logarithm of the complexity function is a subadditive function: indeed, $p(m+n) \le p(m) \cdot p(n)$. Every real number between 0 and 1 occurs as the topological entropy of some sequence, which may be taken to be a uniformly recurrent word or even uniquely ergodic.

For $x$ a real number and $b$ an integer $\ge 2$ then the complexity function of $x$ base $b$ is the complexity function $p_{x,b}(n)$ of the sequence of digits of $x$ written in base $b$. If $x$ is an irrational number then $p_{x,b}(n) \ge n+1$; if x is rational then $p_{x,b}(n) \le C$ for some constant $C$ depending on $x$ and $b$. It is conjectured that for algebraic irrational $x$ the complexity is $b^n$ (which would follow if all such numbers were normal) but all that is known in this case is that $p_{x,b}(n)$ grows faster than any linear function of $n$.

The abelian complexity function $p^{\text{ab}}(n)$ similarly counts the number of occurrences of distinct factors of given length $n$, where now we identify factors that differ only by a permutation of the positions. Clearly $p^{\text{ab}}(n) \le p(n)$. The abelian complexity of a Sturmian sequence satisfies $p^{\text{ab}}(n) = 2$.

References

  • Allouche, Jean-Paul; Shallit, Jeffrey (2003). Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press. ISBN 978-0-521-82332-6 Zbl 1086.11015
  • Berstel, Jean; Lauve, Aaron; Reutenauer, Christophe; Saliola, Franco V. (2009). Combinatorics on words. Christoffel words and repetitions in words. CRM Monograph Series. 27. Providence, RI: American Mathematical Society. ISBN 978-0-8218-4480-9 Zbl 1161.68043
  • Berthé, Valérie; Rigo, Michel, eds. (2010). Combinatorics, automata, and number theory. Encyclopedia of Mathematics and its Applications. 135. Cambridge: Cambridge University Press. ISBN 978-0-521-51597-9 Zbl 1197.68006
  • Bugeaud, Yann (2012). Distribution modulo one and Diophantine approximation. Cambridge Tracts in Mathematics. 193. Cambridge: Cambridge University Press. ISBN 978-0-521-11169-0 Zbl 1260.11001.
  • Cassaigne, Julien; Nicolas, François (2010). "Factor complexity". In Berthé, Valérie; Rigo, Michel. Combinatorics, automata, and number theory. Encyclopedia of Mathematics and its Applications. 135. Cambridge: Cambridge University Press. pp. 163–247. ISBN 978-0-521-51597-9 Zbl 1216.68204
  • Lothaire, M. (2011). Algebraic combinatorics on words. Encyclopedia of Mathematics and Its Applications. 90. With preface by Jean Berstel and Dominique Perrin (Reprint of the 2002 hardback ed.). Cambridge University Press. ISBN 978-0-521-18071-9 Zbl 1221.68183
  • Pytheas Fogg, N. (2002). Substitutions in dynamics, arithmetics and combinatorics. Lecture Notes in Mathematics. 1794. Editors Berthé, Valérie; Ferenczi, Sébastien; Mauduit, Christian; Siegel, A. Berlin: Springer-Verlag. ISBN 3-540-44141-7 Zbl 1014.11015
How to Cite This Entry:
Complexity function. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Complexity_function&oldid=54528