Namespaces
Variants
Actions

Complexity function

From Encyclopedia of Mathematics
Jump to: navigation, search

2020 Mathematics Subject Classification: Primary: 68R15 [MSN][ZBL]

of a word

The complexity function of a word w (finite, infinite or bi-infinite) over a finite alphabet A is the function p_w(n) that counts the number of distinct factors (substrings of consecutive symbols) of length n in w. More generally, the complexity function of a language, a set of finite words over an alphabet, counts the number of distinct words of given length.

For a string u of length at least n over an alphabet of size k we clearly have 1 \le p_u(n) \le k^n \ , the bounds being achieved by the constant word and a disjunctive word (for example, the Champernowne word) respectively. For infinite words u, we have p_u(n) bounded if u is ultimately periodic (a finite, possibly empty, sequence followed by a finite cycle). Conversely, if p_u(n) \le n for some n, then u is ultimately periodic.

An aperiodic sequence is one which is not ultimately periodic. An aperiodic sequence has strictly increasing complexity function (this is the Morse–Hedlund theorem), so p_u(n) \ge n+1.

A set S of finite binary words is balanced if for each n the subset S_n of words of length n has the property that the Hamming weight of the words in S_n takes at most two distinct values. A balanced sequence is one for which the set of factors is balanced. A balanced sequence has complexity function p(n) \le n+1.

A Sturmian word over a binary alphabet is one with complexity function p(n) = n+1 A sequence is Sturmian if and only if it is balanced and aperiodic. An example is the Fibonacci word. More generally, a Sturmian word over an alphabet of size k is one with complexity p(n) = n+k-1. An Arnoux-Rauzy word over a ternary alphabet has complexity 2n+1: an example is the Tribonacci word.

Let L be a language over an alphabet and define the function P_L(n) of a positive integer n to be the number of different words of length n in L The complexity function of a word is thus the complexity function of the language consisting of the factors of that word.

The complexity function of a language is less constrained than that of a word. For example, it may be bounded but not eventually constant: the complexity function of the regular language a(bb)^*a takes values 0 and 1 on odd and even n \ge 2 respectively. There is an analogue of the Morse–Hedlund theorem: if the complexity of L satisfies p_L(n) \le n for some n, then p_L is bounded and there is a finite language F such that L \subseteq \{ x y^k z : x,y,z \in F,\ k \in \mathbb{N} \} \ . A polynomial or sparse language is one for which the complexity function p(n) is bounded by a fixed power of n. An exponential language is one for which there exists k>1 such that there are infinitely many n for which p(n) > k^n. Words exist with complexity functions having growth intermediate between polynomial and exponential; however, a regular language is either polynomial or exponential.

The topological entropy of an infinite sequence u is defined by H_{\mathrm{top}}(u) = \lim_{n \rightarrow \infty} \frac{\log p_u(n)}{n \log k} \ .

The limit exists as the logarithm of the complexity function is a subadditive function: indeed, p(m+n) \le p(m) \cdot p(n). Every real number between 0 and 1 occurs as the topological entropy of some sequence, which may be taken to be a uniformly recurrent word or even uniquely ergodic.

For x a real number and b an integer \ge 2 then the complexity function of x base b is the complexity function p_{x,b}(n) of the sequence of digits of x written in base b. If x is an irrational number then p_{x,b}(n) \ge n+1; if x is rational then p_{x,b}(n) \le C for some constant C depending on x and b. It is conjectured that for algebraic irrational x the complexity is b^n (which would follow if all such numbers were normal) but all that is known in this case is that p_{x,b}(n) grows faster than any linear function of n.

The abelian complexity function p^{\text{ab}}(n) similarly counts the number of occurrences of distinct factors of given length n, where now we identify factors that differ only by a permutation of the positions. Clearly p^{\text{ab}}(n) \le p(n). The abelian complexity of a Sturmian sequence satisfies p^{\text{ab}}(n) = 2.

References

  • Allouche, Jean-Paul; Shallit, Jeffrey (2003). Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press. ISBN 978-0-521-82332-6 Zbl 1086.11015
  • Berstel, Jean; Lauve, Aaron; Reutenauer, Christophe; Saliola, Franco V. (2009). Combinatorics on words. Christoffel words and repetitions in words. CRM Monograph Series. 27. Providence, RI: American Mathematical Society. ISBN 978-0-8218-4480-9 Zbl 1161.68043
  • Berthé, Valérie; Rigo, Michel, eds. (2010). Combinatorics, automata, and number theory. Encyclopedia of Mathematics and its Applications. 135. Cambridge: Cambridge University Press. ISBN 978-0-521-51597-9 Zbl 1197.68006
  • Bugeaud, Yann (2012). Distribution modulo one and Diophantine approximation. Cambridge Tracts in Mathematics. 193. Cambridge: Cambridge University Press. ISBN 978-0-521-11169-0 Zbl 1260.11001.
  • Cassaigne, Julien; Nicolas, François (2010). "Factor complexity". In Berthé, Valérie; Rigo, Michel. Combinatorics, automata, and number theory. Encyclopedia of Mathematics and its Applications. 135. Cambridge: Cambridge University Press. pp. 163–247. ISBN 978-0-521-51597-9 Zbl 1216.68204
  • Lothaire, M. (2011). Algebraic combinatorics on words. Encyclopedia of Mathematics and Its Applications. 90. With preface by Jean Berstel and Dominique Perrin (Reprint of the 2002 hardback ed.). Cambridge University Press. ISBN 978-0-521-18071-9 Zbl 1221.68183
  • Pytheas Fogg, N. (2002). Substitutions in dynamics, arithmetics and combinatorics. Lecture Notes in Mathematics. 1794. Editors Berthé, Valérie; Ferenczi, Sébastien; Mauduit, Christian; Siegel, A. Berlin: Springer-Verlag. ISBN 3-540-44141-7 Zbl 1014.11015
How to Cite This Entry:
Complexity function. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Complexity_function&oldid=54528