# Complexity function

2020 Mathematics Subject Classification: *Primary:* 68R15 [MSN][ZBL]

*of a word $w$*

The complexity function of a word $w$ (finite, infinite or bi-infinite) over a finite alphabet $A$ is the function $p_w(n)$ that counts the number of distinct factors (substrings of consecutive symbols) of length $n$ in $w$. More generally, the complexity function of a language, a set of finite words over an alphabet, counts the number of distinct words of given length.

For a string $u$ of length at least $n$ over an alphabet of size $k$ we clearly have $$ 1 \le p_u(n) \le k^n \ , $$ the bounds being achieved by the constant word and a disjunctive word (for example, the Champernowne word) respectively. For infinite words $u$, we have $p_u(n)$ bounded if $u$ is ultimately periodic (a finite, possibly empty, sequence followed by a finite cycle). Conversely, if $p_u(n) \le n$ for some $n$, then $u$ is ultimately periodic.

An **aperiodic sequence** is one which is not ultimately periodic. An aperiodic sequence has strictly increasing complexity function (this is the **Morse–Hedlund theorem**), so $p_u(n) \ge n+1$.

A set $S$ of finite binary words is *balanced* if for each $n$ the subset $S_n$ of words of length $n$ has the property that the Hamming weight of the words in $S_n$ takes at most two distinct values. A **balanced sequence** is one for which the set of factors is balanced. A balanced sequence has complexity function $p(n) \le n+1$.

A **Sturmian word** over a binary alphabet is one with complexity function $p(n) = n+1$ A sequence is Sturmian if and only if it is balanced and aperiodic. An example is the Fibonacci word. More generally, a Sturmian word over an alphabet of size $k$ is one with complexity $p(n) = n+k-1$. An Arnoux-Rauzy word over a ternary alphabet has complexity $2n+1$: an example is the Tribonacci word.

Let $L$ be a language over an alphabet and define the function $P_L(n)$ of a positive integer $n$ to be the number of different words of length $n$ in $L$ The complexity function of a word is thus the complexity function of the language consisting of the factors of that word.

The complexity function of a language is less constrained than that of a word. For example, it may be bounded but not eventually constant: the complexity function of the regular language $a(bb)^*a$ takes values 0 and 1 on odd and even $n \ge 2$ respectively. There is an analogue of the Morse–Hedlund theorem: if the complexity of $L$ satisfies $p_L(n) \le n$ for some $n$, then $p_L$ is bounded and there is a finite language $F$ such that
$$
L \subseteq \{ x y^k z : x,y,z \in F,\ k \in \mathbb{N} \} \ .
$$
A *polynomial* or *sparse language* is one for which the complexity function $p(n)$ is bounded by a fixed power of $n$. An *exponential* language is one for which there exists $k>1$ such that there are infinitely many $n$ for which $p(n) > k^n$. Words exist with complexity functions having growth intermediate between polynomial and exponential; however, a regular language is either polynomial or exponential.

The *topological entropy* of an infinite sequence $u$ is defined by
$$
H_{\mathrm{top}}(u) = \lim_{n \rightarrow \infty} \frac{\log p_u(n)}{n \log k} \ .
$$

The limit exists as the logarithm of the complexity function is a subadditive function: indeed, $p(m+n) \le p(m) \cdot p(n)$. Every real number between 0 and 1 occurs as the topological entropy of some sequence, which may be taken to be a uniformly recurrent word or even uniquely ergodic.

For $x$ a real number and $b$ an integer $\ge 2$ then the complexity function of $x$ base $b$ is the complexity function $p_{x,b}(n)$ of the sequence of digits of $x$ written in base $b$.
If $x$ is an irrational number then $p_{x,b}(n) \ge n+1$; if *x* is rational then $p_{x,b}(n) \le C$ for some constant $C$ depending on $x$ and $b$. It is conjectured that for algebraic irrational $x$ the complexity is $b^n$ (which would follow if all such numbers were normal) but all that is known in this case is that $p_{x,b}(n)$ grows faster than any linear function of $n$.

The **abelian complexity function** $p^{\text{ab}}(n)$ similarly counts the number of occurrences of distinct factors of given length $n$, where now we identify factors that differ only by a permutation of the positions. Clearly $p^{\text{ab}}(n) \le p(n)$. The abelian complexity of a Sturmian sequence satisfies $p^{\text{ab}}(n) = 2$.

#### References

- Allouche, Jean-Paul; Shallit, Jeffrey (2003). Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press.
**ISBN**978-0-521-82332-6 Zbl 1086.11015 - Berstel, Jean; Lauve, Aaron; Reutenauer, Christophe; Saliola, Franco V. (2009). Combinatorics on words. Christoffel words and repetitions in words. CRM Monograph Series. 27. Providence, RI: American Mathematical Society.
**ISBN**978-0-8218-4480-9 Zbl 1161.68043 - Berthé, Valérie; Rigo, Michel, eds. (2010). Combinatorics, automata, and number theory. Encyclopedia of Mathematics and its Applications. 135. Cambridge: Cambridge University Press.
**ISBN**978-0-521-51597-9 Zbl 1197.68006 - Bugeaud, Yann (2012). Distribution modulo one and Diophantine approximation. Cambridge Tracts in Mathematics. 193. Cambridge: Cambridge University Press.
**ISBN**978-0-521-11169-0 Zbl 1260.11001. - Cassaigne, Julien; Nicolas, François (2010). "Factor complexity". In Berthé, Valérie; Rigo, Michel. Combinatorics, automata, and number theory. Encyclopedia of Mathematics and its Applications. 135. Cambridge: Cambridge University Press. pp. 163–247.
**ISBN**978-0-521-51597-9 Zbl 1216.68204 - Lothaire, M. (2011). Algebraic combinatorics on words. Encyclopedia of Mathematics and Its Applications. 90. With preface by Jean Berstel and Dominique Perrin (Reprint of the 2002 hardback ed.). Cambridge University Press.
**ISBN**978-0-521-18071-9 Zbl 1221.68183 - Pytheas Fogg, N. (2002). Substitutions in dynamics, arithmetics and combinatorics. Lecture Notes in Mathematics. 1794. Editors Berthé, Valérie; Ferenczi, Sébastien; Mauduit, Christian; Siegel, A. Berlin: Springer-Verlag.
**ISBN**3-540-44141-7 Zbl 1014.11015

**How to Cite This Entry:**

Complexity function.

*Encyclopedia of Mathematics.*URL: http://encyclopediaofmath.org/index.php?title=Complexity_function&oldid=54528