# Code with correction of deletions and insertions

code for correction of deletions and insertions

A code intended for the correction of errors of two types encountered in the transmission of information. By a deletion of a letter in a word $\beta = b _ {1} \dots b _ {n}$ of length $n$ over some alphabet $B$ one means a transformation of $\beta$ into a word $\beta ^ \prime = b _ {1} \dots b _ {i-1} b _ {i+1} \dots b _ {n}$ of length $n - 1$, $1 \leq i \leq n$. The numbers $N _ {s} ( \beta )$ of words obtainable from $\beta$ by deletion of $s$ letters satisfies the following estimates

$$\left ( \begin{array}{c} \tau ( \beta ) - s + 1 \\ s \end{array} \right ) \ \leq \ N _ {s} ( \beta ) \ \leq \ \left ( \begin{array}{c} \tau ( \beta ) + s - 1 \\ s \end{array} \right ) .$$

Here $\tau ( \beta )$ is the number of series of $\beta$( by a series of a word $\beta = b _ {1} \dots b _ {n}$ one means a word $b _ {i+1} \dots b _ {j}$, $j \geq i + 1$, such that 1) $b _ {i+1} = \dots = b _ {j}$; 2) if $i \geq 1$, then $b _ {i} \neq b _ {i+1}$; and 3) if $j < n$, then $b _ {j+1} \neq b _ {j}$). In particular, $N _ {1} ( \beta ) = \tau ( \beta )$. By an insertion of a letter in a word $\beta = b _ {1} \dots b _ {n}$ one means a transformation of $\beta$ into a word $\beta ^ \prime = b _ {1} \dots b _ {i} b b _ {i+1} \dots b _ {n}$ of length $n + 1$, where $b \in B$ and $0 \leq i \leq n$. The number of words obtainable from an arbitrary word $\beta$ of length $n$ by the insertion of $s$ letters of an alphabet $B$ is equal to

$$\sum _ { j=0 } ^ { s } \left ( \begin{array}{c} n + s \\ j \end{array} \right ) ( r - 1 ) ^ {j} ,$$

where $r$ is the number of letters in $B$. A set $K$ of words over the alphabet $B$ is called a code for correction of $s$ deletions (insertions, deletions or insertions) if no word in $B$ can be obtained from two distinct words of $K$ as the result of $s$ or fewer deletions (insertions, deletions or insertions) of letters in each of them. The function defined on the pairs $( \beta _ {1} ,\ \beta _ {2} )$ of words in $B$ and equal to the minimum number of deletions and insertions of letters converting $\beta _ {1}$ into $\beta _ {2}$ is a metric. A set $K$ of words over the alphabet $B$ is a code for correction of $s$ deletions (insertions, deletions or insertions) if and only if the distance between any two distinct words of $K$ is greater than $2 s$, so that the above three definitions of a code are equivalent. An example of a code for correction of one deletion or one insertion is the set of words $\{ \beta = b _ {1} \dots b _ {n} \}$ of length $n$ over the alphabet $B = \{ 0 ,\ 1 \}$ for which $\sum _ {i=1} ^ {n} i b _ {i} \equiv 0$( $\mathop{\rm mod} \ n + 1$). The number of words in this code is equal to

$$\frac{1}{2 ( n + 1 )} \sum \phi (d) 2 ^ {( n + 1 ) / d} ,$$

where the sum is taken over all odd divisors $d$ of $n + 1$ and $\phi (d)$ is the Euler function; it is asymptotically maximal as $n \rightarrow \infty$.

How to Cite This Entry:
Code with correction of deletions and insertions. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Code_with_correction_of_deletions_and_insertions&oldid=44396
This article was adapted from an original article by V.I. Levenshtein (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article