# Extrapolation algorithm

2010 Mathematics Subject Classification: Primary: 65-02 Secondary: 41A5865B9965D30 [MSN][ZBL]

In numerical analysis and in applied mathematics, many methods produce sequences of numbers or vectors $(S_n)$ converging to a limit $S$. This is the case in iterative methods but also when a result $S(h)$ depends on a parameter $h$. An example is the trapezium formula for approximating a definite integral, or when a sequence of step-sizes $(h_n)$ is used, thus leading to the sequence $(S_n=S(h_n))$. Quite often in practice, the convergence of $(S_n)$ is slow and needs to be accelerated. For this purpose, $(S_n)$ is transformed, by a sequence transformation $T$, into a new sequence $(T_n)$ with the hope that $(T_n)$ will converge to the same limit faster than $(S_n)$ or, in other words, that $T$ will accelerate the convergence of $(S_n)$, which means $$\lim_{n\to\infty} \frac{\|T_n-S\|}{\|S_n-S\|} = 0.$$

## Construction of a sequence transformation in the scalar case.

First, it is assumed that $(S_n)$ behaves as a certain function $R$ of $n$ depending on $k+1$ parameters $a_1,\dots,a_k$ and $s$, and also, maybe, on some terms of the sequence $(S_n)$. These parameters are determined by imposing the interpolation conditions

$$S_{n+i} = R(n+i,s,a_1,\dots,a_k),\quad i = 0,\dots,k.\tag{a1}$$ Then $s$ is taken as an approximation of the limit $S$ of the sequence $(S_n)$. Obviously, $a_1,\dots,a_k$ and $s$, obtained as the solution of (a1), depend on $n$. For that reason, $s$ will be denoted by $T_n$, which defines the sequence transformation $T:(S_n) \to(T_n)$. If $(S_n)$ satisfies (a1) for all $n$, where $s$ and the $a_i$ are constants independent of $n$, then, by construction, $T_n = s$ for all $n$. Quite often, this condition is also necessary. The set of sequences satisfying this condition is called the kernel of the transformation $T$.

A sequence transformation constructed by such a procedure is called an extrapolation method.

## Example.

Assume that $(S_n)$ satisfies, for all $n$, $S_n = s + a_1a_2^n$ with $a_2 \ne 1$. Writing down (a1) with $k=2$, and subtracting one relation from the next one, gives $$\Delta S_{n+i} = S_{n+i+1} - S_{n+i} = a_1a_2^{n+i}(a_2-1)$$ for $i=0,1$. Thus, $a_2 = \Delta S_{n+1}/{\Delta S_n}$. Also, $a_1a_2^n = \Delta S_n/(a_2-1)$, which gives $a_1 a_2^n = (\Delta S_n)^2 /\Delta^2 S_n$ and finally $$s = S_n -a_1 a_2^n = T_n = S_n - \frac{(\Delta S_n)^2}{\Delta^2 S_n},$$ which is nothing else but the Aitken $\Delta^2$ process. Another way of recovering this process is to assume that the sequence $(S_n)$ satisfies, for all $n$, $a_1(S_n-s) + a_2(S_{n+1}-s) = 0$ with $a_1+a_2 \ne 0$. So, the generality is not restricted by assuming that $a_1+a_2 = 1$. As above, one finds by subtraction $(1-a_2)\Delta S_n + a_2 \Delta S_{n+1} = 0$, which leads to $a_2 = - \Delta S_n/ \Delta^2 S_n$ and finally to $s = T_n = S_n+a_2 \Delta S_n$, which is the Aitken process again. It can also be written as $$T_n = \frac{\begin{vmatrix} S_n & S_{n+1}\\ \Delta S_n & \Delta S_{n+1}\end{vmatrix}} {\begin{vmatrix} 1 & 1\\ \Delta S_n & \Delta S_{n+1}\end{vmatrix}}.\tag{a2}$$ Most sequence transformations can be written as a quotient of two determinants. As mentioned above, the kernel of a transformation depends on an integer $k$. To indicate this, denote $T_n$ by $T_k^{(n)}$. Thus, the problem of the recursive computation of the $T_k^{(n)}$ without computing these determinants arises. It can be proved (see, for example, [BrReZa], pp. 18–26) that, since these quantities can be written as a quotient of two determinants, they can be recursively computed, for increasing values of $k$, by a triangular recursive scheme, which means that $T_{k+1}^{(n)}$ is obtained from $T_k^{(n)}$ and $T_k^{(n+1)}$. Such a procedure is called an extrapolation algorithm. The converse of this result is also true, namely that any array of numbers $\{T_k^{(n)}\}$ that can be computed by a triangular recursive scheme can be written as a ratio of two determinants.

## $E$-algorithm.

The most general extrapolation process currently known is the $E$-algorithm. Its kernel is the set of sequences such that $S_n = s + a_1g_1(n) + \cdots + a_k g_k(n)$ for all $n$, where the $(g_i(n))$ are known auxiliary sequences which can depend on certain terms of the sequence $(S_n)$ itself. Solving (a1), it is easy to see that $$T_k^{(n)} = \frac{D_k^{(n)} [(S)]}{D_k^{(n)}[(1)]},\tag{a3}$$ where, for an arbitrary sequence $(u) = (u_n)$, $$D_k^{(n)}[(u)] = \begin{vmatrix} u_n & \dots & u_{n+k} \\ g_1(n) & \dots & g_1(n+k) \\ \vdots & \dots & \vdots \\ g_k(n) & \dots & g_k(n+k) \\ \end{vmatrix}$$ and where $(S)$ denotes the sequence $(S_n)$ and $(1)$ the sequence whose terms are all equal to one.

These quantities can be recursively computed by the $E$-algorithm, whose rules are as follows: for $k, n = 0,1,\dots$, $$T_{k+1}^{(n)} = T_k^{(n)} - \frac{\Delta T_k^{(n)}}{\Delta g_{k,k+1}^{(n)}} g_{k,k+1}^{(n)}$$

$$g_{k+1,i}^{(n)} = g_{k,i}^{(n)} - \frac{\Delta g_{k,i}^{(n)}}{\Delta g_{k,k+1}^{(n)}} g_{k,k+1}^{(n)}, \quad i>k+1,$$ with $T_0^{(n)} = S_n$ and $g_{0,i}^{(n)} = g_i(n)$ and where the operator $\Delta$ acts on the upper indices $n$.

For the $E$-algorithm it can be proved that if $S_n = S+a_1g_1(n) + a_2 g_2(n)+\cdots$, where the $(g_i)$ form an asymptotic series (that is, $g_{i+1}(n) = o(g_i(n))$ when $n$ goes to infinity) and under certain additional technical assumptions, then, for any fixed value of $k$, $(T_{ k+1 }^{(n)})$ tends to $S$ faster than $(T_k^{(n)})$ as $n$ tends to infinity. This result is quite important since it shows that, for accelerating the convergence of a sequence $(S_n)$, it is necessary to know an asymptotic expansion of the error $S_n-S$. Thus, there is a close connection between extrapolation and asymptotics, as explained in [Wa].

## Generalization.

The Aitken process was generalized by D. Shanks, who considered a kernel of the form $$a_1(S_n-s) + \cdots + a_k(S_{n+k-1}-s) = 0$$ with $a_1+\cdots+a_k \ne 0$. The corresponding $(T_k^{(n)})$ are given by the ratio of determinants (a3) with, in this case, $g_i(n) = \Delta S_{n+i-1}$. It is an extension of (a2). These $(T_k^{(n)})$ can be recursively computed by the $\def\e{\varepsilon} \e$-algorithm of P. Wynn, whose rules are: $$\e_{k+1}^{(n)} = \e_{k-1}^{(n+1)} + [ \e_{k}^{(n+1)} - \e_{k}^{(n)}]^{-1},\ k,n=0,1,\dots,$$ with $\e_{-1}^{(n)} = 0$ and $\e_{0}^{(n)} = S_n$ for $n=0,1,\dots$, and one obtains $\e_{2k}^{(n)} = T_k^{(n)}$. The quantities $\e_{2k+1}^{(n)}$ are intermediate results. This algorithm is related to Padé approximation. Indeed, if $(S_n)$ is the sequence of partial sums of a series $f$ at the point $t$, then $\e_{2k}^{(n)} = [(n+k)/k]_f(t)$.

Among the well-known extrapolation algorithms, there is also the Richardson process (cf. also Richardson extrapolation), whose kernel is the set of sequences of the form $S_n = s + a_1x_n + \cdots + a_k x_n^k$ where $(x_n)$ is an auxiliary known sequence. Thus, this process corresponds to polynomial extrapolation at the point $0$. The $T_k^{(n)}$ can again be written as (a3) with $g_i(n) = x_k^i$ and they can be recursively computed by the Neville–Aitken scheme for constructing the interpolation polynomial.

Obviously, an extrapolation algorithm will provide good approximations of the limit $S$ of the sequence $(S_n)$ or, in other words, the transformation $T$ will accelerate the convergence, if the function $R$ in (a1) well describes the exact behaviour of the sequence $(S_n)$. This is the case, for example, in the Romberg method for accelerating the convergence of the sequence obtained by the trapezium formula for computing a definite integral. Indeed, using Euler–MacLaurin expansion (cf. Euler–MacLaurin formula), it can be proved that the error can be written as a series in $h^2$ and the Romberg method is based on polynomial extrapolation at $0$ by a polynomial in $h^2$. Note that, sometimes, extrapolation algorithms are able to sum diverging sequences and series.

There exist many other extrapolation processes. It is important to define many such processes since each of them is only able to accelerate the convergence of certain classes of sequences and, as has been proved by J.P. Delahaye and B. Germain-Bonne [DeGe], a universal algorithm able to accelerate the convergence of all convergent sequences cannot exist. This is because this class is too large. Even for smaller classes, such as the class of monotonic sequences, such a universal algorithm cannot exist.

## Vector sequences.

Clearly, for accelerating the convergence of vector sequences it is possible to use a scalar extrapolation method for each component of the vectors. However, in practice, vector sequences are often generated by an iterative process and applying a scalar transformation separately on each component does not take into account the connections existing between the various components. Thus, it is better to use an extrapolation algorithm specially built for vector sequences. So, there exists vector variants of most scalar algorithms. Quite often, such processes are related to projection methods [Br].

On extrapolation methods, see [BrReZa], [De] and [Wi]. FORTRAN subroutines of many extrapolation algorithms can be found in [BrReZa]. Various applications are described in [We].

How to Cite This Entry:
Extrapolation algorithm. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Extrapolation_algorithm&oldid=21562
This article was adapted from an original article by C. Brezinski (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article