2020 Mathematics Subject Classification: Primary: 65-02 Secondary: 41A58, 65B99, 65D30
In
numerical analysis and in applied mathematics,
many methods produce sequences of numbers or vectors $(S_n)$ converging to
a limit $S$. This is the case in iterative methods but also when a
result $S(h)$ depends on a parameter $h$. An example is the
trapezium formula for approximating a definite
integral, or when a sequence of step-sizes $(h_n)$ is used,
thus leading to the sequence $(S_n=S(h_n))$. Quite often in practice, the
convergence of $(S_n)$ is slow and needs to be accelerated. For this
purpose, $(S_n)$ is transformed, by a sequence transformation $T$, into a
new sequence $(T_n)$ with the hope that $(T_n)$ will converge to the same
limit faster than $(S_n)$ or, in other words, that $T$ will accelerate the
convergence of $(S_n)$, which means 
$$\lim_{n\to\infty} \frac{\|T_n-S\|}{\|S_n-S\|} = 0.$$
Construction of a sequence transformation in the scalar case.
First, it is assumed that $(S_n)$
behaves as a certain function $R$ of $n$ depending on $k+1$ parameters
$a_1,\dots,a_k$ and $s$, and also, maybe, on some terms of the sequence $(S_n)$. These
parameters are determined by imposing the interpolation conditions
$$S_{n+i} = R(n+i,s,a_1,\dots,a_k),\quad i = 0,\dots,k.\tag{a1}$$
Then $s$ is taken as an approximation of the limit $S$ of the
sequence $(S_n)$. Obviously, $a_1,\dots,a_k$ and $s$, obtained as the solution of
(a1), depend on $n$. For that reason, $s$ will be denoted by $T_n$,
which defines the sequence transformation $T:(S_n) \to(T_n)$. If $(S_n)$ satisfies (a1)
for all $n$, where $s$ and the $a_i$ are constants independent of $n$,
then, by construction, $T_n = s$ for all $n$. Quite often, this condition is
also necessary. The set of sequences satisfying this condition is
called the kernel of the transformation $T$.
A sequence transformation constructed by such a procedure is called an
extrapolation method.
Example.
Assume that $(S_n)$ satisfies, for all $n$, $S_n = s + a_1a_2^n$ with
$a_2 \ne 1$. Writing down (a1) with $k=2$, and subtracting one relation from the
next one, gives 
$$\Delta S_{n+i} = S_{n+i+1} - S_{n+i} = a_1a_2^{n+i}(a_2-1) $$
for $i=0,1$. Thus, $a_2 = \Delta S_{n+1}/{\Delta S_n}$. Also, $a_1a_2^n = \Delta S_n/(a_2-1)$, which gives $a_1 a_2^n = (\Delta S_n)^2 /\Delta^2 S_n$
and finally 
$$s = S_n -a_1 a_2^n = T_n = S_n - \frac{(\Delta S_n)^2}{\Delta^2 S_n},$$
which is nothing else but the
Aitken $\Delta^2$ process. Another way of
recovering this process is to assume that the sequence $(S_n)$ satisfies,
for all $n$, $a_1(S_n-s) + a_2(S_{n+1}-s) = 0$ with $a_1+a_2 \ne 0$. Thus, no generality is lost by
assuming that $a_1+a_2 = 1$. As above, one finds by subtraction $(1-a_2)\Delta S_n + a_2 \Delta S_{n+1} = 0$, which leads
to $a_2 = - \Delta S_n/ \Delta^2 S_n$ and finally to $s = T_n = S_n+a_2 \Delta S_n$, which is the Aitken process again. It can
also be written as 
$$T_n = \frac{\begin{vmatrix} S_n & S_{n+1}\\ \Delta S_n & \Delta S_{n+1}\end{vmatrix}}{\begin{vmatrix} 1 & 1\\ \Delta S_n & \Delta S_{n+1}\end{vmatrix}}.\tag{a2}$$
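To make the transformation concrete, here is a minimal Python sketch of the Aitken $\Delta^2$ process; the function name and the test series (the partial sums of $\ln 2$) are illustrative choices, not part of the original text.

```python
def aitken(S):
    """Aitken Delta^2 process: T_n = S_n - (Delta S_n)^2 / Delta^2 S_n."""
    T = []
    for n in range(len(S) - 2):
        d1 = S[n + 1] - S[n]                  # Delta S_n
        d2 = S[n + 2] - 2 * S[n + 1] + S[n]   # Delta^2 S_n
        T.append(S[n] - d1 * d1 / d2)
    return T

# Partial sums of ln 2 = 1 - 1/2 + 1/3 - ... , a slowly converging series.
S, s = [], 0.0
for n in range(1, 12):
    s += (-1) ** (n + 1) / n
    S.append(s)

T = aitken(S)   # converges to ln 2 ~ 0.693147 much faster than S does
```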
Most sequence transformations can be written
as a quotient of two determinants. As mentioned above, the kernel of a
transformation depends on an integer $k$. To indicate this, denote $T_n$
by $T_k^{(n)}$. This raises the problem of computing the $T_k^{(n)}$ recursively,
without evaluating these determinants. It can be proved (see,
for example,
[BrReZa], pp. 18–26) that, since these quantities can be
written as a quotient of two determinants, they can be recursively
computed, for increasing values of $k$, by a triangular recursive
scheme, which means that $T_{k+1}^{(n)}$ is obtained from $T_k^{(n)}$ and $T_k^{(n+1)}$. Such a
procedure is called an extrapolation algorithm. The converse of this
result is also true, namely that any array of numbers $\{T_k^{(n)}\}$ that can be
computed by a triangular recursive scheme can be written as a ratio of
two determinants.
$E$-algorithm.
The most general extrapolation process currently
known is the $E$-algorithm. Its kernel is the set of sequences such
that $S_n = s + a_1g_1(n) + \cdots + a_k g_k(n)$ for all $n$, where the $(g_i(n))$ are known auxiliary sequences
which can depend on certain terms of the sequence $(S_n)$ itself. Solving
(a1), it is easy to see that 
$$T_k^{(n)} = \frac{D_k^{(n)} [(S)]}{D_k^{(n)}[(1)]},\tag{a3}$$
where, for an arbitrary sequence
$(u) = (u_n)$, 
$$D_k^{(n)}[(u)] = \begin{vmatrix} u_n & \dots & u_{n+k} \\
g_1(n) & \dots & g_1(n+k) \\
\vdots & & \vdots \\
g_k(n) & \dots & g_k(n+k) \end{vmatrix}$$
and where $(S)$ denotes the sequence $(S_n)$ and $(1)$ the sequence
whose terms are all equal to one.
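Formula (a3) can also be evaluated directly, which is convenient for cross-checking the recursive algorithms below. Here is a small sketch under the stated conventions; the helper name `T_ratio` and the array layout `g[i-1][m]` $= g_i(m)$ are illustrative assumptions.

```python
import numpy as np

def T_ratio(S, g, k, n):
    """T_k^{(n)} of (a3) as a ratio of the determinants D_k^{(n)}."""
    def D(u):
        # First row: u_n ... u_{n+k}; then rows g_i(n) ... g_i(n+k), i = 1..k.
        rows = [u[n:n + k + 1]] + [g[i][n:n + k + 1] for i in range(k)]
        return np.linalg.det(np.array(rows, dtype=float))
    return D(list(S)) / D([1.0] * len(S))
```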
These quantities can be recursively computed by the $E$-algorithm,
whose rules are as follows: for $k, n = 0,1,\dots$, 
$$T_{k+1}^{(n)} = T_k^{(n)} - \frac{\Delta T_k^{(n)}}{\Delta g_{k,k+1}^{(n)}} 
g_{k,k+1}^{(n)}$$
$$g_{k+1,i}^{(n)} = g_{k,i}^{(n)} - \frac{\Delta g_{k,i}^{(n)}}{\Delta g_{k,k+1}^{(n)}} 
g_{k,k+1}^{(n)}, \quad i>k+1,$$
with $T_0^{(n)} = S_n$ and $g_{0,i}^{(n)} = g_i(n)$ and where the operator $\Delta$ acts on the upper
indices $n$.
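A minimal sketch of these rules in Python, assuming the sequence and the auxiliary sequences are given as arrays; the names, the layout `g[i-1]` $= (g_i(n))_n$, and the absence of safeguards against breakdown are all illustrative simplifications.

```python
def e_algorithm(S, g, K):
    """E-algorithm: returns T[k][n] = T_k^{(n)} for k = 0..K.
    S is (S_n); g[i-1] is the auxiliary sequence (g_i(n)); K <= len(g)."""
    T = [list(S)]                             # T_0^{(n)} = S_n
    G = [[list(gi) for gi in g]]              # g_{0,i}^{(n)} = g_i(n)
    for k in range(K):
        gk = G[k][k]                          # g_{k,k+1}^{(n)}
        Tk = T[k]
        T.append([Tk[n] - (Tk[n + 1] - Tk[n]) / (gk[n + 1] - gk[n]) * gk[n]
                  for n in range(len(Tk) - 1)])
        # Update all auxiliary columns (only those with i > k+1 are used later).
        G.append([[gi[n] - (gi[n + 1] - gi[n]) / (gk[n + 1] - gk[n]) * gk[n]
                   for n in range(len(gi) - 1)] for gi in G[k]])
    return T
```

With $g_i(n) = \Delta S_{n+i-1}$ this reproduces the Shanks transformation discussed below, and with $g_i(n) = x_n^i$ the Richardson process.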
For the $E$-algorithm it can be proved that if $S_n = S+a_1g_1(n) + a_2 g_2(n)+\cdots$, where the $(g_i)$ form
an asymptotic series (that is, $g_{i+1}(n) = o(g_i(n))$ when $n$ goes to infinity) and
under certain additional technical assumptions, then, for any fixed
value of $k$, $(T_{k+1}^{(n)})$ tends to $S$ faster than $(T_k^{(n)})$ as $n$ tends to
infinity. This result is quite important since it shows that, for
accelerating the convergence of a sequence $(S_n)$, it is necessary to
know an asymptotic expansion of the error $S_n-S$. Thus, there is a close
connection between extrapolation and asymptotics, as explained in
[Wa].
Generalization.
The Aitken process was generalized by D. Shanks,
who considered a kernel of the form 
$$a_1(S_n-s) + \cdots + a_k(S_{n+k-1}-s) = 0$$
with $a_1+\cdots+a_k \ne 0$. The corresponding
$(T_k^{(n)})$ are given by the ratio of determinants (a3) with, in this case,
$g_i(n) = \Delta S_{n+i-1}$. It is an extension of (a2). These $(T_k^{(n)})$ can be recursively computed
by the $\varepsilon$-algorithm of P. Wynn, whose rules are:
$$\varepsilon_{k+1}^{(n)} = \varepsilon_{k-1}^{(n+1)} + [ \varepsilon_{k}^{(n+1)} - \varepsilon_{k}^{(n)}]^{-1},\quad k,n=0,1,\dots,$$
with $\varepsilon_{-1}^{(n)} = 0$ and
$\varepsilon_{0}^{(n)} = S_n$ for $n=0,1,\dots$, and one obtains $\varepsilon_{2k}^{(n)} = T_k^{(n)}$. The quantities $\varepsilon_{2k+1}^{(n)}$ are intermediate
results. This algorithm is related to
Padé approximation. Indeed, if $(S_n)$ is the
sequence of partial sums of a series $f$ at the point $t$, then $\varepsilon_{2k}^{(n)} = [(n+k)/k]_f(t)$.
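A minimal sketch of the scalar $\varepsilon$-algorithm (the function name is an illustrative choice; the rule breaks down when two consecutive entries of a column coincide, and no safeguard is included here):

```python
def epsilon_algorithm(S):
    """Wynn's scalar epsilon-algorithm; returns the columns e_0, e_1, ...
    The even columns satisfy e_{2k}^{(n)} = T_k^{(n)}."""
    eps = [[0.0] * (len(S) + 1), list(S)]    # columns k = -1 and k = 0
    while len(eps[-1]) > 1:
        prev, cur = eps[-2], eps[-1]
        eps.append([prev[n + 1] + 1.0 / (cur[n + 1] - cur[n])
                    for n in range(len(cur) - 1)])
    return eps[1:]                           # drop the artificial column k = -1
```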
Among the well-known extrapolation algorithms, there is also the
Richardson process (cf. also
Richardson extrapolation), whose kernel
is the set of sequences of the form $S_n = s + a_1x_n + \cdots + a_k x_n^k$ where $(x_n)$ is an auxiliary
known sequence. Thus, this process corresponds to polynomial
extrapolation at the point $0$. The $T_k^{(n)}$ can again be written as (a3)
with $g_i(n) = x_n^i$ and they can be recursively computed by the Neville–Aitken
scheme for constructing the interpolation polynomial.
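A sketch of this process, with the Neville–Aitken recursion written for the value of the interpolation polynomial at the point $0$ (an illustrative implementation, not from the original text):

```python
def richardson(S, x, K):
    """Richardson process: T[k][n] extrapolates (x_m, S_m), m = n..n+k,
    by a polynomial of degree k evaluated at 0."""
    T = [list(S)]
    for k in range(K):
        Tk = T[k]
        T.append([(x[n] * Tk[n + 1] - x[n + k + 1] * Tk[n])
                  / (x[n] - x[n + k + 1])
                  for n in range(len(Tk) - 1)])
    return T
```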
Obviously, an extrapolation algorithm will provide good approximations
of the limit $S$ of the sequence $(S_n)$ or, in other words, the
transformation $T$ will accelerate the convergence, if the function
$R$ in (a1) well describes the exact behaviour of the sequence
$(S_n)$. This is the case, for example, in the
Romberg method for accelerating the convergence of
the sequence obtained by the
trapezium formula for computing a definite
integral. Indeed, using Euler–MacLaurin expansion (cf.
Euler–MacLaurin formula), it can be proved
that the error can be written as a series in $h^2$ and the Romberg
method is based on polynomial extrapolation at $0$ by a polynomial in
$h^2$. Note that, sometimes, extrapolation algorithms are able to sum
diverging sequences and series.
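As an illustration, here is a minimal Romberg sketch built on the trapezium formula: with the step-sizes $h_n = (b-a)/2^n$, the Neville–Aitken recursion in $h^2$ takes the familiar form with the factors $4^{k+1}$. Function names and the test integral are illustrative choices.

```python
import math

def trapezium(f, a, b, m):
    """Composite trapezium rule with m subintervals."""
    h = (b - a) / m
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, m)))

def romberg(f, a, b, K):
    """Romberg method: polynomial extrapolation in h^2 of S(h_n)."""
    R = [[trapezium(f, a, b, 2 ** n) for n in range(K + 1)]]
    for k in range(K):
        Rk = R[k]
        R.append([(4 ** (k + 1) * Rk[n + 1] - Rk[n]) / (4 ** (k + 1) - 1)
                  for n in range(len(Rk) - 1)])
    return R

print(romberg(math.sin, 0.0, math.pi, 4)[4][0])   # integral of sin on [0, pi] is 2
```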
There exist many other extrapolation processes. It is important to
define many such processes since each of them is only able to
accelerate the convergence of certain classes of sequences and, as has
been proved by J.P. Delahaye and B. Germain-Bonne
[DeGe], a universal algorithm able to accelerate the
convergence of all convergent sequences cannot exist. This is because
this class is too large. Even for smaller classes, such as the class
of monotonic sequences, such a universal algorithm cannot exist.
Vector sequences.
Clearly, for accelerating the convergence of
vector sequences it is possible to use a scalar extrapolation method
for each component of the vectors. However, in practice, vector
sequences are often generated by an iterative process and applying a
scalar transformation separately on each component does not take into
account the connections existing between the various components. Thus,
it is better to use an extrapolation algorithm specially built for
vector sequences. So, there exist vector variants of most scalar
algorithms. Quite often, such processes are related to projection
methods
[Br].
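One classical instance is the vector $\varepsilon$-algorithm of Wynn, in which the inverse in the scalar rule is replaced by the Samelson inverse $y^{-1} = y/(y,y)$; a minimal sketch (illustrative names, no safeguard against breakdown):

```python
import numpy as np

def vector_epsilon(S):
    """Wynn's vector epsilon-algorithm; the even columns hold the
    accelerated vectors."""
    S = [np.asarray(v, dtype=float) for v in S]
    eps = [[np.zeros_like(S[0])] * (len(S) + 1), S]   # columns k = -1 and k = 0
    while len(eps[-1]) > 1:
        prev, cur = eps[-2], eps[-1]
        col = []
        for n in range(len(cur) - 1):
            y = cur[n + 1] - cur[n]
            col.append(prev[n + 1] + y / np.dot(y, y))   # Samelson inverse
        eps.append(col)
    return eps[1:]
```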
On extrapolation methods, see
[BrReZa],
[De] and
[Wi]. FORTRAN subroutines of many extrapolation
algorithms can be found in
[BrReZa]. Various applications are described in
[We].
References
[Br] C. Brezinski, "Projection methods for systems of equations", North-Holland (1997) MR1616573 Zbl 0889.65023
[BrReZa] C. Brezinski, M. Redivo Zaglia, "Extrapolation methods. Theory and practice", North-Holland (1991) MR1140920 Zbl 0744.65004
[De] J.P. Delahaye, "Sequence transformations", Springer (1988) MR0953542 Zbl 0652.65004
[DeGe] J.P. Delahaye, B. Germain-Bonne, "Résultats négatifs en accélération de la convergence", Numer. Math., 35 (1980) pp. 443–457 MR0593838 Zbl 0423.65003
[Wa] G. Walz, "Asymptotics and extrapolation", Akad. Verlag (1996) MR1386191 Zbl 0852.65002 Zbl 0872.41014
[We] E.J. Weniger, "Nonlinear sequence transformations for the acceleration of convergence and the summation of divergent series", Comput. Physics Reports, 10 (1989) pp. 189–371
[Wi] J. Wimp, "Sequence transformations and their applications", Acad. Press (1981) MR0615250 Zbl 0566.47018