Aitken Delta^2 process

One of the most famous methods for accelerating the convergence of a given sequence.

Let $( S _ { n } )$ be a sequence of numbers converging to $S$. The Aitken $\Delta ^ { 2 }$ process consists of transforming $( S _ { n } )$ into the new sequence $( T _ { n } )$ defined, for $n = 0,1 , \dots$, by

\begin{equation*} T _ { n } = \frac { S _ { n } S _ { n + 2 } - S _ { n + 1 } ^ { 2 } } { S _ { n + 2 } - 2 S _ { n + 1 } + S _ { n } } = S _ { n } - \frac { ( \Delta S _ { n } ) ^ { 2 } } { \Delta ^ { 2 } S _ { n } }, \end{equation*}

where $\Delta S _ { n } = S _ { n + 1 } - S _ { n }$ and $\Delta ^ { 2 } S _ { n } = \Delta S _ { n + 1 } - \Delta S _ { n } = S _ { n + 2 } - 2 S _ { n + 1 } + S _ { n }$.

The first formula above is numerically unstable since, when the terms are close to $S$, cancellation arises in both the numerator and the denominator. Such cancellation also occurs in the second formula, but only in the computation of a correction term to $S _ { n }$. Cancellation thus appears as a second-order error, so the second formula is more stable than the first, which is only used for theoretical purposes.
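
As an illustration, here is a minimal Python sketch of the transformation (not part of the original article; the helper name aitken is ours). It uses the stable second formula and guards against a vanishing denominator $\Delta ^ { 2 } S _ { n }$:

def aitken(S):
    """Aitken Delta^2 process applied to a list of numbers S.

    Returns the transformed sequence (T_0, ..., T_{N-2}), two terms
    shorter than S, using the stable form T_n = S_n - (dS_n)^2 / d2S_n.
    """
    T = []
    for n in range(len(S) - 2):
        d1 = S[n + 1] - S[n]                  # Delta S_n
        d2 = S[n + 2] - 2 * S[n + 1] + S[n]   # Delta^2 S_n
        # If Delta^2 S_n vanishes (numerically), T_n is undefined;
        # a common safeguard is to keep S_n unchanged.
        T.append(S[n] if d2 == 0 else S[n] - d1 * d1 / d2)
    return T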

Such a process was proposed in 1926 by A.C. Aitken, but it can be traced back to Japanese mathematicians of the 17th century.

An important algebraic property of an extrapolation algorithm, such as Aitken's, is its kernel, that is, the set of sequences which it transforms into a constant sequence. It is easy to see that the kernel of the Aitken $\Delta ^ { 2 }$ process is the set of sequences of the form $S _ { n } = S + \alpha \lambda ^ { n }$ for $n = 0,1 , \dots$, with $\lambda \neq 0,1$ or, in other words, such that, for all $n$, $a _ { 1 } ( S _ { n } - S ) + a _ { 2 } ( S _ { n + 1 } - S ) = 0$ with $a _ { 1 } + a _ { 2 } \neq 0$. If $| \lambda | < 1$, $( S _ { n } )$ converges to $S$. However, it must be noted that this result is true even if $| \lambda | > 1$, that is, even if the sequence $( S _ { n } )$ is diverging. In other words, the kernel is the set of sequences such that, for all $n$,

\begin{equation} \tag{a1} \frac { S _ { n + 1 } - S } { S _ { n } - S } = \lambda \neq 0,1. \end{equation}

If the Aitken process is applied to such a sequence, then $T _ { n } = S$ for all $n$.
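
For instance (a small check reusing the aitken sketch above; the limit, $\alpha$ and the two values of $\lambda$ are illustrative choices): any sequence $S _ { n } = S + \alpha \lambda ^ { n }$ is mapped to the constant sequence $S , S , \dots$, even for a diverging choice $| \lambda | > 1$:

S_limit, alpha = 3.0, 2.0
for lam in (0.5, -4.0):        # |lambda| < 1 (convergent), |lambda| > 1 (divergent)
    S = [S_limit + alpha * lam ** n for n in range(8)]
    print(aitken(S))           # every entry equals 3.0, up to roundoff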

This result also shows that the Aitken process is an extrapolation algorithm. Indeed, it consists of computing $T _ { n }$, $\alpha$ and $\lambda$ such that the interpolation conditions $S _ { n + i } = T _ { n } + \alpha \lambda ^ { n + i }$, $i = 0,1,2$, are satisfied.
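
Indeed (a short verification of this statement): subtracting consecutive interpolation conditions gives $\Delta S _ { n + i } = \alpha \lambda ^ { n + i } ( \lambda - 1 )$ for $i = 0 , 1$, hence $\lambda = \Delta S _ { n + 1 } / \Delta S _ { n }$ and $\lambda - 1 = \Delta ^ { 2 } S _ { n } / \Delta S _ { n }$, so that

\begin{equation*} T _ { n } = S _ { n } - \alpha \lambda ^ { n } = S _ { n } - \frac { \Delta S _ { n } } { \lambda - 1 } = S _ { n } - \frac { ( \Delta S _ { n } ) ^ { 2 } } { \Delta ^ { 2 } S _ { n } }, \end{equation*}

which is exactly the formula defining the process.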

Convergence of the sequence $( T _ { n } )$.

If $( S _ { n } )$ is an arbitrary convergent sequence, the sequence $( T _ { n } )$ obtained by the Aitken process can, in some cases, be non-convergent. Examples are known where $( T _ { n } )$ has two cluster points. However, if the sequence $( T _ { n } )$ converges, then its limit is also $S$, the limit of the sequence $( S _ { n } )$. It can be proved that if there are $a$, $b$ and $N$, $a < 1 < b$, such that $\Delta S _ { n + 1 } / \Delta S _ { n } \notin [ a , b ]$ for all $n \geq N$, then $( T _ { n } )$ converges to $S$.

Convergence acceleration properties of the Aitken process.

The problem here is to give conditions on $( S _ { n } )$ such that

\begin{equation*} \operatorname { lim } _ { n \rightarrow \infty } \frac { T _ { n } - S } { S _ { n } - S } = 0. \end{equation*}

In that case, $( T _ { n } )$ is said to converge faster than $( S _ { n } )$ or, in other words, the Aitken process is said to accelerate the convergence of $( S _ { n } )$.

Intuitively, it is easy to understand that if the sequence $( S _ { n } )$ is not too far away from a sequence satisfying (a1), then its convergence will be accelerated. Indeed, if there is a $\lambda \neq 1$ such that

\begin{equation} \tag{a2} \operatorname { lim } _ { n \rightarrow \infty } \frac { S _ { n + 1 } - S } { S _ { n } - S } = \lambda, \end{equation}

then $( T _ { n } )$ will converge faster than $( S _ { n + 1} )$. Sequences satisfying (a2) are called linear. So, the Aitken process accelerates the convergence of the set of linear sequences. Moreover, if, in addition, $\lambda \neq 0$, then $( T _ { n } )$ converges faster than $( S _ { n + 2} )$. The condition $\lambda \neq 0$ is not very restrictive, since, when $\lambda = 0$, the sequence $( S _ { n } )$ already converges sufficiently fast and does not need to be accelerated. Note that it is important to prove acceleration with respect to $( S _ { n+m } )$ with $m$ as large as possible (in general, $m$ corresponds to the last index used in the expression of $T _ { n }$, that is, $m = 2$ for the Aitken process), since in certain cases it is possible that $( T _ { n } )$ converges faster than $( S _ { n+m } )$ but not faster than $( S _ { n + m + 1 } )$ for some values of $m$. The Aitken process is optimal for accelerating linear sequences, which means that it is not possible to accelerate the convergence of all linear sequences by a process using fewer than three successive terms of the sequence, and that the Aitken process is the only process using three terms that is able to do so [a2]. It is the preceding acceleration result which makes the Aitken $\Delta ^ { 2 }$ process so popular, since many sequences coming out of well-known numerical algorithms satisfy (a2). This is, in particular, the case for the Rayleigh quotient method for computing the dominant eigenvalue of a matrix, for the Bernoulli method for obtaining the dominant zero of a polynomial, and for fixed-point iterations with linear convergence.
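
As a concrete illustration (a hypothetical sketch reusing the aitken function above): the fixed-point iteration $S _ { n + 1 } = \operatorname { cos } S _ { n }$ satisfies (a2) with $\lambda = - \operatorname { sin } S \approx - 0.67$, and the transformed sequence is visibly more accurate:

import math

S = [1.0]
for _ in range(12):
    S.append(math.cos(S[-1]))     # linear convergence, lambda = -sin(S)
T = aitken(S)
S_star = 0.7390851332151607       # fixed point of cos, for reference
# T_8 uses the same data as S_10 (namely S_8, S_9, S_10) and is far closer:
print(abs(S[10] - S_star), abs(T[8] - S_star))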

The Aitken process is also able to accelerate the convergence of some sequences for which $\lambda = 1$ in (a2). Such sequences are called logarithmic. They converge more slowly than the linear ones and they are the most difficult sequences to accelerate. Note that an algorithm able to accelerate the convergence of all logarithmic sequences cannot exist.

If the Aitken process is applied to the sequence $S _ { n } = \sum _ { i = 0 } ^ { n } c _ { i } t ^ { i }$ of partial sums of a series $f$, then $T _ { n }$ is identical to the Padé approximant $[ n + 1 / 1 ] _ { f } ( t )$ (cf. also Padé approximation). For example, apply the Aitken process to the sequence $( S _ { n } )$ of partial sums of the series $\operatorname { ln } ( 1 + t ) = t - t ^ { 2 } / 2 + t ^ { 3 } / 3 - \dots$. It is well known that this series converges for $- 1 < t \leq 1$. So, for $t = 1$, $10 ^ { 16 }$ terms of the series are needed to obtain $\operatorname { ln } 2$ with a precision of $10 ^ { - 16 }$. Applying the Aitken process to $( S _ { n } )$, then again to $( T _ { n } )$ and so on (a procedure called the iterated $\Delta ^ { 2 }$ process), this precision is achieved with only $23$ terms. Quite similar results can be obtained for $t = 2$, in which case the sequence $( S _ { n } )$ is diverging.
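
A sketch of this computation (illustrative Python, reusing aitken from above; in double precision, roundoff limits the final accuracy to about $10 ^ { - 16 }$):

import math

# 23 partial sums of ln 2 = 1 - 1/2 + 1/3 - ...
S, total = [], 0.0
for i in range(1, 24):
    total += (-1) ** (i + 1) / i
    S.append(total)

# Iterated Delta^2 process: reapply the transform as long as possible.
while len(S) >= 3:
    S = aitken(S)
print(S[0] - math.log(2))   # error near roundoff level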

There exist several generalizations of the Aitken $\Delta ^ { 2 }$ process for scalar sequences, the most well-known being the Shanks transformation, which is usually implemented via the $\varepsilon$-algorithm of P. Wynn. There are also vector generalizations of the Aitken process, adapted more specifically to the acceleration of vector sequences.
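
For reference, a minimal sketch of Wynn's recursion (again illustrative Python, not a definitive implementation): starting from $\varepsilon _ { - 1 } ^ { ( n ) } = 0$ and $\varepsilon _ { 0 } ^ { ( n ) } = S _ { n }$, the rule $\varepsilon _ { k + 1 } ^ { ( n ) } = \varepsilon _ { k - 1 } ^ { ( n + 1 ) } + 1 / ( \varepsilon _ { k } ^ { ( n + 1 ) } - \varepsilon _ { k } ^ { ( n ) } )$ is applied column by column; the even columns $\varepsilon _ { 2 k } ^ { ( n ) }$ are the Shanks transforms, and the column $\varepsilon _ { 2 } ^ { ( n ) }$ reproduces Aitken's $T _ { n }$:

def epsilon_algorithm(S):
    """Columns eps_1, eps_2, ... of Wynn's epsilon-algorithm for a list S.

    Breaks down if two neighbouring entries coincide; no guard here,
    for brevity.
    """
    cols = [[0.0] * (len(S) + 1), list(S)]   # columns eps_{-1} and eps_0
    while len(cols[-1]) >= 2:
        prev2, prev = cols[-2], cols[-1]
        cols.append([prev2[n + 1] + 1.0 / (prev[n + 1] - prev[n])
                     for n in range(len(prev) - 1)])
    return cols[2:]

# epsilon_algorithm(S)[1] (the eps_2 column) agrees with aitken(S).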

The Aitken process also leads to new methods in numerical analysis. For example, for solving the fixed-point problem $x = F ( x )$, consider the following method, which consists of applying the Aitken process to $u _ { 0 }$, $u _ { 1 }$ and $u _ { 2 }$:

\begin{equation*} u _ { 0 } = x _ { n }, \end{equation*}

\begin{equation*} u _ { 1 } = F ( u _ { 0 } ) , u _ { 2 } = F ( u _ { 1 } ), \end{equation*}

\begin{equation*} x _ { n + 1 } = u _ { 0 } - \frac { ( \Delta u _ { 0 } ) ^ { 2 } } { \Delta ^ { 2 } u _ { 0 } }. \end{equation*}

This method is due to J.F. Steffensen; its convergence is quadratic (as in the Newton method) under the assumption that $F ^ { \prime } ( x ) \neq 1$, which is the same assumption as in Newton's method applied to the equation $x - F ( x ) = 0$.
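
A minimal sketch of Steffensen's iteration (illustrative Python; the tolerance, iteration cap and starting point are arbitrary choices):

import math

def steffensen(F, x, tol=1e-14, max_iter=50):
    """Solve x = F(x) by one Aitken step per iteration (Steffensen's method)."""
    for _ in range(max_iter):
        u0, u1 = x, F(x)
        u2 = F(u1)
        d2 = u2 - 2.0 * u1 + u0          # Delta^2 u_0
        if d2 == 0:                      # iterates have (numerically) converged
            return u2
        x_next = u0 - (u1 - u0) ** 2 / d2
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

print(steffensen(math.cos, 1.0))   # converges quadratically to 0.7390851...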

For all these convergence acceleration methods, see [a1], [a2], [a3], [a4].

References

[a1] C. Brezinski, M. Redivo Zaglia, "Extrapolation methods. Theory and practice" , North-Holland (1991)
[a2] J.P. Delahaye, "Sequence transformations" , Springer (1988)
[a3] G. Walz, "Asymptotics and extrapolation" , Akad. Verlag (1996)
[a4] J. Wimp, "Sequence transformations and their applications" , Acad. Press (1981)