Difference between revisions of "Aitken Delta^2 process"
(Importing text file) |
(latex details) |
||
(2 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
+ | <!--This article has been texified automatically. Since there was no Nroff source code for this article, | ||
+ | the semi-automatic procedure described at https://encyclopediaofmath.org/wiki/User:Maximilian_Janisch/latexlist | ||
+ | was used. | ||
+ | If the TeX and formula formatting is correct, please remove this message and the {{TEX|semi-auto}} category. | ||
+ | |||
+ | Out of 100 formulas, 100 were replaced by TEX code.--> | ||
+ | |||
+ | {{TEX|semi-auto}}{{TEX|done}} | ||
One of the most famous methods for accelerating the convergence of a given sequence. | One of the most famous methods for accelerating the convergence of a given sequence. | ||
− | Let | + | Let $( S _ { n } )$ be a sequence of numbers converging to $S$. The Aitken $\Delta ^ { 2 }$ process consists of transforming $( S _ { n } )$ into the new sequence $( T _ { n } )$ defined, for $n = 0,1 , \dots$, by |
− | + | \begin{equation*} T _ { n } = \frac { S _ { n } S _ { n + 2 } - S _ { n + 1 } ^ { 2 } } { S _ { n + 2 } - 2 S _ { n + 1 } + S _ { n } } = S _ { n } - \frac { \Delta S _ { n } } { \Delta ^ { 2 } S _ { n } }, \end{equation*} | |
− | where | + | where $\Delta S _ { n } = S _ { n + 1 } - S _ { n }$ and $\Delta ^ { 2 } S _ { n } = \Delta S _ { n + 1 } - \Delta S _ { n } = S _ { n + 2 } - 2 S _ { n + 1 } + S _ { n }$. |
− | The first formula above is numerically unstable since, when the terms are close to | + | The first formula above is numerically unstable since, when the terms are close to $S$, cancellation arises in the numerator and in the denominator. Of course, such a cancellation also occurs in the second formula, however only in the computation of a correcting term to $S _ { n }$. Thus, cancellation appears as a second-order error and it follows that the second formula is more stable than the first one, which is only used for theoretical purposes. |
Such a process was proposed in 1926 by A.C. Aitken, but it can be traced back to Japanese mathematicians of the 17th century. | Such a process was proposed in 1926 by A.C. Aitken, but it can be traced back to Japanese mathematicians of the 17th century. | ||
− | An important algebraic property of an [[Extrapolation algorithm|extrapolation algorithm]], such as Aitken's, is its kernel, that is the set of sequences which are transformed into a constant sequence. It is easy to see that the kernel of the Aitken | + | An important algebraic property of an [[Extrapolation algorithm|extrapolation algorithm]], such as Aitken's, is its kernel, that is the set of sequences which are transformed into a constant sequence. It is easy to see that the kernel of the Aitken $\Delta ^ { 2 }$ process is the set of sequences of the form $S _ { n } = S + \alpha \lambda ^ { n }$ for $n = 0,1 , \dots$, with $\lambda \neq 0,1$ or, in other words, such that, for all $n$, $a _ { 1 } ( S _ { n } - S ) + a _ { 2 } ( S _ { n + 1 } - S ) = 0$ with $a _ { 1 } + a _ { 2 } \neq 0$. If $| \lambda | < 1$, $( S _ { n } )$ converges to $S$. However, it must be noted that this result is true even if $\lambda | > 1$, that is, even if the sequence $( S _ { n } )$ is diverging. In other words, the kernel is the set of sequences such that, for all $n$, |
− | + | \begin{equation} \tag{a1} \frac { S _ { n + 1 } - S } { S _ { n } - S } = \lambda \neq 0,1. \end{equation} | |
− | If the Aitken process is applied to such a sequence, then | + | If the Aitken process is applied to such a sequence, then $T _ { n } = S$ for all $n$. |
− | This result also shows that the Aitken process is an [[Extrapolation algorithm|extrapolation algorithm]]. Indeed, it consists of computing | + | This result also shows that the Aitken process is an [[Extrapolation algorithm|extrapolation algorithm]]. Indeed, it consists of computing $T _ { n }$, $\alpha$ and $\lambda$ such that the interpolation conditions $S _ { n + i } = T _ { n } + \alpha \lambda ^ { n + i }$, $i = 0,1,2$, are satisfied. |
− | ===Convergence of the sequence | + | ===Convergence of the sequence $( T _ { n } )$.=== |
− | If | + | If $( S _ { n } )$ is an arbitrary convergent sequence, the sequence $( T _ { n } )$ obtained by the Aitken process can, in some cases, be non-convergent. Examples are known where $( T _ { n } )$ has two cluster points. However, if the sequence $( T _ { n } )$ converges, then its limit is also $S$, the limit of the sequence $( S _ { n } )$. It can be proved that if there are $a$, $b$ and $N$, $a < 1 < b$, such that $\Delta S _ { n + 1 } / \Delta S _ { n } \notin [ a , b ]$ for all $n \geq N$, then $( T _ { n } )$ converges to $S$. |
===Convergence acceleration properties of the Aitken process.=== | ===Convergence acceleration properties of the Aitken process.=== | ||
− | The problem here is to give conditions on | + | The problem here is to give conditions on $( S _ { n } )$ such that |
− | + | \begin{equation*} \operatorname { lim } _ { n \rightarrow \infty } \frac { T _ { n } - S } { S _ { n } - S } = 0. \end{equation*} | |
− | In that case, | + | In that case, $( T _ { n } )$ is said to converge faster than $( S _ { n } )$ or, in other words, that the Aitken process accelerates the convergence of $( S _ { n } )$. |
− | Intuitively, it is easy to understand that if the sequence | + | Intuitively, it is easy to understand that if the sequence $( S _ { n } )$ is not too far away from a sequence satisfying (a1), then its convergence will be accelerated. Indeed, if there is a $\lambda \neq 1$ such that |
− | + | \begin{equation} \tag{a2} \operatorname { lim } _ { n \rightarrow \infty } \frac { S _ { n + 1 } - S } { S _ { n } - S } = \lambda, \end{equation} | |
− | then | + | then $( T _ { n } )$ will converge faster than $( S _ { n + 1} )$. Sequences satisfying (a2) are called linear. So, the Aitken process accelerates the convergence of the set of linear sequences. Moreover, if, in addition, $\lambda \neq 0$, then $( T _ { n } )$ converges faster than $( S _ { n + 2} )$. The condition $\lambda \neq 0$ is not a very restrictive one, since, when $\lambda = 0$, the sequence $( S _ { n } )$ already converges sufficiently fast and does not need to be accelerated. Note that it is important to prove the acceleration with respect to $( S _ { n+m } )$ with $m$ as large as possible (in general, $m$ corresponds to the last index used in the expression of $T _ { n }$ that is, for the Aitken process $m = 2$) since in certain cases it is possible that $( T _ { n } )$ converges faster than $( S _ { n+m } )$ but not faster than $( S _ { n + m + 1 } )$ for some values of $m$. The Aitken process is optimal for accelerating linear sequences, which means that it is not possible to accelerate the convergence of all linear sequences by a process using less than three successive terms of the sequence, and that the Aitken process is the only process using three terms that is able to do so [[#References|[a2]]]. It is the preceding acceleration result which makes the Aitken $\Delta ^ { 2 }$ process so popular, since many sequences coming out of well-known numerical algorithms satisfy (a2). This is, in particular, the case for the Rayleigh quotient method for computing the dominant eigenvalue of a matrix, for the Bernoulli method for obtaining the dominant zero of a polynomial or for fixed-point iterations with linear convergence. |
− | The Aitken process is also able to accelerate the convergence of some sequences for which | + | The Aitken process is also able to accelerate the convergence of some sequences for which $\lambda = 1$ in (a2). Such sequences are called logarithmic. They converge more slowly than the linear ones and they are the most difficult sequences to accelerate. Note that an algorithm able to accelerate the convergence of all logarithmic sequences cannot exist. |
− | If the Aitken process is applied to the sequence | + | If the Aitken process is applied to the sequence $S _ { n } = \sum _ { i = 0 } ^ { n } c _ { i } t ^ { i }$ of partial sums of a series $f$, then $T _ { n }$ is identical to the Padé approximant $[ n / 1 ]_{ f } ( t )$ (cf. also [[Padé approximation|Padé approximation]]). For example, apply the Aitken process to the sequence $( S _ { n } )$ of partial sums of the series $\operatorname { ln } ( 1 + t ) = t - t ^ { 2 } / 2 + t ^ { 3 } / 3 - \dots$. It is well known that it converges for $- 1 < t \leq 1$. So, for $t = 1$, $10 ^ { 16 }$ terms of the series are needed to obtain $\operatorname { ln } 2$ with a precision of $10 ^ { - 16 }$. Applying the Aitken process to $( S _ { n } )$, then again to $( T _ { n } )$ and so on (a procedure called the iterated $\Delta ^ { 2 }$ process), this precision is achieved with only $23$ terms. Quite similar results can be obtained for $t = 2$, in which case the sequence $( S _ { n } )$ is diverging. |
− | There exist several generalizations of the Aitken | + | There exist several generalizations of the Aitken $\Delta ^ { 2 }$ process for scalar sequences, the most well-known being the Shanks transformation, which is usually implemented via the $\varepsilon$-algorithm of P. Wynn. There are also vector generalizations of the Aitken process, adapted more specifically to the acceleration of vector sequences. |
− | The Aitken process also leads to new methods in [[Numerical analysis|numerical analysis]]. For example, for solving the fixed-point problem | + | The Aitken process also leads to new methods in [[Numerical analysis|numerical analysis]]. For example, for solving the fixed-point problem $x = F ( x )$, consider the following method. It consists in applying the Aitken process to $u_{0},u_{1}$ and $u_2$: |
− | + | \begin{equation*} u _ { 0 } = x _ { n }, \end{equation*} | |
− | + | \begin{equation*} u _ { 1 } = F ( u _ { 0 } ) , u _ { 2 } = F ( u _ { 1 } ), \end{equation*} | |
− | + | \begin{equation*} x _ { n + 1 } = u _ { 0 } - \frac { \Delta u _ { 0 } } { \Delta ^ { 2 } u _ { 0 } }. \end{equation*} | |
− | This is a method due to J.F. Steffensen and its convergence is quadratic (as in the [[Newton method|Newton method]]) under the assumption that | + | This is a method due to J.F. Steffensen and its convergence is quadratic (as in the [[Newton method|Newton method]]) under the assumption that $F ^ { \prime } ( x ) \neq 1$ (same assumption as in Newton's method). |
For all these convergence acceleration methods, see [[#References|[a1]]], [[#References|[a2]]], [[#References|[a3]]], [[#References|[a4]]]. | For all these convergence acceleration methods, see [[#References|[a1]]], [[#References|[a2]]], [[#References|[a3]]], [[#References|[a4]]]. | ||
====References==== | ====References==== | ||
− | <table>< | + | <table> |
+ | <tr><td valign="top">[a1]</td> <td valign="top"> C. Brezinski, M. Redivo Zaglia, "Extrapolation methods. Theory and practice" , North-Holland (1991)</td></tr><tr><td valign="top">[a2]</td> <td valign="top"> J.P. Delahaye, "Sequence transformations" , Springer (1988)</td></tr><tr><td valign="top">[a3]</td> <td valign="top"> G. Walz, "Asymptotics and extrapolation" , Akad. Verlag (1996)</td></tr><tr><td valign="top">[a4]</td> <td valign="top"> J. Wimp, "Sequence transformations and their applications" , Acad. Press (1981)</td></tr> | ||
+ | </table> |
Latest revision as of 20:20, 25 January 2024
One of the most famous methods for accelerating the convergence of a given sequence.
Let $( S _ { n } )$ be a sequence of numbers converging to $S$. The Aitken $\Delta ^ { 2 }$ process consists of transforming $( S _ { n } )$ into the new sequence $( T _ { n } )$ defined, for $n = 0,1 , \dots$, by
\begin{equation*} T _ { n } = \frac { S _ { n } S _ { n + 2 } - S _ { n + 1 } ^ { 2 } } { S _ { n + 2 } - 2 S _ { n + 1 } + S _ { n } } = S _ { n } - \frac { \Delta S _ { n } } { \Delta ^ { 2 } S _ { n } }, \end{equation*}
where $\Delta S _ { n } = S _ { n + 1 } - S _ { n }$ and $\Delta ^ { 2 } S _ { n } = \Delta S _ { n + 1 } - \Delta S _ { n } = S _ { n + 2 } - 2 S _ { n + 1 } + S _ { n }$.
The first formula above is numerically unstable since, when the terms are close to $S$, cancellation arises in the numerator and in the denominator. Of course, such a cancellation also occurs in the second formula, however only in the computation of a correcting term to $S _ { n }$. Thus, cancellation appears as a second-order error and it follows that the second formula is more stable than the first one, which is only used for theoretical purposes.
Such a process was proposed in 1926 by A.C. Aitken, but it can be traced back to Japanese mathematicians of the 17th century.
An important algebraic property of an extrapolation algorithm, such as Aitken's, is its kernel, that is the set of sequences which are transformed into a constant sequence. It is easy to see that the kernel of the Aitken $\Delta ^ { 2 }$ process is the set of sequences of the form $S _ { n } = S + \alpha \lambda ^ { n }$ for $n = 0,1 , \dots$, with $\lambda \neq 0,1$ or, in other words, such that, for all $n$, $a _ { 1 } ( S _ { n } - S ) + a _ { 2 } ( S _ { n + 1 } - S ) = 0$ with $a _ { 1 } + a _ { 2 } \neq 0$. If $| \lambda | < 1$, $( S _ { n } )$ converges to $S$. However, it must be noted that this result is true even if $\lambda | > 1$, that is, even if the sequence $( S _ { n } )$ is diverging. In other words, the kernel is the set of sequences such that, for all $n$,
\begin{equation} \tag{a1} \frac { S _ { n + 1 } - S } { S _ { n } - S } = \lambda \neq 0,1. \end{equation}
If the Aitken process is applied to such a sequence, then $T _ { n } = S$ for all $n$.
This result also shows that the Aitken process is an extrapolation algorithm. Indeed, it consists of computing $T _ { n }$, $\alpha$ and $\lambda$ such that the interpolation conditions $S _ { n + i } = T _ { n } + \alpha \lambda ^ { n + i }$, $i = 0,1,2$, are satisfied.
Convergence of the sequence $( T _ { n } )$.
If $( S _ { n } )$ is an arbitrary convergent sequence, the sequence $( T _ { n } )$ obtained by the Aitken process can, in some cases, be non-convergent. Examples are known where $( T _ { n } )$ has two cluster points. However, if the sequence $( T _ { n } )$ converges, then its limit is also $S$, the limit of the sequence $( S _ { n } )$. It can be proved that if there are $a$, $b$ and $N$, $a < 1 < b$, such that $\Delta S _ { n + 1 } / \Delta S _ { n } \notin [ a , b ]$ for all $n \geq N$, then $( T _ { n } )$ converges to $S$.
Convergence acceleration properties of the Aitken process.
The problem here is to give conditions on $( S _ { n } )$ such that
\begin{equation*} \operatorname { lim } _ { n \rightarrow \infty } \frac { T _ { n } - S } { S _ { n } - S } = 0. \end{equation*}
In that case, $( T _ { n } )$ is said to converge faster than $( S _ { n } )$ or, in other words, that the Aitken process accelerates the convergence of $( S _ { n } )$.
Intuitively, it is easy to understand that if the sequence $( S _ { n } )$ is not too far away from a sequence satisfying (a1), then its convergence will be accelerated. Indeed, if there is a $\lambda \neq 1$ such that
\begin{equation} \tag{a2} \operatorname { lim } _ { n \rightarrow \infty } \frac { S _ { n + 1 } - S } { S _ { n } - S } = \lambda, \end{equation}
then $( T _ { n } )$ will converge faster than $( S _ { n + 1} )$. Sequences satisfying (a2) are called linear. So, the Aitken process accelerates the convergence of the set of linear sequences. Moreover, if, in addition, $\lambda \neq 0$, then $( T _ { n } )$ converges faster than $( S _ { n + 2} )$. The condition $\lambda \neq 0$ is not a very restrictive one, since, when $\lambda = 0$, the sequence $( S _ { n } )$ already converges sufficiently fast and does not need to be accelerated. Note that it is important to prove the acceleration with respect to $( S _ { n+m } )$ with $m$ as large as possible (in general, $m$ corresponds to the last index used in the expression of $T _ { n }$ that is, for the Aitken process $m = 2$) since in certain cases it is possible that $( T _ { n } )$ converges faster than $( S _ { n+m } )$ but not faster than $( S _ { n + m + 1 } )$ for some values of $m$. The Aitken process is optimal for accelerating linear sequences, which means that it is not possible to accelerate the convergence of all linear sequences by a process using less than three successive terms of the sequence, and that the Aitken process is the only process using three terms that is able to do so [a2]. It is the preceding acceleration result which makes the Aitken $\Delta ^ { 2 }$ process so popular, since many sequences coming out of well-known numerical algorithms satisfy (a2). This is, in particular, the case for the Rayleigh quotient method for computing the dominant eigenvalue of a matrix, for the Bernoulli method for obtaining the dominant zero of a polynomial or for fixed-point iterations with linear convergence.
The Aitken process is also able to accelerate the convergence of some sequences for which $\lambda = 1$ in (a2). Such sequences are called logarithmic. They converge more slowly than the linear ones and they are the most difficult sequences to accelerate. Note that an algorithm able to accelerate the convergence of all logarithmic sequences cannot exist.
If the Aitken process is applied to the sequence $S _ { n } = \sum _ { i = 0 } ^ { n } c _ { i } t ^ { i }$ of partial sums of a series $f$, then $T _ { n }$ is identical to the Padé approximant $[ n / 1 ]_{ f } ( t )$ (cf. also Padé approximation). For example, apply the Aitken process to the sequence $( S _ { n } )$ of partial sums of the series $\operatorname { ln } ( 1 + t ) = t - t ^ { 2 } / 2 + t ^ { 3 } / 3 - \dots$. It is well known that it converges for $- 1 < t \leq 1$. So, for $t = 1$, $10 ^ { 16 }$ terms of the series are needed to obtain $\operatorname { ln } 2$ with a precision of $10 ^ { - 16 }$. Applying the Aitken process to $( S _ { n } )$, then again to $( T _ { n } )$ and so on (a procedure called the iterated $\Delta ^ { 2 }$ process), this precision is achieved with only $23$ terms. Quite similar results can be obtained for $t = 2$, in which case the sequence $( S _ { n } )$ is diverging.
There exist several generalizations of the Aitken $\Delta ^ { 2 }$ process for scalar sequences, the most well-known being the Shanks transformation, which is usually implemented via the $\varepsilon$-algorithm of P. Wynn. There are also vector generalizations of the Aitken process, adapted more specifically to the acceleration of vector sequences.
The Aitken process also leads to new methods in numerical analysis. For example, for solving the fixed-point problem $x = F ( x )$, consider the following method. It consists in applying the Aitken process to $u_{0},u_{1}$ and $u_2$:
\begin{equation*} u _ { 0 } = x _ { n }, \end{equation*}
\begin{equation*} u _ { 1 } = F ( u _ { 0 } ) , u _ { 2 } = F ( u _ { 1 } ), \end{equation*}
\begin{equation*} x _ { n + 1 } = u _ { 0 } - \frac { \Delta u _ { 0 } } { \Delta ^ { 2 } u _ { 0 } }. \end{equation*}
This is a method due to J.F. Steffensen and its convergence is quadratic (as in the Newton method) under the assumption that $F ^ { \prime } ( x ) \neq 1$ (same assumption as in Newton's method).
For all these convergence acceleration methods, see [a1], [a2], [a3], [a4].
References
[a1] | C. Brezinski, M. Redivo Zaglia, "Extrapolation methods. Theory and practice" , North-Holland (1991) |
[a2] | J.P. Delahaye, "Sequence transformations" , Springer (1988) |
[a3] | G. Walz, "Asymptotics and extrapolation" , Akad. Verlag (1996) |
[a4] | J. Wimp, "Sequence transformations and their applications" , Acad. Press (1981) |
Aitken Delta^2 process. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Aitken_Delta%5E2_process&oldid=15250