A priori and a posteriori bounds in matrix computations
Three sources of errors are behind the lack of accuracy in numerical computations: data errors associated with the physical model, truncation errors when series are truncated to a number of terms, and rounding errors resulting from finite machine precision. Therefore, for the equation $ f ( a,x ) = 0 $
in which $ a $
is a parameter, if an error occurs in $ a $,
the algorithm returns $ {\widehat{x} } $
and not $ x $.
How much the computed solution $ {\widehat{x} } $
differs from $ x $
is an indication of the computational accuracy of the result. Since $ x $
is unknown, the norm $ | { {\widehat{x} } - x } | $,
taken as an indication of the accuracy, cannot be calculated, and numerical analysts have devised bounds for the latter in terms of the variation $ \delta a $
in $ a $.
These bounds are of two types: a priori bounds and a posteriori bounds. The first are applied prior to computation and are usually poor in nature, whereas the second use information extracted from a computation, and are more indicative. In this article, attention is focussed on the most widely available bounds which can be implemented to estimate the accuracy of results and to check the stability of numerical algorithms (cf. also Stability of a computational process).
Linear matrix equations $ Ax = b $, $ A \in \mathbf R ^ {n \times n } $, $ { \mathop{\rm det} } ( A ) \neq 0 $.
Assuming that the computed solution $ {\widehat{x} } $ is equal to $ x + \delta x $, where $ x = A ^ {- 1 } b $ and $ \delta x $ is the error in the solution resulting from a perturbation $ \delta A $ in $ A $ and $ \delta b $ in $ b $, the perturbed problem
$$ \tag{a1 } ( A + \delta A ) ( x + \delta x ) = b + \delta b $$
implies, by cancelling equal terms,
$$ \tag{a2 } \delta x = A ^ {- 1 } ( - \delta Ax - \delta A \delta x + \delta b ) . $$
It follows that for a consistent matrix-vector norm [a1]
$$ \tag{a3 } \left \| {\delta x } \right \| \leq \left \| {A ^ {- 1 } \delta A } \right \| \left \| x \right \| + \left \| {A ^ {- 1 } \delta A } \right \| \left \| {\delta x } \right \| + \left \| {A ^ {- 1 } \delta b } \right \| , $$
which implies that under the conditions $ \| {A ^ {- 1 } \delta A } \| < 1 $,
$$ \tag{a4 } { \frac{\left \| {\delta x } \right \| }{\left \| x \right \| } } \leq { \frac{\left \| {A ^ {- 1 } \delta A } \right \| + { \frac{\left \| {A ^ {- 1 } \delta b } \right \| }{\left \| x \right \| } } }{1 - \left \| {A ^ {- 1 } \delta A } \right \| } } . $$
Setting $ k ( A ) = \| {A ^ {- 1 } } \| \| A \| $, from $ \| b \| \leq \| A \| \| x \| $ one finds
$$ \tag{a5 } { \frac{\delta x }{\left \| x \right \| } } \leq { \frac{k ( A ) }{1 - k ( A ) { \frac{\left \| {\delta A } \right \| }{\left \| A \right \| } } } } \left ( { \frac{\left \| {\delta A } \right \| }{\left \| A \right \| } } + { \frac{\left \| {\delta b } \right \| }{\left \| b \right \| } } \right ) , $$
which measures the relative error in the solution $ x $ resulting from the errors $ \delta A $ and $ \delta b $ in the data. Thus, the factor $ k ( A ) $, called the condition number of $ A $ is the main factor affecting computational accuracy, since by setting $ \| {\delta A } \| \leq \epsilon \| A \| $, $ \| {\delta b } \| \leq \epsilon \| b \| $, where $ \epsilon $ is the relative error in the data, it follows that
$$ \tag{a6 } k ( A ) = {\lim\limits } _ {\epsilon \rightarrow0 } { \frac{1} \epsilon } { \frac{\left \| {\delta x } \right \| }{\left \| x \right \| } } , $$
i.e., representing the sensitivity of the solution to relative changes $ \epsilon $ in either $ A $ or $ b $, that is, $ { {\| {\delta x } \| } / {\| x \| } } \approx \epsilon \cdot k ( A ) $. So, if $ k ( A ) = 10 ^ {p} $ and $ \epsilon = 5 \cdot 10 ^ {- t } $, where $ t $ is the machine mantissa, the number of significant digits in $ {\widehat{x} } $ becomes $ t - p $.
The above analysis lacks accuracy, for the error in $ \| A \| $ is not equal to $ \epsilon \| A \| $, since other factors intervene, one of which is the growth factor in the elements associated with the interchange strategy as well as the size of $ \| A \| $, yet it provides an a priori estimate for the expected accuracy. Before the late J. Wilkinson it was thought that rounding errors can ruin the solution completely; he showed than an upper bound exists for $ \| {\delta A } \| $ and that fear of obtaining poor results is unwarranted [a2].
On the contrary, one can also reach total machine accuracy. Consider, for example, the matrix
$$ A = \left ( \begin{array}{cc} 10 ^ {5} & 0 \\ 0 &10 ^ {- 5 } \\ \end{array} \right ) , $$
$ \| A \| = 10 ^ {5} $, $ \| {A ^ {- 1 } } \| = 10 ^ {5} $, and $ k ( A ) = 10 ^ {10 } $, so one expects to loose all significant digits on a $ 10 $- digit machine, contrary to the fact that for any right-hand side $ b $, full accuracy in this situation is reached. This led R. Skeel [a3] to consider componentwise estimates of $ \delta x $. For $ | {\delta A } | \leq \epsilon | A | $, $ | {\delta b } | \leq \epsilon | b | $, where $ | \cdot | $ stands for the modulus of the vector taken componentwise, one obtains from (a2):
$$ \tag{a7 } \left | {\delta x } \right | \leq \left | {A ^ {- 1 } \delta A } \right | \left | x \right | + \left | {A ^ {- 1 } \delta A } \right | \left | {\delta x } \right | + \left | {A ^ {- 1 } \delta b } \right | , $$
giving for $ \rho ( | {A ^ {- 1 } \delta A } | ) < 1 $, where $ \rho $ is the spectral radius, that
$$ \tag{a8 } \left | {\delta x } \right | \leq \left ( I - \left | {A ^ {- 1 } \delta A } \right | \right ) ^ {- 1 } \left ( \left | {A ^ {- 1 } \delta A } \right | \left | x \right | + \left | {A ^ {- 1 } \delta b } \right | \right ) . $$
For relative perturbations $ \epsilon $ in $ A $ and $ b $, this implies that
$$ \tag{a9 } { \frac{\left \| {\delta x } \right \| }{\left \| x \right \| } } \leq { \frac{2 \epsilon \left \| { \left | {A ^ {- 1 } } \right | \left | A \right | } \right \| }{1 - \epsilon \left \| { \left | {A ^ {- 1 } } \right | \left | A \right | } \right \| } } , $$
giving
$$ \tag{a10 } k ( A ) = \left \| {\left | {A ^ {- 1 } } \right | \left | A \right | } \right \| , $$
which is a better sensitivity criterion. For the above $ ( 2 \times2 ) $- matrix one has $ k ( A ) = 1 $, as expected.
However, a practical bound for the error in $ {\widehat{x} } $ relies on information extracted from computation, since it shows the true state of affairs. If $ r = A {\widehat{x} } - b $ is the residual error of the true solution, then by writing $ r = A ( x + \delta x ) - b = A \delta x $ one finds $ \delta x = A ^ {- 1 } r $, hence the a posteriori bound
$$ \tag{a11 } { \frac{\left \| {\delta x } \right \| }{\left \| x \right \| } } \leq \left \| {A ^ {- 1 } } \right \| { \frac{\left \| r \right \| }{\left \| x \right \| } } \leq k ( A ) { \frac{\left \| r \right \| }{\left \| b \right \| } } . $$
This shows that $ k ( A ) $ intervenes also here and that $ \| r \| $ can be found very small, yet $ {\widehat{x} } $ is totally inaccurate as a result of the large $ k ( A ) $, [a4]. The above bound can still be improved by noticing that since $ A ^ {- 1 } $ is unknown and only an approximate inverse matrix $ B $ of $ A $ is known, it can be written in the practical form
$$ \tag{a12 } \left \| {\delta x } \right \| = \left \| {A ^ {- 1 } B ^ {- 1 } Br } \right \| = $$
$$ = \left \| {( I - ( I - BA ) ) ^ {- 1 } Br } \right \| \leq $$
$$ \leq { \frac{\left \| {Br } \right \| }{1 - \left \| {I - BA } \right \| } } , $$
provided that $ \| {I - BA } \| < 1 $.
Although scientists have elaborated on various bounds for estimating the accuracy of their results, the situation is not so bleak, for the above bounds are not the end of the story. After all, $ {\widehat{x} } $ can be ameliorated by a technique called the method of iterative refinement, very similar to Newton's method for solving non-linear equations (cf. Newton method). Today it is an available facility in most packages (Linpack, Lapack, etc.) and runs as follows:
a) compute $ r = A {\widehat{x} } - b $;
b) solve $ A \delta x = r $;
c) $ {\widehat{x} } _ {\textrm{ new } } = {\widehat{x} } - \delta x $, re-do from a). The algorithm converges very rapidly and $ {\widehat{x} } \rightarrow x $.
Finally, accuracy of $ {\widehat{x} } $ is not the only issue which matters to users. There is also the question of stability of the algorithms. By the stability of $ {\widehat{x} } $ one means that it satisfies the perturbed problem
$$ \tag{a13 } ( A + \delta A ) {\widehat{x} } = b + \delta b, $$
in which $ \delta A $ and $ \delta b $ are of the order of the machine allowed uncertainty. An algorithm which returns $ {\widehat{x} } $ such that its backward errors $ \delta A $ and $ \delta b $ are larger than what is anticipated in terms of the machine eps, is unstable. This is checked using the Oettli–Prager criterion [a5]
$$ \tag{a14 } \left | {A {\widehat{x} } - b } \right | \leq \Delta A \left | { {\widehat{x} } } \right | + \Delta b, $$
which has been derived for interval equations $ [ A \pm \Delta A ] {\widehat{x} } = [ b \pm \Delta b ] $ but is also suitable for systems subject to uncertainties of the form $ \Delta A = \epsilon | A | $ and $ \Delta b = \epsilon | b | $. Thus follows the stability criterion $ | r | \leq \epsilon ( | A | | { {\widehat{x} } } | + | b | ) $; a result implemented in most available packages by what is called a performance index. (For instance, in the IMSL it is given by
$$ \tag{a15 } p = \max _ {1 \leq i \leq n } { \frac{\left | {b _ {i} - \sum _ {j = 1 } ^ { n } a _ {ij } {\widehat{x} } _ {j} } \right | }{BN + AN \cdot \sum _ {j = 1 } ^ { n } \left | { {\widehat{x} } _ {j} } \right | } } , $$
where $ BN = \max _ {1 \leq i \leq n } | {b _ {i} } | $, $ AN = \max _ {1 \leq i,j \leq n } | {a _ {ij } } | $.) If $ p < \textrm{ machine eps } $, the solution is stable [a4]. The backward error can be given by $ \delta A = - H \cdot | A | \cdot { \mathop{\rm diag} } ( { \mathop{\rm sgn} } ( {\widehat{x} } _ {i} ) ) $, $ \delta b = H \cdot | b | $, where $ H $ is a diagonal matrix and $ h _ {ii } = { {r _ {i} } / {( | A | | { {\widehat{x} } } | + | b | ) _ {i} } } $, with $ | {h _ {ii } } | < \textrm{ machine eps } $.
Linear equations $ Ax = b $, $ A \in \mathbf R ^ {m \times n } $.
Independent of whether the equations have a unique solution (as in the previous section), more than one solution or no solution (only a least-squares solution), the general solution $ x $ is given by
$$ \tag{a16 } x = A ^ {+} b + ( I - A ^ {+} A ) c, $$
where $ c $ is an arbitrary vector and $ A ^ {+} $ is the Penrose–Moore inverse of $ A $, satisfying $ AA ^ {+} A = A $, $ A ^ {+} AA ^ {+} = A ^ {+} $, $ ( A ^ {+} A ) ^ {T} = A ^ {+} A $ and $ ( AA ^ {+} ) ^ {T} = AA ^ {+} $. For a consistent set of equations the residual $ Ax - b = ( AA ^ {+} - I ) b = 0 $, is a special case. The minimal least-squares solution becomes, therefore, $ x = A ^ {+} b $. Unlike in the previous section, if $ A $ undergoes a perturbation $ \epsilon A _ {1} $, in general $ ( A + \epsilon A _ {1} ) ^ {+} $ cannot be expanded into a Taylor series in $ \epsilon $; it does have a Laurent expansion [a4]. For acute perturbations ( $ { \mathop{\rm rank} } ( A ) = { \mathop{\rm rank} } ( A + \epsilon A _ {1} ) $), from [a6] one finds that
$$ \tag{a17 } \left \| {( A + \delta A ) ^ {+} } \right \| _ {2} \leq { \frac{\left \| {A ^ {+} } \right \| _ {2} }{1 - \left \| {A ^ {+} } \right \| _ {2} \left \| {\delta A } \right \| _ {2} } } $$
if
$$ \left \| {( A + \delta A ) ^ {+} } \right \| _ {2} = { \frac{1}{\sigma _ {r} ( A + \delta A ) } } , $$
$$ \left \| {A ^ {+} } \right \| _ {2} = { \frac{1}{\sigma _ {r} ( A ) } } , $$
$$ \sigma _ {i} ( A ) - \sigma _ {1} ( \delta A ) \leq \sigma _ {i} ( A + \delta A ) \leq \sigma _ {i} ( A ) + \sigma _ {i} ( \delta A ) , $$
$$ i = 1 \dots r, $$
with $ \sigma _ {1} \geq \dots \geq \sigma _ {r} $ the singular values of $ A $. P. Wedin also showed that for any $ A $ and $ \delta A $,
$$ \tag{a18 } { \frac{\left \| {( A + \delta A ) ^ {+} - A ^ {+} } \right \| }{\left \| {( A + \delta A ) ^ {+} } \right \| _ {2} } } \leq \mu \left \| {A ^ {+} } \right \| _ {2} \left \| {\delta A } \right \| _ {2} , $$
where $ \mu $ has tabulated values for various cases of $ { \mathop{\rm rank} } ( A ) $ in relation to $ m $ and $ n $. It therefore follows that the bound for the relative error in $ A ^ {+} $ is given by
$$ \tag{a19 } { \frac{\left \| {( A + \delta A ) ^ {+} - A ^ {+} } \right \| }{\left \| {A ^ {+} } \right \| _ {2} } } \leq \mu { \frac{k ( A ) { \frac{\left \| {\delta A } \right \| _ {2} }{\left \| A \right \| _ {2} } } }{1 - k ( A ) { \frac{\left \| {\delta A } \right \| _ {2} }{\left \| A \right \| _ {2} } } } } , $$
where $ k ( A ) = \| A \| _ {2} \| {A ^ {+} } \| _ {2} $ is the spectral condition number, which measures the sensitivity of $ A ^ {+} $ to acute perturbations in $ A $. To obtain a bound for $ { {\| {\delta x } \| } / {\| x \| } } $ one starts with $ x + \delta x = ( A + \delta A ) ^ {+} ( b + \delta b ) $ to find that
$$ \tag{a20 } \delta x = \left [ ( A + \delta A ) ^ {+} - A ^ {+} \right ] b + ( A + \delta A ) ^ {+} \delta b. $$
Using a decomposition theorem [a6] for $ ( A + \delta A ) ^ {+} - A ^ {+} $, one finds that [a7]
$$ \tag{a21 } { \frac{\left \| {\delta x } \right \| _ {2} }{\left \| x \right \| _ {2} } } \leq {\widehat{k} } \left [ ( 2 + \eta {\widehat{k} } ) \alpha + \beta \gamma \right ] , $$
where
$$ \tag{a22 } \alpha = { \frac{\left \| {\delta A } \right \| _ {2} }{\left \| A \right \| _ {2} } } , \quad {\widehat{k} } = { \frac{k ( A ) }{1 - \alpha k ( A ) } } , $$
$$ \eta = { \frac{\left \| r \right \| _ {2} }{\left \| A \right \| _ {2} \left \| x \right \| _ {2} } } , \quad \beta = { \frac{\left \| {\delta b } \right \| _ {2} }{\left \| b \right \| _ {2} } } , \quad \gamma = { \frac{\left \| b \right \| _ {2} }{\left \| A \right \| _ {2} \left \| x \right \| _ {2} } } . $$
Note that the error is dominated by $ {\widehat{k} } ^ {2} ( A ) $, unlike the bound in (a19), which is dominated by $ {\widehat{k} } ( A ) $ only. For the special case $ { \mathop{\rm rank} } ( A ) = n < m $ the above expression becomes $ {\widehat{k} } [ ( 1 + \eta {\widehat{k} } ) \alpha + \beta \gamma ] $. For $ { \mathop{\rm rank} } ( A ) = m < n $ it becomes $ {\widehat{k} } ( 2 \alpha + \beta ) $. Finally, for $ { \mathop{\rm rank} } ( A ) = m = n $ it reduces to $ {\widehat{k} } ( \alpha + \beta ) $, which is the situation of the previous section. It should be noted that the above bounds are valid for acute perturbations and that the situation worsens if $ \delta A $ increases the rank of $ A $, in which case the accuracy of $ A ^ {+} $ deteriorates further as
$$ \left \| {( A + \delta A ) ^ {+} } \right \| _ {2} > { \frac{1}{\left \| {\delta A } \right \| _ {2} } } . $$
Again one can follow an iterative scheme of refinement similar to the one mentioned in the previous section to improve on $ A ^ {+} $. It runs as follows for the full rank situation:
a) $ R = A ^ {T} - A ^ {T} AB $;
b) $ \delta A ^ {+} = - BB ^ {T} R $;
c) $ A ^ {+} _ {\textrm{ new } } = B - \delta A ^ {+} $. The algorithm converges and $ B \rightarrow ( A ^ {T} A ) ^ {- 1 } A ^ {T} $.
The eigenvalue problem $ Ax = \lambda x $, $ A \in \mathbf R ^ {n \times n } $.
Unlike in the previous sections, a perturbation $ \delta A $ in $ A $ affects both the eigenvalue $ \lambda $ and its corresponding eigenvector $ x $. One of the earliest a priori bounds for $ | { {\widehat \lambda } - \lambda } | $, where $ {\widehat \lambda } $ is the eigenvalue of $ A + \delta A $, is the Bauer–Fike theorem [a8]. It states that the eigenvalues of $ A + \delta A $ lie in the union of the discs
$$ \tag{a23 } M _ {i} = \left \{ z : {\left | {z - \lambda _ {i} } \right | \leq \left \| {T ^ {- 1 } } \right \| \left \| T \right \| \left \| {\delta A } \right \| } \right \} , $$
$$ i = 1 \dots n, $$
where $ T $ is the modal matrix of $ A $( $ T ^ {- 1 } AT = \Lambda $) assuming $ A $ to be non-defective. The bound follows upon noticing that since
$$ {\widehat \lambda } I - A - \delta A = ( {\widehat \lambda } I - A ) \left [ I - ( {\widehat \lambda } I - A ) ^ {- 1 } \delta A \right ] $$
is singular, one has $ \| {( {\widehat \lambda } I - A ) ^ {- 1 } \delta A } \| > 1 $. Writing
$$ ( {\widehat \lambda } I - A ) ^ {- 1 } = T ( {\widehat \lambda } I - \Lambda ) ^ {- 1 } T ^ {- 1 } , $$
one finds that
$$ 1 \leq \left \| {T ( {\widehat \lambda } I - \Lambda ) ^ {- 1 } T ^ {- 1 } \delta A } \right \| \leq $$
$$ \leq \left \| T \right \| \left \| {T ^ {- 1 } } \right \| \left \| {\delta A } \right \| { \frac{1}{ { \mathop{\rm min} } _ { i } \left | { {\widehat \lambda } - \lambda _ {i} } \right | } } , $$
which implies (a23). The condition number to the problem is therefore of the order of $ k ( T ) = \| T \| \| {T ^ {- 1 } } \| $. Indeed, if the above discs are isolated, then $ { \mathop{\rm min} } _ {i} | { {\widehat \lambda } - \lambda _ {i} } | = | {\delta \lambda _ {i} } | $ and $ | {\delta \lambda _ {i} } | \leq k ( T ) \| {\delta A } \| $. However, unless $ A $ is a normal matrix ( $ \| T \| _ {2} = \| {T ^ {- 1 } } \| _ {2} = 1 $), the bound in (a23) is known to yield pessimistic results. A more realistic criterion can be obtained. As $ I - ( {\widehat \lambda } I - A ) ^ {- 1 } \delta A $ is singular, then for a vector $ z \neq 0 $,
$$ z \leq \left | {( {\widehat \lambda } I - \Lambda ) ^ {- 1 } } \right | \left | {T ^ {- 1 } } \right | \left | {\delta A } \right | \left | T \right | \left | z \right | , $$
implying that [a9]
$$ \tag{a24 } { \mathop{\rm min} } _ { i } \left | { {\widehat \lambda } - \lambda _ {i} } \right | \leq \rho ( \left | {T ^ {- 1 } } \right | \left | {\delta A } \right | \left | T \right | ) , $$
which is tighter than (a23) for non-normal matrices. Moreover, (a24) is scale-invariant under the transformation $ A = DBD ^ {- 1 } $, where $ D $ is diagonal.
Both the Bauer–Fike bound and the above are valid for the whole spectrum; thus, a group of ill-conditioned eigenvalues affects the bound for the well-conditioned ones. This led Wilkinson to introduce his $ {1 / {s _ {i} } } $ factors, measuring the sensitivity of the $ i $ th eigenvalue alone. From first-order perturbation theory one finds, [a1],
$$ \tag{a25 } \delta \lambda _ {i} \approx { \frac{y ^ {i ^ {*} } \delta Ax ^ {i} }{y ^ {i ^ {*} } x ^ {i} } } , $$
where $ x ^ {i} $ and $ y ^ {i ^ {*} } $ are the eigenvector and eigenrow associated with a simple $ \lambda _ {i} $. So, normalizing both of them, $ {1 / {| {y ^ {i ^ {*} } x ^ {i} } | } } $ becomes the Wilkinson factor and is equal to $ { {| {\delta \lambda _ {i} } | } / {\| {\delta A } \| } } $. However, for a big $ \| {\delta A } \| $( a25) does not bound the true shift in $ \lambda _ {i} $. For this one uses a second Bauer–Fike theorem: If
$$ \delta A \leq { \frac{1}{n} } { \mathop{\rm min} } _ {k \neq i } { \frac{\left | {\lambda _ {i} - \lambda _ {k} } \right | }{\left \| {E _ {i} } \right \| + \left \| {E _ {k} } \right \| } } , $$
where $ E _ {i} $ is the eigenprojection of $ A $( i.e., $ E _ {i} = x ^ {i} y ^ {i ^ {*} } $), then [a8]
$$ \tag{a26 } \left | {\delta \lambda _ {i} } \right | \leq { \frac{\left \| {E _ {i} } \right \| \left \| {\delta A } \right \| }{1 - \left \| {\delta A } \right \| \sum _ {k \neq i } { \frac{\left \| {E _ {i} } \right \| + \left \| {E _ {k} } \right \| }{\left | {\lambda _ {i} - \lambda _ {k} } \right | } } } } . $$
It follows that $ \| {E _ {i} } \| $ represents the sensitivity of $ \lambda _ {i} $ to changes in $ A $. One can also show that [a10]
$$ \tag{a27 } { \frac{\left \| { {\widehat{x} } ^ {i} - x ^ {i} } \right \| }{\left \| {x ^ {i} } \right \| } } \leq { \frac \psi { { \mathop{\rm min} } _ {j \neq i } \left | {\lambda _ {i} - \lambda _ {j} } \right | - 2 \psi } } , $$
where $ \psi $ bounds the shifts in the eigenvalues. So, the shift in $ x ^ {i} $ is dependent on the separation of the $ i $ th eigenvalue from the nearest $ \lambda _ {j} $. Therefore, a matrix $ A $ has an ill-conditioned eigenvalue problem if $ { \mathop{\rm min} } _ {j \neq i } | {\lambda _ {i} - \lambda _ {j} } | < 2 \psi $ or if the discs containing $ {\widehat \lambda } _ {i} $, $ i = 1 \dots n $, intersect as a result of poor separation.
To obtain an a posteriori bound for the error in $ \lambda $ in terms of the residual one forms $ r = A {\widehat{x} } - {\widehat \lambda } {\widehat{x} } $, where $ ( {\widehat \lambda } , {\widehat{x} } ) $ is an approximate eigenpair. Then $ {\widehat{x} } = T ( \Lambda - {\widehat \lambda } I ) ^ {- 1 } T ^ {- 1 } r $, implying
$$ \left \| { {\widehat{x} } } \right \| \leq \left \| T \right \| \left \| {T ^ {- 1 } } \right \| \left \| r \right \| { \frac{1}{ { \mathop{\rm min} } _ { i } \left | { {\widehat \lambda } - \lambda _ {i} } \right | } } , $$
and [a2]
$$ \tag{a28 } { \mathop{\rm min} } _ { i } \left | { {\widehat \lambda } - \lambda _ {i} } \right | \leq k ( T ) { \frac{\left \| r \right \| }{\left \| { {\widehat{x} } } \right \| } } . $$
For symmetric matrices $ k _ {2} ( T ) = 1 $; thus, these have well-conditioned eigenproblems. For the non-symmetric case a powerful technique of H. Wielandt [a1] can update $ {\widehat{x} } $ by solving iteratively the linear equations
$$ \tag{a29 } ( A - {\widehat \lambda } I ) x ^ {( i + 1 ) } = x ^ {( i ) } , \quad i = 1 \dots n, $$
$$ x ^ {1} = {\widehat{x} } , $$
and $ x ^ {( i ) } \rightarrow x $ very rapidly.
To check the stability of the solution which satisfies $ ( A + \delta A ) {\widehat{x} } = {\widehat \lambda } {\widehat{x} } $, i.e., $ A {\widehat{x} } - {\widehat \lambda } {\widehat{x} } = - \delta A {\widehat{x} } $, one uses the criterion
$$ \tag{a30 } \left \| r \right \| = \left \| {A {\widehat{x} } - {\widehat \lambda } {\widehat{x} } } \right \| _ {2} \leq \epsilon \left \| A \right \| _ {2} \left \| { {\widehat{x} } } \right \| _ {2} , $$
which asserts that the system is backward stable and there exists a matrix $ \delta A = - { {r {\widehat{x} } ^ {*} } / {\| { {\widehat{x} } } \| _ {2} ^ {2} } } $ such that $ ( {\widehat \lambda } , {\widehat{x} } ) $ is an exact eigenpair of $ A + \delta A $. One can also construct the backwards error in the form $ \delta A = - H \cdot | A | \cdot { \mathop{\rm diag} } ( { \mathop{\rm sgn} } ( {\widehat{x} } _ {i} ) ) $, where $ | {h _ {ii } } | \leq \textrm{ machine eps } $, satisfying exactly $ ( A + \delta A ) {\widehat{x} } = {\widehat \lambda } {\widehat{x} } $ by letting $ r = H \cdot | A | \cdot | { {\widehat{x} } } | $. The above theory is valid for semi-simple eigenvalues only.
The case when $ A $ is defective (cf. Defective value) is more complicated, since $ {\widehat \lambda } $, for a perturbation $ \epsilon A _ {1} $ in $ A $, has a Puiseux expansion (cf. Puiseux series) in $ \epsilon $ in powers of $ {1 / m } $, where $ m $ is the multiplicity of $ \lambda $, namely [a1],
$$ \tag{a31 } {\widehat \lambda } = \lambda + \epsilon ^ { {1 / m } } \lambda _ {1} + \epsilon ^ { {2 / m } } \lambda _ {2} + \dots . $$
For an eigenvalue with index $ r $( the index is the size of the largest Jordan block) a bound for the shift in $ \lambda $ is given by [a10]
$$ \tag{a32 } { \mathop{\rm min} } _ { i } \left | { {\widehat \lambda } - \lambda _ {i} } \right | \leq \max \{ r \psi,r ^ { {1 / r } } \psi ^ { {1 / r } } \} . $$
Other bounds.
Modifications of the above cases are often encountered. For instance, the Sylvester equation
$$ \tag{a33 } AX + XB = C $$
and its special case $ B = A ^ {T} $, famous in control theory, can be transformed to the case of the first section by considering
$$ \tag{a34 } ( A \otimes I + I \otimes B ^ {T} ) { \mathop{\\vec{rm}t} } ( X ) = { \mathop{\\vec{rm}t} } ( C ) , $$
and a relative perturbation $ \epsilon $ in $ A $, $ B $, $ C $ yields $ {\widehat{X} } $ with an error [a13]
$$ \tag{a35 } { \frac{\left \| {\delta X } \right \| }{\left \| X \right \| } } \leq { \frac{\epsilon \cdot k ( A,B ) }{1 - \epsilon \cdot k ( A,B ) } } , $$
where $ k ( A,B ) $ is the condition number for the problem.
Also, transforming the eigenvalue problem
$$ \tag{a36 } ( A _ {n} \lambda ^ {n} + A _ {n - 1 } \lambda ^ {n - 1 } + \dots + A _ {0} ) x = 0, $$
in which $ A _ {i} \in \mathbf R ^ {n \times n } $, $ i = 0 \dots n - 1 $, into a generalized eigenvalue problem $ Cu = \lambda Bu $, where $ C $ is the companion matrix for (a36), while using the decomposition $ ( {\widehat \lambda } B - C ) ^ {- 1 } = P ( {\widehat \lambda } I - \Lambda ) ^ {- 1 } Q $, [a1], one obtains [a10]
$$ \tag{a37 } \left | { {\widehat \lambda } - \lambda } \right | \leq { \frac{\left \| P \right \| \left \| Q \right \| ( \left | \lambda \right | \left \| {\delta B } \right \| + \left \| {\delta C } \right \| ) }{1 - \left \| P \right \| \left \| Q \right \| \left \| {\delta B } \right \| } } , $$
which generalizes (a23).
Many bounds exist for various matrix functions, for instance [a1]
$$ \tag{a38 } \left \| {e ^ {A + \delta A } - e ^ {A} } \right \| \leq k ( T ) \cdot \left \| W \right \| , $$
where
$$ w _ {ij } = \left [ { {( e ^ {\lambda _ {i} } - e ^ {\lambda _ {j} } ) } / {( \lambda _ {i} - \lambda _ {j} ) } } \right ] \left \langle {y ^ {i} , \delta Ax ^ {j} } \right \rangle . $$
For a general matrix function $ F ( A ) $ approximated by another one $ G ( A ) $ it can be shown that [a14]
$$ \tag{a39 } \left \| {F ( A ) - G ( A ) } \right \| \leq $$
$$ \leq k ( T ) \max _ {1 \leq r \leq m - 1, 1 \leq i \leq p } { \frac{\left | {f ^ {( r ) } ( \lambda _ {i} ) - g ^ {( r ) } ( \lambda _ {i} ) } \right | }{r! } } m _ {i} , $$
where $ f $ and $ g $ are scalar functions applied to $ \lambda _ {i} $ in the sense that if $ F ( A ) = e ^ {A} $, then $ f ( \lambda ) = e ^ \lambda $, where $ m _ {i} $ is the size of the largest Jordan block associated with $ f ( \lambda _ {i} ) $, and $ f ^ {( r ) } ( \lambda ) $ is the $ r $- th derivative of $ f ( \lambda ) $.
References
[a1] | A. Deif, "Advanced matrix theory for scientists and engineers" , Gordon&Breach (1991) (Edition: Second) |
[a2] | J. Wilkinson, "The algebraic eigenvalue problem" , Clarendon Press (1965) |
[a3] | R. Skeel, "Scaling for numerical stability in Gaussian elimination" J. Assoc. Comput. Mach. , 26 (1979) pp. 494–526 |
[a4] | A. Deif, "Sensitivity analysis in linear systems" , Springer (1986) |
[a5] | W. Oettli, W. Prager, "Compatibility of approximate solution of linear equations with given error bounds for coefficients and right hand sides" Numer. Math. , 6 (1964) pp. 405–409 |
[a6] | P. Wedin, "Perturbation theory for pseudo-inverses" BIT , 13 (1973) pp. 217–232 |
[a7] | C. Lawson, R. Hanson, "Solving least-squares problems" , Prentice-Hall (1974) |
[a8] | F. Bauer, C. Fike, "Norms and exclusion theorems" Numer. Math. , 2 (1960) pp. 137–141 |
[a9] | A. Deif, "Realistic a priori and a posteriori bounds for computed eigenvalues" IMA J. Numer. Anal. , 9 (1990) pp. 323–329 |
[a10] | A. Deif, "Rigorous perturbation bounds for eigenvalues and eigenvectors of a matrix" J. Comput. Appl. Math. , 57 (1995) pp. 403–412 |
[a11] | G. Stewart, "Introduction to matrix computations" , Acad. Press (1973) |
[a12] | G. Stewart, J. Sun, "Matrix perturbation theory" , Acad. Press (1990) |
[a13] | A. Deif, N. Seif, S. Hussein, "Sylvester's equation: accuracy and computational stability" J. Comput. Appl. Math. , 61 (1995) pp. 1–11 |
[a14] | G. Golub, C. van Loan, "Matrix computations" , John Hopkins Univ. Press (1983) |
A priori and a posteriori bounds in matrix computations. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=A_priori_and_a_posteriori_bounds_in_matrix_computations&oldid=45208