Bauer-Fike theorem
As popularized in most texts on computational linear algebra or numerical methods, the Bauer–Fike theorem is a theorem on the perturbation of eigenvalues of a diagonalizable matrix. However, it is actually just one theorem out of a small collection of theorems on the localization of eigenvalues within small regions of the complex plane [a1].
A vector norm is a functional $\| \cdot \|$ on vectors $x$ which satisfies three properties:
1) $\| x \| \geq 0$, and $\| x \| = 0$ if and only if $x \equiv 0$.
2) $\| \alpha x \| = | \alpha | \, \| x \|$ for any scalar $\alpha$.
3) $\| x + y \| \leq \| x \| + \| y \|$ (the triangle inequality). A matrix norm $\| A \|$ satisfies the above three properties plus one more:
4) $\| A B \| \leq \| A \| \cdot \| B \|$ (the submultiplicative inequality). An operator norm is a matrix norm derived from a related vector norm:
5) $\| A \| = \operatorname { max } _ { x \neq 0 } \| A x \| / \| x \| = \operatorname { max } _ { \| x \| =1 } \| A x \|$. The most common vector norms are:
\begin{equation*} \| x \| _ { 1 } = \sum _ { i } | x_i |, \end{equation*}
\begin{equation*} \| x \| _ { 2 } = \left( \sum _ { i } | x _ { i } | ^ { 2 } \right) ^ { 1 / 2 } , \| x \| _ { \infty } = \operatorname { max } _ { i } | x _ { i } |, \end{equation*}
where $x_{i}$ denotes the $i$th entry of the vector $x$. The operator norms derived from these are, respectively:
\begin{equation*} \| A \| _ { 1 } = \operatorname { max } _ { j } \sum _ { i } | a _ { i j } |, \end{equation*}
\begin{equation*} \| A \| _ { 2 } = \text { largest singular value of } A ,\, \| A \| _ { \infty } = \operatorname { max } _ { i } \sum _ { j } | a _ { i j } |, \end{equation*}
where $a _ { i j }$ denotes the $( i , j )$th entry of the matrix $A$. In particular, $\| A \| _ { \infty }$ is the "maximum absolute row sum" and $\| A \| _ { 1 }$ is the "maximum absolute column sum". For details, see, e.g., [a2], Sec. 2.2–2.3.
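As a concrete illustration (this numerical example is not part of the original article), consider the matrix
\begin{equation*} A = \begin{pmatrix} 1 & -2 \\ 3 & 4 \end{pmatrix} . \end{equation*}
Its maximum absolute column sum is $| -2 | + | 4 | = 6$ and its maximum absolute row sum is $| 3 | + | 4 | = 7$, so $\| A \| _ { 1 } = 6$ and $\| A \| _ { \infty } = 7$, while $\| A \| _ { 2 } = \sqrt { 15 + 5 \sqrt { 5 } } \approx 5.12$, the largest singular value of $A$.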
Below, the notation $\| \cdot \|$ will denote a vector norm when applied to a vector and an operator norm when applied to a matrix.
Let $A$ be a real or complex $( n \times n )$-matrix that is diagonalizable (i.e. it has a complete set of eigenvectors $v _ { 1 } , \dots , v _ { n }$ corresponding to $n$ eigenvalues $\lambda _ { 1 } , \dots , \lambda _ { n }$, which need not be distinct; cf. also Eigen vector; Eigen value; Complete set). Its eigenvectors make up an $( n \times n )$-matrix $V$; let $\Lambda = \operatorname { diag } \{ \lambda _ { 1 } , \ldots , \lambda _ { n } \}$ be the diagonal matrix. Then $A = V \Lambda V ^ { - 1 }$. Let $E$ be an arbitrary $( n \times n )$-matrix. The popular Bauer–Fike theorem states that for such $A$ and $E$, every eigenvalue $\mu$ of the matrix $A + E$ satisfies the inequality
\begin{equation} \tag{a1} | \mu - \lambda | \leq \| V \| \cdot \| V ^ { - 1 } \| \cdot \| E \|, \end{equation}
where $\lambda$ is some eigenvalue of $A$; that is, the inequality holds for at least one eigenvalue of $A$.
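Equivalently (in notation not used in the original article, but standard in the numerical linear algebra literature), writing $\kappa ( V ) = \| V \| \cdot \| V ^ { - 1 } \|$ for the condition number of the eigenvector matrix, (a1) states that
\begin{equation*} \operatorname { min } _ { 1 \leq i \leq n } | \mu - \lambda _ { i } | \leq \kappa ( V ) \cdot \| E \| . \end{equation*}
Thus the sensitivity of the eigenvalues is governed by the conditioning of the eigenvector matrix $V$, not by the conditioning of $A$ itself.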
If $\| E \|$ is small, this theorem states that every eigenvalue of $A + E$ is close to some eigenvalue of $A$, and that the distance between eigenvalues of $A + E$ and eigenvalues of $A$ varies linearly with the perturbation $E$. But this is not an asymptotic result valid only as $E \rightarrow 0$; this result holds for any $E$. In this sense, it is a very powerful localization theorem for eigenvalues.
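A small numerical sketch (with hypothetical numbers, not from the original article) shows both the bound and the role of $\| V \| \cdot \| V ^ { - 1 } \|$. Take
\begin{equation*} A = \begin{pmatrix} 1 & 10 \\ 0 & 2 \end{pmatrix} , \quad E = \begin{pmatrix} 0 & 0 \\ 0.01 & 0 \end{pmatrix} , \quad V = \begin{pmatrix} 1 & 10 \\ 0 & 1 \end{pmatrix} , \quad V ^ { - 1 } = \begin{pmatrix} 1 & -10 \\ 0 & 1 \end{pmatrix} . \end{equation*}
The eigenvalues of $A$ are $1$ and $2$, while those of $A + E$ are $( 3 \pm \sqrt { 1.4 } ) / 2 \approx 0.908$ and $2.092$: a perturbation of norm $\| E \| _ { \infty } = 0.01$ has moved the eigenvalues by about $0.092$. This is consistent with (a1), since $\| V \| _ { \infty } \cdot \| V ^ { - 1 } \| _ { \infty } \cdot \| E \| _ { \infty } = 11 \cdot 11 \cdot 0.01 = 1.21$, and it illustrates how an ill-conditioned eigenvector matrix allows eigenvalue perturbations much larger than $\| E \|$.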
As an illustration, consider the following situation. Let $B$ be an arbitrary $( n \times n )$-matrix, let $A = \operatorname { diag } \{ b _ { 11 } , \dots , b _ { n n } \}$ be the diagonal part of $B$, and define $E = B - A$ to be the matrix of all off-diagonal entries. Let $\mu$ be an arbitrary eigenvalue of $B$. The matrix of eigenvectors for $A$ is just the identity matrix, so the Bauer–Fike theorem implies that $| \mu - b _ { i i } | \leq \| E \|$, for some diagonal entry $b _ { i i }$. The $\infty$-norm $\| E \| _ { \infty }$ is just the maximum absolute row sum over the off-diagonal entries of $B$. In this norm the assertion means that every eigenvalue of $B$ lies in a circle centred at a diagonal entry of $B$ with radius equal to the maximum absolute row sum over the off-diagonal entries. This is a weak form of a Gershgorin-type theorem, which localizes the eigenvalues to circles centred at the diagonal entries, but the radius here is not as tight as in the true Gershgorin theory ([a2], Sec. 7.2.1; cf. also Gershgorin theorem).
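For instance (a hypothetical example, not from the original article), for
\begin{equation*} B = \begin{pmatrix} 4 & 1 & 0 \\ 0.5 & 3 & 0.5 \\ 0 & 1 & 2 \end{pmatrix} \end{equation*}
the off-diagonal row sums are all equal to $1$, so $\| E \| _ { \infty } = 1$ and every eigenvalue of $B$ lies within distance $1$ of one of the diagonal entries $4$, $3$ or $2$.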
Another simple consequence of the Bauer–Fike theorem applies to symmetric, Hermitian or normal matrices (cf. also Symmetric matrix; Normal matrix; Hermitian matrix). If $A$ is symmetric, Hermitian or normal, then the matrix $V$ of eigenvectors can be made unitary (orthogonal if real; cf. also Unitary matrix), which means that $V ^ { \text{H} } V = I$, where $V ^ { \text{H} }$ denotes the complex conjugate transpose of $V$. Then $\| V \| _ { 2 } = \| V ^ { - 1 } \| _ { 2 } = 1$, and (a1) reduces to $| \mu - \lambda | \leq \| E \| _ { 2 }$. Hence, for symmetric, Hermitian or normal matrices, the change to any eigenvalue is no larger than the norm of the change to the matrix itself.
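As a brief illustration (again a hypothetical example), let
\begin{equation*} A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad E = \begin{pmatrix} 0.2 & 0 \\ 0 & 0 \end{pmatrix} . \end{equation*}
The eigenvalues of $A$ are $\pm 1$; those of the symmetric matrix $A + E$ are $( 0.2 \pm \sqrt { 4.04 } ) / 2 \approx 1.105$ and $-0.905$, each within $\| E \| _ { 2 } = 0.2$ of an eigenvalue of $A$.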
The Bauer–Fike theorem as stated above has some limitations. The first is that it only applies to diagonalizable matrices. The second is that it says nothing about the correspondence between an eigenvalue $\mu$ of the perturbed matrix $A + E$ and the eigenvalues of the unperturbed matrix $A$. In particular, as $E$ increases from zero, it is difficult to predict how each eigenvalue of $A + E$ will move, and there is no way to predict which eigenvalue of the original $A$ corresponds to any particular eigenvalue $\mu$ of $A + E$, except in certain special cases such as symmetric or Hermitian matrices.
However, in their original paper [a1], F.L. Bauer and C.T. Fike presented several more general results to relieve the limitation to diagonalizable matrices. The most important of these is the following: Let $A$ be an arbitrary $( n \times n )$-matrix, and let $\mu$ denote any eigenvalue of $A + E$. Then either $\mu$ is also an eigenvalue of $A$ or
\begin{equation} \tag{a2} 1 \leq \| ( \mu I - A ) ^ { - 1 } \cdot E \| \leq \| ( \mu I - A ) ^ { - 1 } \| \cdot \| E \| . \end{equation}
This result follows from a simple matrix manipulation, using the norm properties given above:
\begin{equation*} ( A + E ) x = \mu x = ( \mu I ) x \Rightarrow ( \mu I - A ) ^ { - 1 } E x = x \Rightarrow \| ( \mu I - A ) ^ { - 1 } E \| \cdot \| x \| \geq \| x \| . \end{equation*}
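Spelling out the final step (implicit above): the inverse $( \mu I - A ) ^ { - 1 }$ exists because $\mu$ is assumed not to be an eigenvalue of $A$, and since the eigenvector $x$ is non-zero one may divide by $\| x \|$ and apply the submultiplicative property 4) to obtain
\begin{equation*} 1 \leq \| ( \mu I - A ) ^ { - 1 } E \| \leq \| ( \mu I - A ) ^ { - 1 } \| \cdot \| E \| , \end{equation*}
which is (a2).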
Even though this may appear to be a rather technical result, it actually leads to a great many often-used results. For example, if $A = V \Lambda V ^ { - 1 }$ is diagonalizable, it is easy to show that
\begin{equation*} \| ( \mu I - A ) ^ { - 1 } \| = \| V ( \mu I - \Lambda ) ^ { - 1 } V ^ { - 1 } \| \leq \| V \| \cdot \| ( \mu I - \Lambda ) ^ { - 1 } \| \cdot \| V ^ { - 1 } \|, \end{equation*}
which leads immediately to the popular Bauer–Fike theorem (using the fact that the operator norm of a diagonal matrix is just its largest entry in absolute value; this step is made explicit below). One can also repeat the construction, in which $B$ is an arbitrary matrix, $A$ is the matrix consisting of the diagonal entries of $B$, and $E = B - A$ is the matrix of off-diagonal entries. Using the norm $\| \cdot \| _ { \infty }$, the leftmost inequality in (a2) then reduces to
\begin{equation*} 1 \leq \operatorname { max } _ { i } \left( \frac { 1 } { | \mu - b _ { i i } | } \cdot \sum _ { j \neq i } | b _ { i j } | \right), \end{equation*}
leading immediately to the first Gershgorin theorem (see also [a2], Sec. 7.2.1).
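To make the reduction to (a1) explicit (an elaboration of the argument above, not part of the original article): the operator norm of the diagonal matrix $( \mu I - \Lambda ) ^ { - 1 }$ is $1 / \operatorname { min } _ { i } | \mu - \lambda _ { i } |$, so combining (a2) with the preceding inequality gives
\begin{equation*} 1 \leq \| V \| \cdot \| ( \mu I - \Lambda ) ^ { - 1 } \| \cdot \| V ^ { - 1 } \| \cdot \| E \| = \frac { \| V \| \cdot \| V ^ { - 1 } \| \cdot \| E \| } { \operatorname { min } _ { i } | \mu - \lambda _ { i } | }, \end{equation*}
that is, $\operatorname { min } _ { i } | \mu - \lambda _ { i } | \leq \| V \| \cdot \| V ^ { - 1 } \| \cdot \| E \|$, which is exactly (a1).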
References
[a1] F.L. Bauer, C.T. Fike, "Norms and exclusion theorems", Numer. Math., 2 (1960), pp. 137–141
[a2] G.H. Golub, C.F. Van Loan, "Matrix computations", 3rd ed., Johns Hopkins Univ. Press (1996)