Complete problem of eigen values

complete eigen value problem

The problem of calculating all the eigen values (as distinct from the partial problem of eigen values) of a square, usually real or complex, matrix. Often, one has not only to calculate the eigen values but also to construct a basis composed of eigen vectors, or root (principal) vectors, for the matrix.

The solution to the complete problem of eigen values for a matrix $A$ amounts theoretically to constructing the characteristic polynomial $\phi_A$ for this matrix and calculating its roots, real or complex (since, by the Abel–Ruffini theorem, the roots of a general polynomial of degree five or higher admit no expression in radicals, this makes it impossible to compute the eigen values by a finite computational process). For each eigen value $\lambda$, one can determine the corresponding eigen vectors from the homogeneous system of linear equations $(A - \lambda E) x = 0$ ($E$ is the identity matrix). In calculations over the field of complex numbers, a sufficient condition for the existence of a basis of eigen vectors is simplicity of the spectrum, while a necessary and sufficient condition is that the algebraic multiplicity of each eigen value $\lambda$ (i.e. its multiplicity as a root of the characteristic polynomial $\phi_A$) coincides with its geometric multiplicity (i.e. the defect of the matrix $A - \lambda E$). If it is necessary to calculate the root (principal) vectors of multiplicity (degree) exceeding one, one has to consider homogeneous systems of the form

$$ (A - \lambda E)^{k} x = 0, \quad k \in \mathbf{N}, \ k > 1. $$

According to this scheme, numerical methods for solving the complete problem of eigen values were constructed up to the end of the 1940s. In the 1930s, highly effective algorithms (as regards the number of arithmetic operations) were developed for calculating the coefficients of the characteristic polynomial of a matrix. For example, in Danilevskii's method, the computation of the characteristic polynomial for a matrix of order $n$ involves about $n^3$ multiplications [1], [2].

Methods in this group have received the name of direct, or exact, methods because, if carried out in exact arithmetic, they give the exact values of the coefficients of the characteristic polynomial. Their behaviour in real calculations with rounding errors could not be tested for problems of any substantial order until digital computers became available. Such tests were carried out in the 1950s, and as a result direct methods were completely displaced from numerical practice. The calculation of eigen values by these methods is catastrophically unstable, for two major reasons. First, in the majority of exact methods the coefficients of the characteristic polynomial are determined, directly or indirectly, as components of the solution of a system of linear equations whose matrix is built up by columns from the vectors $v, Av, \dots, A^{n-1} v$, where $v$ is the initial vector of the method. Such a matrix is usually very poorly conditioned, as is evident in particular from the fact that the lengths of the columns usually differ greatly, increasingly so as $n$ grows. Therefore, the coefficients of the characteristic polynomial are in general calculated with very large errors. Secondly, the calculation of the roots of a polynomial is frequently numerically unstable. In that respect, the following example is noteworthy [3]: If in the polynomial

$$ p(x) = (x - 1)(x - 2) \cdots (x - 20) = x^{20} - 210 x^{19} + \dots $$

one alters the coefficient of $x^{19}$ from $-210$ to $-210 + 2^{-23}$, the perturbed polynomial $\widetilde{p}(x)$ acquires five pairs of complex-conjugate roots; for one of these pairs, the imaginary parts are $\pm 2.81\dots$.
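
This is easy to reproduce numerically. The following sketch (Python with NumPy, for illustration only; note that in double precision merely forming the coefficients already introduces rounding errors of a comparable order) perturbs the coefficient of $x^{19}$ and inspects the roots:

```python
import numpy as np

# Coefficients of p(x) = (x - 1)(x - 2)...(x - 20), highest degree first.
p = np.poly(np.arange(1, 21))

# Perturb the coefficient of x^19 (p[1] = -210) by 2^-23.
p_tilde = p.copy()
p_tilde[1] += 2.0 ** -23

print(np.roots(p))        # roughly 1, ..., 20, already visibly inaccurate
print(np.roots(p_tilde))  # several complex-conjugate pairs appear
```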

This sensitivity of the eigen values to perturbations in the coefficients of the characteristic polynomial is usually not accompanied by a comparable sensitivity with respect to the elements of the matrix itself. For example, if this polynomial $p(x)$ is the characteristic polynomial of a symmetric matrix $A$, then perturbations of the order of $2^{-23}$ in any element of the matrix produce changes at most of the same order in the eigen values.

Modern numerical methods for solving the complete problem of eigen values yield their results without calculating the characteristic polynomial (see Iteration methods). The number of multiplications involved in the best methods is $O(kn^3)$, where $n$ is the order of the matrix and $k$ is a constant, independent of $n$, representing the average number of iterations required to calculate one eigen value. The values of $k$ in the QR algorithm are usually between 1.5 and 2.
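
As a minimal illustration of the idea (not of a production method), the unshifted QR iteration below repeatedly factors $A_k = Q_k R_k$ and forms $A_{k+1} = R_k Q_k$; practical implementations first reduce $A$ to Hessenberg form and employ shifts and deflation. A Python/NumPy sketch under those caveats:

```python
import numpy as np

def qr_eigenvalues(A, iters=500):
    # Unshifted QR iteration: each step is an orthogonal similarity,
    # A_{k+1} = Q_k^T A_k Q_k, so the spectrum is preserved. For many
    # matrices (e.g. real eigenvalues of distinct moduli) A_k approaches
    # triangular form, with the eigenvalues on the diagonal.
    Ak = np.array(A, dtype=float)
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return np.diag(Ak)
```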

Approximate eigen values (vectors) calculated for an $(n \times n)$-matrix $A$ by a method $M$ based on orthogonal transformations can be interpreted as the exact eigen values (vectors) of a perturbed matrix or matrices $A + F_M$ (such methods are the QR algorithm for matrices of general form, and Jacobi's method or methods based on dividing the spectrum for symmetric and Hermitian matrices). Here $F_M$ is the equivalent perturbation matrix for the method $M$; it satisfies bounds of the form

$$ \tag{1 } \| F _ {M} \| _ {E} \leq f( n) \| A \| _ {E} \epsilon , $$

where $ \epsilon $ is the relative accuracy of the machine arithmetic, $ \| A \| _ {E} = ( \sum | A _ {ij} | ^ {2} ) ^ {1/2} $ is the Euclidean matrix norm and $ f( n) $ is a function of the form $ Ckn ^ \alpha $. The number $ k $ has been described above, while the exact values of the constant $ C $ and the exponent $ \alpha $ depend on details of the computation such as the method of rounding-off, the use of scalar-product accumulation, etc. The usual value of $ \alpha $ is 2.

The a priori bound (1) allows one to estimate the accuracy in calculating the eigen values (vectors) by the method $ M $. This accuracy depends on the conditioning of the individual eigen values (eigen subspaces).

Let $ \lambda $ be a simple eigen value of a matrix $ A $, let $ x $ be the corresponding normalized eigen vector, and let $ y $ be the normalized eigen vector of the transposed matrix $ A ^ {T} $ for the same eigen value. When the matrix $ A $ is perturbed by a matrix $ F $, the perturbation in the eigen value $ \lambda $ is expressed by the following quantity, accurate up to the second order:

$$ \tag{2 } \Delta \lambda \approx \frac{( y ^ {T} Fx) }{( y ^ {T} x) } $$

and is estimated as

$$ \tag{3 } | \Delta \lambda | \lesssim \frac{\| F \| _ {2} }{| y ^ {T} x | } $$

($\| \cdot \|_2$ is the spectral norm). Then the sensitivity of $\lambda$ to perturbations in the matrix $A$ is characterized by the number $s(\lambda) = | y^T x |^{-1}$, which is called the (individual) condition number of that eigen value. When both vectors $x$ and $y$ are real, the number $s^{-1}(\lambda)$ has a simple geometrical meaning: it is the cosine of the angle between $x$ and $y$, which explains the other name of $s(\lambda)$, the distortion coefficient corresponding to $\lambda$.
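
For a diagonalizable matrix these condition numbers are easy to obtain from an eigen decomposition; a NumPy sketch (the function name and normalization convention are mine, for illustration):

```python
import numpy as np

def eigenvalue_condition_numbers(A):
    # Columns of X are right eigenvectors; the rows of inv(X) are the
    # matching left eigenvectors, scaled so that y_i^T x_i = 1 exactly.
    w, X = np.linalg.eig(A)
    Y = np.linalg.inv(X)
    # s(lambda_i) = ||x_i|| * ||y_i|| / |y_i^T x_i| = ||x_i|| * ||y_i||.
    s = np.linalg.norm(X, axis=0) * np.linalg.norm(Y, axis=1)
    return w, s
```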

If the matrix $A$ is diagonalizable (i.e. has a basis of eigen vectors), then the conditioning of its eigen values $\lambda_i$ can be characterized completely. Let $P$ be a matrix whose columns are eigen vectors corresponding to the $\lambda_i$ and which has the least condition number among all such matrices. The following theorem applies [4]: All the eigen values of the perturbed matrix $A + F$ are enclosed in the region of the complex plane that is the union of the circles

$$ \tag{4 } | z - \lambda _ {i} | \leq \mathop{\rm cond} P \, \| F \| ,\ \ i = 1, \dots, n. $$

If this region splits up into connected components, then each of them contains as many eigen values of the perturbed matrix as the number of circles constituting that component (as the norm in (4), one can take the spectral norm, while $ \mathop{\rm cond} P $ can be taken as the spectral condition number).

The number $ \mathop{\rm cond} P $ is called the condition number of the matrix $ A $ in relation to the complete problem of eigen values. Expressed in the spectral norm, it is related to the distortion coefficients as follows:

$$ s( \lambda _ {i} ) \leq \mathop{\rm cond} P \leq \sum_{i=1}^{n} s( \lambda _ {i} ). $$

The perturbation in an eigen vector $ x $ related to the simple eigen value $ \lambda $ depends on the perturbation of the matrix $ A $ in a more complicated way. It is determined, generally speaking, not only by the distortion coefficient corresponding to $ \lambda $ but also by those for the other eigen values. The sensitivity of $ x $ also increases if there are eigen values close to $ \lambda $. In the limiting case where $ \lambda $ becomes a multiple eigen value, it becomes meaningless to consider the sensitivity of an individual eigen vector, and it is necessary to speak of the sensitivity of an eigen (or invariant) subspace.

Qualitatively, the estimates (3) and (4) mean that the perturbations in the eigen values of a diagonalizable matrix $ A $ are proportional to the size of the perturbation $ F $, while the proportionality factors are represented by the condition numbers, individual or global. If the Jordan form of $ A $ is non-diagonal and the eigen value $ \lambda $ corresponds to an elementary divisor $ ( z - \lambda ) ^ {m} $, then the perturbation of $ \lambda $ in the general case is proportional not to $ \| F \| $, but to $ \| F \| ^ {1/m} $.
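
A standard illustration of the non-diagonalizable case: perturbing a single corner element of a nilpotent Jordan block of order $m$ by $\varepsilon$ moves the zero eigen value by $\varepsilon^{1/m}$. A NumPy sketch (the sizes are chosen arbitrarily):

```python
import numpy as np

m, eps = 10, 1e-10
J = np.diag(np.ones(m - 1), 1)   # Jordan block with eigen value 0
J[m - 1, 0] = eps                # perturb one corner element by eps
# Now J^m = eps * I, so the eigenvalues are the m-th roots of eps:
print(np.max(np.abs(np.linalg.eigvals(J))))   # ~ eps**(1/m) = 0.1
```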

The most important particular case of the complete problem of eigen values is to calculate all eigen values (eigen vectors) of a real symmetric or complex Hermitian matrix $ A $. The distortion coefficients for such a matrix are equal to one, and the approximate estimate (3) becomes an exact inequality:

$$ | \Delta \lambda | \leq \| F \| _ {2} . $$

The matrix $ P $ in (4) can be taken as orthogonal or unitary, and therefore the global condition number in the spectral norm is equal to one. No matter what the multiplicity of the points in the spectrum of $ A $, there exists an ordering of the eigen values $ \widetilde \lambda _ {i} $ of the matrix $ A+ F $ such that for all $ i $:

$$ | \widetilde \lambda _ {i} - \lambda _ {i} | \leq \| F \| _ {2} . $$

Even sharper bounds are possible if not only the matrix $A$ itself is symmetric (Hermitian) but the same applies to the perturbation $F$ [5].

In addition, there are some a posteriori estimates for the accuracy of the calculated eigen values and eigen vectors. They are most effective for symmetric and Hermitian matrices $ A $.

Let $\widetilde{x}_i$ be an approximate eigen vector and let $x_i$ be the exact eigen vector at a simple eigen value $\lambda_i$ of the matrix $A$. Both vectors are assumed to be normalized. The best estimate of $\lambda_i$ that can be obtained by means of the vector $\widetilde{x}_i$ is given by the value of the Rayleigh functional

$$ \phi ( A, z) = \frac{( Az , z ) }{( z , z ) } $$

corresponding to $\widetilde{x}_i$, i.e. the number $\mu_i = \phi(A, \widetilde{x}_i)$ (this applies for any matrix $A$, not only Hermitian ones). The vector $r_i = A \widetilde{x}_i - \mu_i \widetilde{x}_i$ is called the discrepancy vector (or residual).

Let

$$ \epsilon = \| r _ {i} \| _ {2} ,\ \ a = \min _ {j \neq i } | \lambda _ {j} - \mu _ {i} | . $$

A bound for $ a $ can readily be derived from the calculated eigen values $ \widetilde \lambda _ {j} $. The following apply:

$$ | \lambda _ {i} - \mu _ {i} | \leq \frac{\epsilon ^ {2} }{a} ,\ \ | \sin \angle ( x _ {i} , \widetilde{x} _ {i} ) | \leq \frac{\epsilon}{a} . $$
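
These a posteriori bounds are cheap to evaluate. A NumPy sketch for a symmetric matrix (the test matrix and perturbation size are arbitrary; here the gap $a$ is computed from the exact spectrum for the purpose of the check, whereas in practice one would use the computed eigen values):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6)); A = (A + A.T) / 2   # symmetric test matrix
lam, X = np.linalg.eigh(A)

x_t = X[:, 0] + 1e-4 * rng.standard_normal(6)        # approximate eigen vector
x_t /= np.linalg.norm(x_t)

mu = x_t @ A @ x_t                   # Rayleigh functional phi(A, x_t), since ||x_t|| = 1
r = A @ x_t - mu * x_t               # discrepancy (residual) vector
eps = np.linalg.norm(r)
a = np.min(np.abs(lam[1:] - mu))     # gap to the rest of the spectrum

print(abs(lam[0] - mu), eps**2 / a)  # check: |lambda_i - mu_i| <= eps^2 / a
```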

If $ \lambda _ {i} $ is a multiple eigen value or if there is a group of eigen values near to $ \lambda _ {i} $, it is necessary to estimate the over-all perturbation for the whole group and the perturbation for the corresponding invariant subspace [5].

Along with this ordinary complete eigen value problem, it is often also necessary to solve the so-called generalized eigen value problem:

$$ \tag{5 } Ax = \lambda Bx. $$

In the most important case, the matrices $ A $ and $ B $ are symmetric (Hermitian) and one of them is positive definite. The theory and methods for numerically solving (5) are parallel to those for an ordinary symmetric (Hermitian) eigen value problem.
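
For instance, when $B = L L^T$ is positive definite, the pencil reduces to an ordinary symmetric problem by a congruence with the Cholesky factor; a NumPy sketch of this reduction (the function name is mine):

```python
import numpy as np

def generalized_eigh(A, B):
    # B = L L^T positive definite; A x = lambda B x becomes
    # (L^{-1} A L^{-T}) y = lambda y with x = L^{-T} y.
    L = np.linalg.cholesky(B)
    C = np.linalg.solve(L, np.linalg.solve(L, A).T).T   # L^{-1} A L^{-T}
    w, Y = np.linalg.eigh((C + C.T) / 2)                # symmetrize against roundoff
    X = np.linalg.solve(L.T, Y)                         # back-transform eigen vectors
    return w, X
```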

If neither of the matrices $ A $ and $ B $ is definite or if at least one of them is not symmetric, one uses the $ \mathop{\rm QZ} $ algorithm [6], which is a generalization of the $ \mathop{\rm QR} $ algorithm. Here new effects can occur that are not observed in ordinary complete eigen value problems: infinite eigen values and continuous spectra.

Even more complicated features occur in non-linear eigen value problems:

$$ ( A _ {n} \lambda ^ {n} + A _ {n-1} \lambda ^ {n-1} + \dots + A _ {0} ) x = 0. $$

Such problems are usually solved by reducing them to linear ones of a higher order [7].
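
For the quadratic case $(\lambda^2 A_2 + \lambda A_1 + A_0) x = 0$, a companion linearization doubles the order; a NumPy sketch (assuming, for simplicity, that $A_2$ is invertible):

```python
import numpy as np

def quadratic_eigenvalues(A2, A1, A0):
    # Companion linearization: with z = [x; lambda x], the quadratic
    # problem becomes the ordinary eigen value problem C z = lambda z.
    n = A0.shape[0]
    C = np.block([
        [np.zeros((n, n)), np.eye(n)],
        [-np.linalg.solve(A2, A0), -np.linalg.solve(A2, A1)],
    ])
    return np.linalg.eigvals(C)
```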

References

[1] A.M. Danilevskii, Mat. Sb., 2:1 (1937) pp. 169–171
[2] D.K. Faddeev, V.N. Faddeeva, "Computational methods of linear algebra", Freeman (1963) (Translated from Russian)
[3] J.H. Wilkinson, "Rounding errors in algebraic processes", Prentice-Hall (1963)
[4] J.H. Wilkinson, "The algebraic eigenvalue problem", Oxford Univ. Press (1969)
[5] B.N. Parlett, "The symmetric eigenvalue problem", Prentice-Hall (1980)
[6] C.B. Moler, G.W. Stewart, "An algorithm for generalized matrix eigenvalue problems", SIAM J. Numer. Anal., 10 (1973) pp. 241–256
[7] V.N. Kublanovskaya, "Numerical methods of linear algebra", Novosibirsk (1980) pp. 37–53 (In Russian)

Comments

In recent years much research has been carried out on the eigen value problem for sparse matrices (cf. Sparse matrix). Because of storage problems, the challenges are even greater here. For the symmetric (Hermitian) case the Lanczos algorithm has proved very successful.
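
A bare-bones sketch of the Lanczos recurrence (illustration only: it omits reorthogonalization and breakdown handling, both of which practical codes require; function and parameter names are mine):

```python
import numpy as np

def lanczos_ritz_values(matvec, n, m, seed=0):
    # Build an m x m tridiagonal T from a symmetric operator given only
    # as a matrix-vector product; the eigenvalues of T (Ritz values)
    # approximate the extreme eigen values of the operator.
    rng = np.random.default_rng(seed)
    q_prev, beta_prev = np.zeros(n), 0.0
    q = rng.standard_normal(n); q /= np.linalg.norm(q)
    alpha, beta = np.zeros(m), np.zeros(m - 1)
    for j in range(m):
        w = matvec(q) - beta_prev * q_prev
        alpha[j] = q @ w
        w -= alpha[j] * q
        if j < m - 1:
            beta[j] = np.linalg.norm(w)   # breakdown if zero (not handled)
            q_prev, q, beta_prev = q, w / beta[j], beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return np.linalg.eigvalsh(T)
```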

Cf. Iteration methods for a description of the QR algorithm and other iterative algorithms for calculating eigen values.

References

[a1] G.H. Golub, C.F. Van Loan, "Matrix computations", North Oxford Acad. (1983)
This article was adapted from an original article by Kh.D. Ikramov (originator), which appeared in Encyclopedia of Mathematics, ISBN 1402006098.