Inversion of a matrix

From Encyclopedia of Mathematics
 
An algorithm applicable for the numerical computation of an [[Inverse matrix|inverse matrix]]. As for the solution of linear systems, methods for numerical inversion can be subdivided into direct and iterative methods; however, iterative methods play a considerably smaller role here because of their laboriousness.
 
Most of the direct methods for the inversion of a matrix are based on the idea of decomposing the given matrix into easily invertible factors. If

$$
A  = L _ {1} \dots L _ {k}
$$

is such a decomposition, then

$$
A ^ {-1}  = L _ {k} ^ {-1} \dots L _ {1} ^ {-1} .
$$
  
 
The Jordan method (see [[#References|[1]]]) is a typical representative of the direct methods (and one of the most generally used).
 
Let $ A $ be a non-singular matrix of order $ n $. The construction of the inverse matrix $ A ^ {-1} $ proceeds in $ n $ steps; the result of the $ k $-th step is a matrix $ A _ {k} $ whose first $ k $ columns are the same as the corresponding columns of the identity matrix. The passage from $ A _ {k} $ (let $ A = A _ {0} $) to $ A _ {k+1} $ is, from a matrix point of view, equivalent to left multiplication of $ A _ {k} $ by a matrix $ L _ {k+1} $ which differs from the identity matrix only in the $ ( k + 1 ) $-st column. The elements of this column are chosen so as to reduce the $ ( k + 1 ) $-st column of $ A _ {k} $ to the corresponding column of the identity matrix, and they have the form

$$
- \frac{a _ {1 , k + 1} ^ {( k )} }{a _ {k + 1 , k + 1} ^ {( k )} } ,\ \dots ,\
- \frac{a _ {k , k + 1} ^ {( k )} }{a _ {k + 1 , k + 1} ^ {( k )} } ,
$$

$$
\frac{1}{a _ {k + 1 , k + 1} ^ {( k )} } ,\
- \frac{a _ {k + 2 , k + 1} ^ {( k )} }{a _ {k + 1 , k + 1} ^ {( k )} } ,\ \dots ,\
- \frac{a _ {n , k + 1} ^ {( k )} }{a _ {k + 1 , k + 1} ^ {( k )} } .
$$
  
 
From the relations
 
$$
A _ {k+1}  = L _ {k+1} A _ {k} ,\qquad  A _ {n}  = E
$$

it follows that

$$
L _ {n} \dots L _ {1} A  = E
$$

and

$$ \tag{1}
A ^ {-1}  = L _ {n} \dots L _ {1} .
$$

To obtain the factorization (1) for the inverse matrix $ A ^ {-1} $ requires about $ n ^ {3} / 2 $ multiplications and about $ n ^ {3} / 2 $ additions. Approximately the same number of operations is needed to multiply out the matrices in (1) and obtain the explicit form of $ A ^ {-1} $. In many applications of matrix inversion the use of (1) is just as satisfactory as that of the explicit form. For example, the computation of the product $ A ^ {-1} b $, where $ b $ is a column vector, requires the same arithmetical work in both cases. The memory requirements on a computer are also the same.

In the above description of the Jordan method it was assumed for simplicity that all the elements $ a _ {k + 1 , k + 1} ^ {( k )} $ (called the pivots) are non-zero. In practice the Jordan method, like the methods of Gauss type for the solution of linear systems (cf. also [[Gauss method|Gauss method]]), is usually applied with one or another scheme for choosing the pivots. The use of such a scheme is equivalent to introducing additional factors in (1) which take account of the permutations of the rows and columns of the inverse matrix. The accuracy of the computed solution, as in the case of linear systems, depends on the rate of growth of the matrix entries in the intermediate steps of the method. Such growth, with subsequent deterioration of the accuracy of the computed solution, is more probable in the Jordan method, even with pivoting, than in methods of Gauss type.
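In matrix-free terms, the steps above amount to reducing the augmented matrix $ [ A \mid E ] $ to $ [ E \mid A ^ {-1} ] $, eliminating a whole column at each step. A minimal Python sketch with partial pivoting by column maximum (the function name `jordan_inverse` is illustrative, not from the references):

```python
def jordan_inverse(a):
    n = len(a)
    # Work on the augmented matrix [A | E]; the right half becomes A^{-1}.
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for k in range(n):
        # Pivot choice: bring the largest entry of column k (rows k..n-1) to row k.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]
        piv = m[k][k]
        m[k] = [x / piv for x in m[k]]
        # Jordan step: eliminate column k in EVERY other row at once,
        # not only in the rows below it.
        for i in range(n):
            if i != k:
                f = m[i][k]
                m[i] = [x - f * y for x, y in zip(m[i], m[k])]
    return [row[n:] for row in m]
```

To obtain the factored form (1) one would instead record each elimination step as a matrix $ L _ {k+1} $; the augmented-matrix variant shown here produces the explicit inverse directly.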
  
The matrix $ R = E - A X $ is called the residual corresponding to the approximate inverse $ X $ for $ A $. One has the estimate

$$
\frac{\| A ^ {-1} - X \| }{\| A ^ {-1} \| }
\leq  \| R \| .
$$
  
 
Thus, the norm of the residual is a bound for the relative precision of the approximate inverse matrix. This is an important difference between the problem of numerical inversion of a matrix and the solution of linear systems, where (for example, in orthogonal methods or methods of Gauss type) the residual is usually small, and the quality of the solution obtained depends on the conditioning of the system.
 
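The estimate follows from the identity $ A ^ {-1} - X = A ^ {-1} ( E - A X ) = A ^ {-1} R $: taking norms and using submultiplicativity gives

$$
\| A ^ {-1} - X \|  \leq  \| A ^ {-1} \| \, \| R \| ,
$$

and division by $ \| A ^ {-1} \| $ yields the stated bound.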
  
The inversion of several important classes of matrices can be achieved by methods that are significantly more economical than in the general case. Such classes include Toeplitz, Hankel, banded (in particular, tri-diagonal) matrices, block matrices having a Toeplitz structure or the structure of Kronecker products, etc. For example, let $ T $ be a [[Toeplitz matrix|Toeplitz matrix]] of order $ n + 1 $ with entries from $ \mathbf R $ or $ \mathbf C $:

$$
T  = \left \|
\begin{array}{cccc}
t _ {0}  &t _ {1}  &\dots  &t _ {n}  \\
t _ {-1}  &t _ {0}  &\dots  &t _ {n-1}  \\
\cdot  &\cdot  &\dots  &\cdot  \\
t _ {-n}  &t _ {-n+1}  &\dots  &t _ {0}  \\
\end{array}
\right \| .
$$
  
Assume that not only $ T $, but also its principal submatrices of order $ n $, are non-singular. Then the matrix $ T ^ {-1} $, which in general is no longer Toeplitz, has the representation (see [[#References|[2]]]):

$$ \tag{2}
T ^ {-1}  =
\frac{1}{p _ {n} }
\left \|
\begin{array}{cccc}
1  & 0  &\cdot  & 0  \\
b _ {1}  & 1  &{}  &\cdot  \\
\cdot  &{}  &{}  &\cdot  \\
b _ {n}  &\cdot  &b _ {1}  & 1  \\
\end{array}
\right \|
\cdot \left \|
\begin{array}{cccc}
1  &a _ {1}  &\cdot  &a _ {n}  \\
0  & 1  &{}  &\cdot  \\
\cdot  &{}  &{}  &a _ {1}  \\
0  &\cdot  & 0  & 1  \\
\end{array}
\right \|
-
\frac{1}{p _ {n} }
\left \|
\begin{array}{cccc}
0  &\dots  &\cdot  & 0  \\
a _ {n}  &{}  &{}  &\cdot  \\
\cdot  &{}  &{}  &\cdot  \\
a _ {1}  &\dots  &a _ {n}  & 0  \\
\end{array}
\right \|
\cdot \left \|
\begin{array}{cccc}
0  &b _ {n}  &\dots  &b _ {1}  \\
\cdot  & 0  &{}  &\cdot  \\
\cdot  &{}  &{}  &b _ {n}  \\
0  & 0  &\dots  & 0  \\
\end{array}
\right \| .
$$
  
 
Here the vectors
 
$$
\frac{1}{p _ {n} } ( 1 \; b _ {1} \dots b _ {n} ) ^ {T}
\quad \textrm{ and } \quad
\frac{1}{p _ {n} } ( a _ {n} \dots a _ {1} \; 1 ) ^ {T}
$$
  
are, respectively, the first and last columns of $ T ^ {-1} $. Thus, $ T ^ {-1} $ is completely described by giving its first and last columns. From (2) all the elements of $ T ^ {-1} $ can be successively computed:

$$
\{ T ^ {-1} \} _ {i + 1 , j + 1}  =
\{ T ^ {-1} \} _ {i , j} +
\frac{1}{p _ {n} }
( b _ {i+1} a _ {j+1} - a _ {n-i} b _ {n-j} ) .
$$
  
This computation requires $ O ( n ^ {2} ) $ arithmetical operations.

In economical algorithms for the inversion of Toeplitz matrices (see [[#References|[3]]], for example) the computation of $ a _ {i} $, $ b _ {j} $ and $ p _ {n} $ is performed by recurrence formulas and also requires $ O ( n ^ {2} ) $ operations. The condition that the principal submatrices be non-singular can be relaxed, while still needing only $ O ( n ^ {2} ) $ arithmetical operations.
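As an illustration of this recurrence, the following Python sketch rebuilds all of $ T ^ {-1} $ from the data of (2). It assumes $ p _ {n} $ and the numbers $ a _ {1} , \dots , a _ {n} $, $ b _ {1} , \dots , b _ {n} $ are given, with the conventions $ a _ {0} = b _ {0} = 1 $; the first row of $ T ^ {-1} $, read off from the first term of (2), equals $ p _ {n} ^ {-1} ( 1 , a _ {1} , \dots , a _ {n} ) $. The function name is illustrative:

```python
def toeplitz_inverse_entries(p, a, b):
    # p: the scalar p_n from (2); a = (a_1,...,a_n), b = (b_1,...,b_n)
    # taken from the last and first columns of T^{-1}.
    n = len(a)
    a = [1.0] + list(a)          # convention a_0 = 1
    b = [1.0] + list(b)          # convention b_0 = 1
    B = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(n + 1):       # first row of T^{-1} from (2)
        B[0][j] = a[j] / p
    for i in range(n + 1):       # first column of T^{-1} from (2)
        B[i][0] = b[i] / p
    # Fill along diagonals using
    # {T^{-1}}_{i+1,j+1} = {T^{-1}}_{i,j} + (b_{i+1} a_{j+1} - a_{n-i} b_{n-j}) / p_n.
    for i in range(n):
        for j in range(n):
            B[i + 1][j + 1] = B[i][j] + (b[i + 1] * a[j + 1]
                                         - a[n - i] * b[n - j]) / p
    return B
```

The double loop performs one multiplication-pair per entry, which is the $ O ( n ^ {2} ) $ operation count stated above.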

Matrix inversion is sometimes used in order to solve linear systems $ A x = b $ by the formula $ x = A ^ {-1} b $. For matrices of general form such a procedure makes little sense, since it is accompanied by an increase in arithmetical work and a loss of numerical stability as compared to direct solution of the linear system. But for Toeplitz (and related) matrices the situation is different. As the representation (2) shows, the computation of $ T ^ {-1} b $ reduces to four multiplications of Toeplitz matrices by vectors and a subtraction of vectors. There are economical algorithms for the multiplication of Toeplitz matrices by vectors, requiring (for order $ n $) $ O ( n \log n ) $ operations. Such asymptotic behaviour of the number of arithmetical operations is not yet attainable for the solution of Toeplitz systems (currently, the best methods for these require $ O ( n \log ^ {2} n ) $ operations). Hence, for the repeated solution of linear systems $ T x = b $ with the same Toeplitz matrix $ T $ and varying right-hand sides $ b $, a preliminary inversion of $ T $ seems reasonable.
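The $ O ( n \log n ) $ multiplication mentioned here is usually obtained by embedding the Toeplitz matrix in a circulant matrix, which the discrete Fourier transform diagonalizes. A self-contained Python sketch under that assumption, using a textbook radix-2 FFT (production code would call an optimized FFT library):

```python
import cmath

def _fft(x, invert=False):
    # Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two.
    n = len(x)
    if n == 1:
        return list(x)
    even = _fft(x[0::2], invert)
    odd = _fft(x[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def toeplitz_matvec(col, row, x):
    # col = first column (t_0, t_{-1}, ...), row = first row (t_0, t_1, ...);
    # col[0] and row[0] must agree. Computes T x via circulant embedding.
    m = len(col)
    size = 1
    while size < 2 * m:          # power of two >= 2m - 1
        size *= 2
    # First column of the embedding circulant: (t_0, t_{-1},...,t_{-n}, 0,..., t_n,...,t_1).
    g = list(col) + [0.0] * (size - 2 * m + 1) + list(row[:0:-1])
    xp = list(x) + [0.0] * (size - m)
    fg, fx = _fft(g), _fft(xp)
    y = _fft([p * q for p, q in zip(fg, fx)], invert=True)
    return [v.real / size for v in y[:m]]    # inverse FFT scaling, real part
```

Three FFTs of length $ O ( n ) $ and one pointwise product give the $ O ( n \log n ) $ bound cited in the text.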
  
 
Preliminary matrix inversion can be justified in the case of repeated solution of linear systems with the same matrix of general form on a computer with a large number of parallel processors. The reason is that, as compared to the multiplication of a matrix by a vector, direct methods for solving linear systems do not have such convenient parallelization.
 
  
In many cases (for example, in the quasi-Newton methods of mathematical programming) it is required to invert a matrix $ A $ that differs from a matrix $ B $ with known inverse $ B ^ {-1} $ by a matrix of rank 1 or (in the case of a symmetric matrix $ B $) by a symmetric matrix of rank 2. Such a reconstruction of the inverse matrix can be carried out in $ O ( n ^ {2} ) $ operations for matrices of order $ n $. The following formula may serve as an example (see [[#References|[4]]]): if $ u $ and $ v $ are column vectors, then

$$
( B + u v ^ {T} ) ^ {-1}  =
B ^ {-1} -
\frac{1}{\gamma} B ^ {-1} u v ^ {T} B ^ {-1} ,
$$
  
where $ \gamma = 1 + v ^ {T} B ^ {-1} u $ is assumed to be non-zero.
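A Python sketch of this rank-1 update (the formula is often called the Sherman-Morrison formula; the function name is illustrative):

```python
def sherman_morrison(B_inv, u, v):
    # Given B^{-1}, returns (B + u v^T)^{-1} in O(n^2) operations.
    n = len(B_inv)
    Bu = [sum(B_inv[i][k] * u[k] for k in range(n)) for i in range(n)]   # B^{-1} u
    vB = [sum(v[k] * B_inv[k][j] for k in range(n)) for j in range(n)]   # v^T B^{-1}
    gamma = 1.0 + sum(v[i] * Bu[i] for i in range(n))                    # 1 + v^T B^{-1} u
    if gamma == 0.0:
        raise ValueError("update is singular (gamma = 0)")
    # B^{-1} - (1/gamma) (B^{-1} u)(v^T B^{-1})
    return [[B_inv[i][j] - Bu[i] * vB[j] / gamma for j in range(n)]
            for i in range(n)]
```

Only matrix-vector products and one outer product are formed, so no $ O ( n ^ {3} ) $ re-inversion is needed.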
  
From the point of view of the theory of computational complexity, the problem of matrix inversion has complexity of the same order (on a sequential machine) as the problem of solving a linear system (if certain natural conditions on the rate of growth of the complexity of both problems as their order increases are satisfied [[#References|[5]]]). This complexity has order not exceeding $ n ^ {2.49} $.
  
 
====References====
 
 
<table><TR><TD valign="top">[1]</TD> <TD valign="top">  V.V. Voevodin,  "Numerical methods of algebra" , Moscow  (1966)  (In Russian)</TD></TR><TR><TD valign="top">[2]</TD> <TD valign="top">  I.C. [I.Ts. Gokhberg] Gohberg,  I.A. Fel'dman,  "Convolution equations and projection methods for their solution" , ''Transl. Math. Monogr.'' , '''41''' , Amer. Math. Soc.  (1974)  (Translated from Russian)</TD></TR><TR><TD valign="top">[3]</TD> <TD valign="top">  W. Trench,  "An algorithm for the inversion of finite Toeplitz matrices"  ''SIAM J. Control Optim.'' , '''12'''  (1964)  pp. 512–522</TD></TR><TR><TD valign="top">[4]</TD> <TD valign="top">  D.K. Faddeev,  V.N. Faddeeva,  "Computational methods of linear algebra" , Freeman  (1963)  (Translated from Russian)</TD></TR><TR><TD valign="top">[5]</TD> <TD valign="top">  A.V. Aho,  J.E. Hopcroft,  J.D. Ullman,  "The design and analysis of computer algorithms" , Addison-Wesley  (1974)</TD></TR></table>
 

Latest revision as of 22:13, 5 June 2020


An algorithm applicable for the numerical computation of an inverse matrix. As for the solution of linear systems, methods for numerical inversion can be subdivided into direct and iterative methods; however, iterative methods play a considerably smaller role here because of their laboriousness.

Most of the direct methods for the inversion of a matrix are based on the idea of decomposing the given matrix into easily invertible factors. If

$$ A = L _ {1} \dots L _ {k} $$

is such a decomposition, then

$$ A ^ {-} 1 = L _ {k} ^ {-} 1 \dots L _ {1} ^ {-} 1 . $$

The Jordan method (see [1]) is a typical representative of the direct methods (and one of the most generally used).

Let $ A $ be a non-singular matrix of order $ n $. The construction of the inverse matrix $ A ^ {-} 1 $ proceeds in $ n $ steps; the result of the $ k $- th step is a matrix $ A _ {k} $ whose first $ k $ columns are the same as the corresponding columns of the identity matrix. The passage from $ A _ {k} $( let $ A = A _ {0} $) to $ A _ {k+} 1 $ from a matrix point of view is equivalent to left multiplication of $ A $ by a matrix $ L _ {k+} 1 $ which differs from the identity matrix only in the $ ( k + 1 ) $- st column. The elements of this column are chosen in order to reduce the $ ( k + 1 ) $- st column of $ A _ {k} $ to the identity, and they have the form

$$ - \frac{a _ {1,k+} 1 ^ {(} k) }{a _ {k + 1 , k + 1 } ^ {(} k) } \dots - \frac{a _ {k , k + 1 } ^ {(} k) }{a _ {k + 1 , k + 1 } ^ {(} k) } , $$

$$ \frac{1}{a _ {k + 1 , k + 1 } ^ {(} k) } \dots - \frac{a _ {n , k + 1 } ^ {(} k) }{a _ {k + 1 , k + 1 } ^ {(} k) } . $$

From the relations

$$ A _ {k+} 1 = L _ {k+} 1 A _ {k} ,\ A _ {n} = E $$

it follows that

$$ L _ {n} \dots L _ {1} A = E $$

and

$$ \tag{1 } A ^ {-} 1 = L _ {n} \dots L _ {1} . $$

To obtain the factorization (1) for the inverse matrix $ A ^ {-} 1 $ requires about $ n ^ {3} /2 $ multiplications and about $ n ^ {3} /2 $ additions. Approximately the same number of operations is necessary to multiply out the matrices in (1) to obtain the explicit form for $ A ^ {-} 1 $. In many applications of matrix inversion the use of (1) is just as satisfactory as that of the explicit form. For example, the computation of the product $ A ^ {-} 1 b $, where $ b $ is a column vector, requires the same arithmetical work in both cases. The memory requirements when implemented on a computer are also the same.

In the above description of the Jordan method it was assumed for simplicity that all the elements $ a_{k+1,k+1}^{(k)} $ (called the pivots) are non-zero. In practice the Jordan method, like methods of Gauss type for the solution of linear systems (cf. also Gauss method), is usually applied with some scheme for choosing the pivots. The use of such a scheme is equivalent to introducing additional factors into (1) which account for the permutations of the rows and columns of the inverse matrix. The accuracy of the computed solution, as in the case of linear systems, depends on the rate of growth of the matrix entries in the intermediate steps of the method. Such growth, with a consequent deterioration in the accuracy of the computed solution, is more probable in the Jordan method, even with a choice of pivots, than in methods of Gauss type.
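The elimination steps described above can be sketched in a few lines of pure Python, working on an augmented matrix $ [ A \mid E ] $ and using partial pivoting. This is a didactic sketch with illustrative names, not the article's notation:

```python
# Didactic sketch of the Jordan (Gauss-Jordan) inversion method with partial
# pivoting, written against an augmented matrix [A | E].

def gauss_jordan_inverse(a):
    """Invert the non-singular matrix `a` (a list of rows) in ~n^3 operations."""
    n = len(a)
    # Augment A with the identity: row operations that turn the left half
    # into E turn the right half into A^{-1}.
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for k in range(n):
        # Partial pivoting: the largest pivot in column k limits entry growth.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]
        piv = m[k][k]
        m[k] = [x / piv for x in m[k]]      # scale the pivot row by 1/pivot
        for i in range(n):
            if i != k:
                f = m[i][k]                 # annihilate column k in all
                m[i] = [x - f * y for x, y in zip(m[i], m[k])]  # other rows
    return [row[n:] for row in m]
```

For example, for the matrix with rows $ (2, 1) $ and $ (1, 1) $ this returns the inverse with rows $ (1, -1) $ and $ (-1, 2) $.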

The matrix $ R = E - A X $ is called the residual corresponding to the approximate inverse $ X $ for $ A $. One has the estimate

$$ \frac{\| A^{-1} - X \| }{\| A^{-1} \| } \leq \| R \| . $$

Thus, the norm of the residual is a bound for the relative precision of the approximate inverse matrix. This is an important difference between the problem of numerical inversion of a matrix and the solution of linear systems, where (for example, in orthogonal methods or methods of Gauss type) the residual is usually small, and the quality of the solution obtained depends on the conditioning of the system.
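The estimate is easy to check numerically. The following pure-Python fragment uses the row-sum (infinity) norm, which is submultiplicative as the bound requires; the $ 2 \times 2 $ matrix and the perturbed approximate inverse are ad hoc illustrations:

```python
# Numerical illustration of ||A^{-1} - X|| / ||A^{-1}|| <= ||E - A X||
# in the row-sum (infinity) norm.

def norm_inf(m):
    """Row-sum norm: max over rows of the sum of absolute entries."""
    return max(sum(abs(x) for x in row) for row in m)

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matsub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

A = [[2.0, 1.0], [1.0, 1.0]]
A_inv = [[1.0, -1.0], [-1.0, 2.0]]     # exact inverse of A
X = [[1.01, -0.99], [-1.0, 2.02]]      # a perturbed approximate inverse
E = [[1.0, 0.0], [0.0, 1.0]]

R = matsub(E, matmul(A, X))            # residual of the approximate inverse
rel_err = norm_inf(matsub(A_inv, X)) / norm_inf(A_inv)
assert rel_err <= norm_inf(R)          # relative error is bounded by ||R||
```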

The inversion of several important classes of matrices can be achieved by methods that are significantly more economical than in the general case. Such classes include Toeplitz, Hankel, banded (and in particular tri-diagonal) matrices, block matrices having a Toeplitz structure or the structure of Kronecker products, etc. For example, let $ T $ be a Toeplitz matrix of order $ n + 1 $ with entries from $ \mathbf R $ or $ \mathbf C $:

$$ T = \left \| \begin{array}{cccc} t_{0} &t_{1} &\dots &t_{n} \\ t_{-1} &t_{0} &\dots &t_{n-1} \\ \vdots &\vdots &\ddots &\vdots \\ t_{-n} &t_{-n+1} &\dots &t_{0} \\ \end{array} \right \| . $$

Assume that not only $ T $, but also its principal submatrices of order $ n $, are non-singular. Then the matrix $ T^{-1} $, which in general is no longer Toeplitz, has the representation (see [2]):

$$ \tag{2} T^{-1} = \frac{1}{p_{n}} \left \| \begin{array}{cccc} 1 & 0 &\dots & 0 \\ b_{1} & 1 & &\vdots \\ \vdots & &\ddots & 0 \\ b_{n} &\dots &b_{1} & 1 \\ \end{array} \right \| \cdot \left \| \begin{array}{cccc} 1 &a_{1} &\dots &a_{n} \\ 0 & 1 & &\vdots \\ \vdots & &\ddots &a_{1} \\ 0 &\dots & 0 & 1 \\ \end{array} \right \| - \frac{1}{p_{n}} \left \| \begin{array}{cccc} 0 &\dots &\dots & 0 \\ a_{n} & 0 & &\vdots \\ \vdots &\ddots &\ddots &\vdots \\ a_{1} &\dots &a_{n} & 0 \\ \end{array} \right \| \cdot \left \| \begin{array}{cccc} 0 &b_{n} &\dots &b_{1} \\ \vdots & 0 &\ddots &\vdots \\ \vdots & &\ddots &b_{n} \\ 0 &\dots &\dots & 0 \\ \end{array} \right \| . $$

Here the vectors

$$ \frac{1}{p_{n}} ( 1 , b_{1} , \dots, b_{n} )^{T} \quad \textrm{ and } \quad \frac{1}{p_{n}} ( a_{n} , \dots, a_{1} , 1 )^{T} $$

are, respectively, the first and last columns of $ T^{-1} $. Thus, $ T^{-1} $ is completely determined by its first and last columns. From (2) all the elements of $ T^{-1} $ can be computed successively:

$$ \{ T^{-1} \}_{i+1,j+1} = \{ T^{-1} \}_{i,j} + \frac{1}{p_{n}} ( b_{i+1} a_{j+1} - a_{n-i} b_{n-j} ) . $$

This computation requires $ O ( n ^ {2} ) $ arithmetical operations.

In economical algorithms for the inversion of Toeplitz matrices (see [3], for example), the computation of $ a_{i} $, $ b_{j} $ and $ p_{n} $ is performed by recurrence formulas and also requires $ O ( n^{2} ) $ operations. The condition that the principal submatrices are non-singular can be relaxed, while still needing only $ O ( n^{2} ) $ arithmetical operations.
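The reconstruction of $ T^{-1} $ from the recurrence can be sketched in pure Python. By (2), the first row of $ T^{-1} $ is $ p_{n}^{-1} ( 1, a_{1}, \dots, a_{n} ) $ and the first column is $ p_{n}^{-1} ( 1, b_{1}, \dots, b_{n} )^{T} $; the recurrence then fills in the remaining entries. The sample values of $ p_{n} $, $ a $, $ b $ in the test were computed beforehand for the tridiagonal Toeplitz matrix with $ t_{0} = 2 $, $ t_{\pm 1} = 1 $, $ t_{\pm 2} = 0 $; in a real algorithm they come from the $ O ( n^{2} ) $ recurrence formulas (cf. [3]). Function name is illustrative:

```python
# Sketch: fill in T^{-1} of order n+1 from the recurrence
# {T^{-1}}_{i+1,j+1} = {T^{-1}}_{i,j} + (b_{i+1} a_{j+1} - a_{n-i} b_{n-j}) / p_n,
# seeded with the first row and first column obtained from (2).

def toeplitz_inverse_from_vectors(p_n, a, b):
    """Return T^{-1} given p_n, a = [a_1..a_n], b = [b_1..b_n]."""
    n = len(a)
    inv = [[0.0] * (n + 1) for _ in range(n + 1)]
    inv[0][0] = 1.0 / p_n
    for j in range(1, n + 1):
        inv[0][j] = a[j - 1] / p_n          # first row: (1/p_n)(1, a_1..a_n)
    for i in range(1, n + 1):
        inv[i][0] = b[i - 1] / p_n          # first column: (1/p_n)(1, b_1..b_n)
    for i in range(n):                      # the recurrence (0-based lists,
        for j in range(n):                  # so a[i] stands for a_{i+1})
            inv[i + 1][j + 1] = inv[i][j] + \
                (b[i] * a[j] - a[n - 1 - i] * b[n - 1 - j]) / p_n
    return inv
```

For that sample matrix the parameters are $ p_{n} = 4/3 $, $ a = b = ( -2/3 , 1/3 ) $, and the sketch reproduces the known inverse in $ O ( n^{2} ) $ operations.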

Matrix inversion is sometimes used to solve linear systems $ A x = b $ via the formula $ x = A^{-1} b $. For matrices of general form such a procedure makes little sense, since it increases the arithmetical work and loses numerical stability as compared with direct solution of the linear system. But for Toeplitz (and related) matrices the situation is different. As the representation (2) shows, the computation of $ T^{-1} b $ reduces to four multiplications of triangular Toeplitz matrices by vectors and a subtraction of vectors. There are economical algorithms for multiplying Toeplitz matrices by vectors, requiring (for order $ n $) $ O ( n \log n ) $ operations. Such asymptotic behaviour of the number of arithmetical operations is not yet attainable for the solution of Toeplitz systems (currently, the best methods for these require $ O ( n \log^{2} n ) $ operations). Hence, for the repeated solution of linear systems $ T x = b $ with the same Toeplitz matrix $ T $ and varying right-hand sides $ b $, a preliminary inversion of $ T $ seems reasonable.
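Applying $ T^{-1} $ to a vector via (2), without forming $ T^{-1} $ explicitly, might look as follows. The triangular Toeplitz products are written naively in $ O ( n^{2} ) $ here, but each one is a convolution and can be evaluated in $ O ( n \log n ) $ by the FFT. The function names are illustrative, and the test reuses the sample data $ p_{n} = 4/3 $, $ a = b = ( -2/3 , 1/3 ) $ for the tridiagonal Toeplitz matrix with $ t_{0} = 2 $, $ t_{\pm 1} = 1 $:

```python
# Sketch of x = T^{-1} c computed directly from the two-term
# representation (2): two triangular Toeplitz matrix-vector products
# per term, plus a subtraction.

def lower_toeplitz_matvec(col, x):
    """y_i = sum_{j <= i} col[i-j] * x[j]  (lower triangular Toeplitz)."""
    n = len(x)
    return [sum(col[i - j] * x[j] for j in range(i + 1)) for i in range(n)]

def upper_toeplitz_matvec(row, x):
    """y_i = sum_{j >= i} row[j-i] * x[j]  (upper triangular Toeplitz)."""
    n = len(x)
    return [sum(row[j - i] * x[j] for j in range(i, n)) for i in range(n)]

def apply_toeplitz_inverse(p_n, a, b, c):
    """Compute T^{-1} c using representation (2); c has length n + 1."""
    n = len(a)
    # First term: first column (1, b_1..b_n), first row (1, a_1..a_n).
    y1 = lower_toeplitz_matvec([1.0] + b,
                               upper_toeplitz_matvec([1.0] + a, c))
    # Second term: first column (0, a_n..a_1), first row (0, b_n..b_1).
    y2 = lower_toeplitz_matvec([0.0] + a[::-1],
                               upper_toeplitz_matvec([0.0] + b[::-1], c))
    return [(y1[i] - y2[i]) / p_n for i in range(n + 1)]
```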

Preliminary matrix inversion can be justified in the case of repeated solution of linear systems with the same matrix of general form on a computer with a large number of parallel processors. The reason is that, as compared to the multiplication of a matrix by a vector, direct methods for solving linear systems do not have such convenient parallelization.

In many cases (for example, in the quasi-Newton methods of mathematical programming) it is required to invert a matrix $ A $ that differs from a matrix $ B $ with known inverse $ B^{-1} $ by a matrix of rank 1 or (in the case of a symmetric matrix $ B $) by a symmetric matrix of rank 2. Such an update of the inverse matrix can be carried out in $ O ( n^{2} ) $ operations for matrices of order $ n $. The following formula (the Sherman–Morrison formula, see [4]) may serve as an example: if $ u $ and $ v $ are column vectors, then

$$ ( B + u v^{T} )^{-1} = B^{-1} - \frac{1}{\gamma} B^{-1} u v^{T} B^{-1} , $$

where $ \gamma = 1 + v^{T} B^{-1} u $ is assumed to be non-zero.
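A pure-Python sketch of the corresponding $ O ( n^{2} ) $ update (the function name is illustrative):

```python
# Sketch of the Sherman-Morrison rank-1 update of a known inverse:
# (B + u v^T)^{-1} = B^{-1} - (1/gamma) B^{-1} u v^T B^{-1},
# gamma = 1 + v^T B^{-1} u, computed in O(n^2) operations.

def sherman_morrison_update(B_inv, u, v):
    """Return (B + u v^T)^{-1} given B^{-1} (a list of rows)."""
    n = len(u)
    Bu = [sum(B_inv[i][k] * u[k] for k in range(n)) for i in range(n)]  # B^{-1} u
    vB = [sum(v[k] * B_inv[k][j] for k in range(n)) for j in range(n)]  # v^T B^{-1}
    gamma = 1.0 + sum(v[k] * Bu[k] for k in range(n))
    if gamma == 0.0:
        raise ValueError("B + u v^T is singular (gamma = 0)")
    return [[B_inv[i][j] - Bu[i] * vB[j] / gamma for j in range(n)]
            for i in range(n)]
```

For instance, updating $ B = E $ with $ u = ( 1, 0 )^{T} $, $ v = ( 0, 1 )^{T} $ yields the inverse of the matrix with rows $ ( 1, 1 ) $ and $ ( 0, 1 ) $, namely the matrix with rows $ ( 1, -1 ) $ and $ ( 0, 1 ) $.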

From the point of view of the theory of computational complexity, the problem of matrix inversion has complexity of the same order (on a sequential machine) as the problem of solving a linear system (provided certain natural conditions on the rate of growth of the complexity of both problems as their order increases are satisfied, see [5]). This complexity has order not exceeding $ n^{2.49} $.

References

[1] V.V. Voevodin, "Numerical methods of algebra" , Moscow (1966) (In Russian)
[2] I.C. [I.Ts. Gokhberg] Gohberg, I.A. Fel'dman, "Convolution equations and projection methods for their solution" , Transl. Math. Monogr. , 41 , Amer. Math. Soc. (1974) (Translated from Russian)
[3] W. Trench, "An algorithm for the inversion of finite Toeplitz matrices" J. Soc. Indust. Appl. Math. , 12 (1964) pp. 515–522
[4] D.K. Faddeev, V.N. Faddeeva, "Computational methods of linear algebra" , Freeman (1963) (Translated from Russian)
[5] A.V. Aho, J.E. Hopcroft, J.D. Ullman, "The design and analysis of computer algorithms" , Addison-Wesley (1974)
This article was adapted from an original article by Kh.D. Ikramov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.