Namespaces
Variants
Actions

Inversion of a matrix

From Encyclopedia of Mathematics
Jump to: navigation, search


An algorithm applicable for the numerical computation of an inverse matrix. As for the solution of linear systems, methods for numerical inversion can be subdivided into direct and iterative methods; however, iterative methods play a considerably smaller role here because of their laboriousness.

Most of the direct methods for the inversion of a matrix are based on the idea of decomposing the given matrix into easily invertible factors. If

$$ A = L _ {1} \dots L _ {k} $$

is such a decomposition, then

$$ A ^ {-} 1 = L _ {k} ^ {-} 1 \dots L _ {1} ^ {-} 1 . $$

The Jordan method (see [1]) is a typical representative of the direct methods (and one of the most generally used).

Let $ A $ be a non-singular matrix of order $ n $. The construction of the inverse matrix $ A ^ {-} 1 $ proceeds in $ n $ steps; the result of the $ k $- th step is a matrix $ A _ {k} $ whose first $ k $ columns are the same as the corresponding columns of the identity matrix. The passage from $ A _ {k} $( let $ A = A _ {0} $) to $ A _ {k+} 1 $ from a matrix point of view is equivalent to left multiplication of $ A $ by a matrix $ L _ {k+} 1 $ which differs from the identity matrix only in the $ ( k + 1 ) $- st column. The elements of this column are chosen in order to reduce the $ ( k + 1 ) $- st column of $ A _ {k} $ to the identity, and they have the form

$$ - \frac{a _ {1,k+} 1 ^ {(} k) }{a _ {k + 1 , k + 1 } ^ {(} k) } \dots - \frac{a _ {k , k + 1 } ^ {(} k) }{a _ {k + 1 , k + 1 } ^ {(} k) } , $$

$$ \frac{1}{a _ {k + 1 , k + 1 } ^ {(} k) } \dots - \frac{a _ {n , k + 1 } ^ {(} k) }{a _ {k + 1 , k + 1 } ^ {(} k) } . $$

From the relations

$$ A _ {k+} 1 = L _ {k+} 1 A _ {k} ,\ A _ {n} = E $$

it follows that

$$ L _ {n} \dots L _ {1} A = E $$

and

$$ \tag{1 } A ^ {-} 1 = L _ {n} \dots L _ {1} . $$

To obtain the factorization (1) for the inverse matrix $ A ^ {-} 1 $ requires about $ n ^ {3} /2 $ multiplications and about $ n ^ {3} /2 $ additions. Approximately the same number of operations is necessary to multiply out the matrices in (1) to obtain the explicit form for $ A ^ {-} 1 $. In many applications of matrix inversion the use of (1) is just as satisfactory as that of the explicit form. For example, the computation of the product $ A ^ {-} 1 b $, where $ b $ is a column vector, requires the same arithmetical work in both cases. The memory requirements when implemented on a computer are also the same.

In the above description of the Jordan method it was assumed for simplicity that all the elements $ a _ {k + 1 , k + 1 } ^ {(} k) $( called the pivots) were non-zero. In reality the Jordan method, like the methods of Gauss type for the solution of linear systems (cf. also Gauss method), is usually applied with one or another scheme for choosing the pivots. The use of such a scheme is equivalent to introducing additional factors in (1) which take account of the permutations of the rows and columns of the inverse matrix. The accuracy of the computed solution, as in the case of linear systems, depends on the rate of growth of the matrix entries in the intermediate steps of the method. Such growth, with subsequent deterioration of the accuracy in the computed solution, is more probable in the Jordan method, even with choice of pivots, than in methods of Gauss type.

The matrix $ R = E - A X $ is called the residual corresponding to the approximate inverse $ X $ for $ A $. One has the estimate

$$ \frac{\| A ^ {-} 1 - X \| }{\| A ^ {-} 1 \| } \leq \| R \| . $$

Thus, the norm of the residual is a bound for the relative precision of the approximate inverse matrix. This is an important difference between the problem of numerical inversion of a matrix and the solution of linear systems, where (for example, in orthogonal methods or methods of Gauss type) the residual is usually small, and the quality of the solution obtained depends on the conditioning of the system.

The inversion of several important classes of matrices can be achieved by methods that are significantly more economical than in the general case. Such classes include Toeplitz, Hankel, banded (and in particular tri-diagonal) matrices, block matrices having a Toeplitz structure or the structure of Kronecker products, etc. For example, let $ T $ be a Toeplitz matrix of order $ n + 1 $ with entries from $ \mathbf R $ or $ \mathbf C $:

$$ T = \ \left \| \begin{array}{cccc} t _ {0} &t _ {1} &\dots &t _ {n} \\ t _ {-} 1 &t _ {0} &\dots &t _ {n-} 1 \\ \cdot &\cdot &\dots &\cdot \\ t _ {-} n &t _ {-} n+ 1 &\dots &t _ {0} \\ \end{array} \right \| . $$

Assume that not only $ T $, but also its principal submatrices of order $ n $ are non-singular. Then the matrix $ T ^ {-} 1 $, which, in general, is already not Toeplitz, has the representation (see [2]):

$$ \tag{2 } T ^ {-} 1 = $$

$$ = \ \frac{1}{p} _ {n} \left \| \begin{array}{cccc} 1 & 0 &\cdot & 0 \\ b _ {1} & 1 &{} &\cdot \\ \cdot &{} &{} &\cdot \\ b _ {n} &\cdot &b _ {1} & 1 \\ \end{array} \right \| \cdot \left \| \begin{array}{cccc} 1 &a _ {1} &\cdot &a _ {n} \\ 0 & 1 &{} &\cdot \\ \cdot &{} &{} &a _ {1} \\ 0 &\cdot & 0 & 1 \\ \end{array} \right \| + $$

$$ - \frac{1}{p} _ {n} \left \| \begin{array}{cccc} 0 &\dots &\cdot & 0 \\ a _ {n} &{} &{} &\cdot \\ \cdot &{} &{} &\cdot \\ a _ {1} &\dots &a _ {n} & 0 \\ \end{array} \right \| \cdot \left \| \begin{array}{cccc} 0 &b _ {n} &\dots &b _ {1} \\ \cdot & 0 &{} &\cdot \\ \cdot &{} &{} &b _ {n} \\ 0 & 0 &\dots & 0 \\ \end{array} \right \| . $$

Here the vectors

$$ \frac{1}{p} _ {n} ( 1 b _ {1} \dots b _ {n} ) ^ {T} \ \ \textrm{ and } \ \ \frac{1}{p} _ {n} ( a _ {n} \dots a _ {1} 1 ) ^ {T} $$

are, respectively, the first and last column of $ T ^ {-} 1 $. Thus, $ T $ is completely described by giving its first and last columns. From (2) all the elements of $ T ^ {-} 1 $ can be successively computed:

$$ \{ T ^ {-} 1 \} _ {i + 1 , j + 1 } = \ \{ T ^ {-} 1 \} _ {i,j} + \frac{1}{p _ {n} } ( b _ {i+} 1 a _ {j+} 1 - a _ {n-} i b _ {n-} j ) . $$

This computation requires $ O ( n ^ {2} ) $ arithmetical operations.

In economic algorithms for the inversion of Toeplitz matrices (see [3], for example) the computation of $ a _ {i} $, $ b _ {j} $ and $ p _ {n} $ is performed by recurrence formulas and also requires $ O ( n ^ {2} ) $ operations. The condition that the principal submatrices are non-singular can be relaxed, while still needing only $ O ( n ^ {2} ) $ arithmetical operations.

Matrix inversion is sometimes used in order to solve linear systems $ A x = b $ by the formula $ x = A ^ {-} 1 b $. For matrices of general form such a procedure makes little sense, since it is accompanied by an increase in arithmetical work and loss of numerical stability as compared to direct solution of the linear system. But for Toeplitz (and related) matrices the situation is different. As the representation (2) shows, the computation of $ T ^ {-} 1 b $ reduces to the performance of four multiplications of Toeplitz matrices by vectors and a subtraction of vectors. There are economic algorithms for the multiplication of Toeplitz matrices by vectors, requiring (for order $ n $) $ O ( n \mathop{\rm log} n ) $ operations. Such asymptotic behaviour of the number of arithmetical operations is not yet attainable for the solution of Toeplitz systems (currently, the best methods for these require $ O ( n \mathop{\rm log} ^ {2} n ) $ operations). Hence, for the repeated solution of linear systems $ T x = b $ with the same Toeplitz matrix $ T $ and varying right-hand sides $ b $, a preliminary inversion of $ T $ seems reasonable.

Preliminary matrix inversion can be justified in the case of repeated solution of linear systems with the same matrix of general form on a computer with a large number of parallel processors. The reason is that, as compared to the multiplication of a matrix by a vector, direct methods for solving linear systems do not have such convenient parallelization.

In many cases (for example in the quasi-Newton methods of mathematical programming) it is required to invert a matrix $ A $ that differs from a matrix $ B $ with known inverse $ B ^ {-} 1 $ by a matrix of rank 1 or (in the case of symmetric matrix $ B $) by a symmetric matrix of rank 2. Such a reconstruction of an inverse matrix can be carried out in $ O( n ^ {2} ) $ operations for matrices of order $ n $. The following formula may serve as an example (see [4]): If $ u $ and $ v $ are column vectors, then

$$ ( B + u v ^ {T} ) ^ {-} 1 = \ B ^ {-} 1 - \frac{1} \gamma B ^ {-} 1 u v ^ {T} B ^ {-} 1 , $$

where $ \gamma = 1 + v ^ {T} B ^ {-} 1 u $ is assumed to be non-zero.

From the point of view of the theory of computational complexity, the problem of matrix inversion has complexity of the same order (on a sequential machine) as the problem of solving a linear system (if certain natural conditions on the rate of growth of complexity of both problems as their order increases are satisfied [5]). This complexity has order not exceeding $ n ^ {0.49} $.

References

[1] V.V. Voevodin, "Numerical methods of algebra" , Moscow (1966) (In Russian)
[2] I.C. [I.Ts. Gokhberg] Gohberg, I.A. Feld'man, "Convolution equations and projection methods for their solution" , Transl. Math. Monogr. , 41 , Amer. Math. Soc. (1974) (Translated from Russian)
[3] W. Trench, "An algorithm for the inversion of finite Toeplitz matrices" Siam J. Control Optim. , 12 (1964) pp. 512–522
[4] D.K. Faddeev, V.N. Faddeeva, "Computational methods of linear algebra" , Freeman (1963) (Translated from Russian)
[5] A.V. Aho, J.E. Hopcroft, J.D. Ullman, "The design and analysis of computer algorithms" , Addison-Wesley (1974)
How to Cite This Entry:
Inversion of a matrix. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Inversion_of_a_matrix&oldid=47424
This article was adapted from an original article by Kh.D. Ikramov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article