Many problems in engineering and applied mathematics ultimately require the solution of $n \times n$ linear systems of equations. For small problems, there is often little to do except use one of the standard methods of solution, such as Gaussian elimination (cf. also [[Gauss method|Gauss method]]). However, in many applications, $n$ can be very large (<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d1202303.png"/>, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d1202304.png"/>) and, moreover, the linear equations may have to be solved over and over again, with different problem/model parameters, until a satisfactory solution to the original physical problem is obtained. In such cases, the $O ( n ^ { 3 } )$ burden, i.e., the number of flops required to solve an $n \times n$ linear system of equations, can become prohibitively large. This is one reason why one seeks in various classes of applications to identify special/characteristic matrix structures that may be assumed in order to reduce the computational burden.
 
The most obvious structures are those that involve explicit patterns among the matrix entries, such as Toeplitz, Hankel, Vandermonde, Cauchy, and Pick matrices. Several fast algorithms have been devised over the years to exploit these special structures. However, even more common than these explicit matrix structures are matrices in which the structure is implicit. For example, in certain least-squares problems one often encounters products of Toeplitz matrices; these products are generally not Toeplitz, but, on the other hand, they are not "unstructured". Similarly, in probabilistic calculations the matrix of interest is often not a Toeplitz matrix itself, but rather the inverse of one; such an inverse is rarely Toeplitz, but it is certainly not unstructured: its inverse is Toeplitz. It is well known that $O ( n ^ { 2 } )$ flops suffice to solve linear systems of equations with an $n \times n$ Toeplitz coefficient matrix; the question is whether one really needs $O ( n ^ { 3 } )$ flops to invert a non-Toeplitz coefficient matrix whose inverse is known to be Toeplitz. It is conceivable that $O ( n ^ { 2 } )$ flops should again suffice, and this is in fact true.
  
 
Such problems suggest the need for a quantitative way of defining and identifying structure in matrices. Over the years, starting with [[#References|[a1]]], it was found that an elegant and useful way to do so is the concept of displacement structure. This concept has also been useful for a host of problems apparently far removed from the solution of linear equations, such as the study of constrained and unconstrained rational interpolation, maximum entropy extension, signal detection, digital filter design, non-linear Riccati differential equations, inverse scattering, certain Fredholm integral equations, etc. (see [[#References|[a3]]], [[#References|[a2]]] and the many references therein).
  
For motivation, consider an $n \times n$ Hermitian [[Toeplitz matrix|Toeplitz matrix]] $T = ( c _ { i  - j} ) _ { i , j=0 } ^ { n - 1 } $, $c _ { i } = c _ { - i } ^ { * }$. Since such matrices are completely specified by $n$ entries, rather than $n ^ { 2 }$, one would of course expect a reduction in computational effort for handling problems involving such matrices. However, exploiting the Toeplitz structure is apparently more difficult than it may at first seem. To see this, consider the simple case of a real symmetric $3 \times 3$ Toeplitz matrix and apply the first step of the Gaussian elimination procedure to it, namely
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d12023018.png" /></td> </tr></table>
+
<table class="eq" style="width:100%;"> <tr><td style="width:94%;text-align:center;" valign="top"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d12023018.png"/></td> </tr></table>
  
where the so-called [[Schur complement]] matrix $\Delta$ is seen to be
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d12023020.png" /></td> </tr></table>
+
\begin{equation*} \Delta = \frac { 1 } { c_0 } \left( \begin{array} { c c c } { c_0^ { 2 } - c_ { 1 } ^ { 2 } } &amp; { \square } &amp; {  c  _ { 1 } c_{0} - c _ { 1 }  c _ { 2 } } \\ {  c _ { 1 } c  _ { 0 } - c  _ { 1 } c  _ { 2 } } &amp; { \square } &amp; { c  _ { 0 } ^ { 2 } -  c  _ { 2 } ^ { 2 } } \end{array} \right). \end{equation*}
  
However, $\Delta$ is no longer Toeplitz, so the special structure is lost in the very first step of the procedure. The fact is that what is preserved is not the Toeplitz structure, but a deeper notion called "displacement structure".
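A quick numerical check of this point (an illustrative NumPy sketch; the values $c _ { 0 } , c _ { 1 } , c _ { 2 }$ are made up):

<pre>
import numpy as np

c0, c1, c2 = 4.0, 2.0, 1.0                    # arbitrary values; T is positive definite
T = np.array([[c0, c1, c2],
              [c1, c0, c1],
              [c2, c1, c0]])

# Schur complement of the (0,0) entry after one step of Gaussian elimination
Delta = T[1:, 1:] - np.outer(T[1:, 0], T[0, 1:]) / T[0, 0]
print(Delta)   # [[3.   1.5 ]
               #  [1.5  3.75]]  -- unequal diagonal entries, so Delta is not Toeplitz
</pre>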
  
There are several forms of displacement structure, the earliest of which is the following [[#References|[a1]]]. Consider an $n \times n$ [[Hermitian matrix|Hermitian matrix]] $R$ and the $n \times n$ lower-triangular shift matrix $Z$ with ones on the first subdiagonal and zeros elsewhere (i.e., a lower-triangular Jordan block with eigenvalues at $0$). The displacement of $R$ with respect to $Z$, denoted by $\nabla _ { Z } R$, is defined as the difference
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d12023030.png" /></td> </tr></table>
+
\begin{equation*} \nabla _ { Z } R = R - Z R Z ^ { * }. \end{equation*}
  
The matrix $R$ is said to have displacement structure (or to be of low displacement rank) with respect to $Z$ if the rank of $\nabla _ { Z } R$ is considerably lower than (and independent of) $n$. For example, a Hermitian Toeplitz matrix has displacement rank $2$ with respect to $Z$.
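This is easy to confirm numerically; the following sketch (arbitrary data, real symmetric case) computes the rank of $\nabla _ { Z } T$ for a Toeplitz matrix $T$:

<pre>
import numpy as np

n = 6
c = np.array([5.0, 2.0, 1.0, 0.5, 0.25, 0.125])            # c_0, ..., c_{n-1}
T = np.array([[c[abs(i - j)] for j in range(n)] for i in range(n)])

Z = np.diag(np.ones(n - 1), k=-1)                          # ones on the first subdiagonal
print(np.linalg.matrix_rank(T - Z @ T @ Z.T))              # prints 2
</pre>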
  
More generally, let $r \ll n$ denote the rank of $\nabla _ { Z } R$. Then one can write $\nabla _ { Z } R$ as $\nabla _ { Z } R = G J G ^ { * }$, where $G$ is an $( n \times r )$-matrix and $J$ is a "signature" matrix of the form $J = ( I _ { p } \oplus - I _ { q } )$ with $p + q = r$. This representation is highly non-unique, since $G$ can be replaced by $G \Theta$ for any $J$-unitary matrix $\Theta$, i.e. for any $\Theta$ such that $\Theta J \Theta ^ { * } = J$; this flexibility is actually very useful. The matrix $G$ is said to be a generator matrix for $R$ since, along with $\{ Z , J \}$, it completely identifies $R$. If one labels the columns of $G$ as
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d12023058.png" /></td> </tr></table>
+
\begin{equation*} G = \left( \begin{array} { c c c c c c c } { x _ { 0 } } &amp; { \square \ldots \square} &amp; { x _ { p - 1 } } &amp; { y _ { 0 } } &amp; { \square \ldots  \square } &amp; { y _ { q - 1 } } \end{array} \right), \end{equation*}
  
and lets $L ( x )$ denote a lower-triangular Toeplitz matrix whose first column is $x$, then it can be seen that the unique $R$ that solves the displacement equation $\nabla _ { Z } R = G J G ^ { * }$ is given by
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d12023063.png" /></td> </tr></table>
+
\begin{equation*} R = \sum _ { i = 0 } ^ { n - 1 } Z ^ { i } G J G ^ { * } Z ^ { * i } = \end{equation*}
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d12023064.png" /></td> </tr></table>
+
\begin{equation*} = \sum _ { i = 0 } ^ { p - 1 } L ( x _ { i } ) L ^ { * } ( x _ { i } ) - \sum _ { i = 0 } ^ { q - 1 } L ( y _ { i } ) L ^ { * } ( y _ { i } ). \end{equation*}
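The sketch below (illustrative NumPy code with made-up generator columns, $p = q = 1$) assembles $R$ from such a representation and confirms the displacement equation; the helper $L$ builds a lower-triangular Toeplitz matrix as above:

<pre>
import numpy as np

def L(v):
    """Lower-triangular Toeplitz matrix whose first column is v."""
    m = len(v)
    M = np.zeros((m, m))
    for i in range(m):
        M[i:, i] = v[:m - i]
    return M

n = 5
rng = np.random.default_rng(0)
x, y = rng.standard_normal((2, n))             # made-up generator columns (p = q = 1)
G = np.column_stack([x, y])
J = np.diag([1.0, -1.0])

R = L(x) @ L(x).T - L(y) @ L(y).T              # R assembled from the generator
Z = np.diag(np.ones(n - 1), k=-1)
print(np.allclose(R - Z @ R @ Z.T, G @ J @ G.T))   # True
</pre>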
  
Such displacement representations of $R$ as a combination of products of lower- and upper-triangular Toeplitz matrices allow, for example, bilinear forms such as $x ^ { * } R y$ to be rapidly evaluated via convolutions (and hence fast Fourier transforms).
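Indeed, applying a factor $L ( x )$ to a vector is a truncated convolution, so $R v$ can be formed without ever assembling $R$, and a bilinear form $x ^ { * } R y$ then costs only one extra inner product. A sketch (made-up data, $p = q = 1$; plain convolutions for clarity, FFT-based convolution would give an $O ( n \log n )$ count):

<pre>
import numpy as np

n = 8
rng = np.random.default_rng(1)
x, y, v = rng.standard_normal((3, n))          # generator columns and a test vector

def lmul(w, u):                                # L(w) @ u as a truncated convolution
    return np.convolve(w, u)[:len(u)]

def ltmul(w, u):                               # L(w).T @ u as a correlation
    return np.convolve(w[::-1], u)[len(u) - 1:]

Rv = lmul(x, ltmul(x, v)) - lmul(y, ltmul(y, v))   # R v, without forming R

def L(w):                                      # explicit L(w), for checking only
    M = np.zeros((n, n))
    for i in range(n):
        M[i:, i] = w[:n - i]
    return M

print(np.allclose(Rv, (L(x) @ L(x).T - L(y) @ L(y).T) @ v))   # True
</pre>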
  
As mentioned above, a general Toeplitz matrix has displacement rank $2$, with in fact $p = 1$ and $q = 1$. But there are interesting non-Toeplitz matrices with $p = 1 = q$, for example the inverse of a Toeplitz matrix. In fact, this is a special case of the following fundamental result: If $R$ is invertible and satisfies $R - Z R Z ^ { * } = G J G ^ { * }$, then there exists a $\widetilde { H }$ such that $R ^ { - 1 } - Z ^ { * } R ^ { - 1 } Z = \widetilde { H } ^ { * } J \widetilde { H }$. The fact that $Z$ and $ Z ^ { * }$ are interchanged in the latter formula suggests that one can define a so-called "natural" inverse, $R ^ { - \# } = \tilde{I} R ^ { - 1 } \tilde{I}$, where $\tilde{I}$ is the "reverse" identity matrix (with ones on the anti-diagonal). For then one sees (with $H = \tilde{I} \widetilde { H } ^ { * }$) that
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d12023080.png" /></td> </tr></table>
+
\begin{equation*} R ^ { - \# } - Z R ^ { - \# } Z ^ { * } = H J H ^ { * }. \end{equation*}
  
Therefore, $R$ and $R ^ { - \# }$ have the same displacement rank and (when $R$ is Hermitian) the same displacement inertia (since $J$ is the same). (A real symmetric Toeplitz matrix $T$ has $T ^ { - 1 } = T ^ { - \# }$, so that $T ^ { - 1 }$ has a representation of the form $T ^ { - 1 } = L ( x ) L ^ { * } ( x ) - L ( y ) L ^ { * } ( y )$, which after suitably identifying $\{ x , y \}$ is a special so-called [[Gohberg–Semencul formula]].) The proof of the above fundamental result is very simple (see, e.g., [[#References|[a3]]]) and in fact holds with $Z$ replaced by any matrix $F$.
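Both statements are easy to check numerically; the sketch below (illustrative data, real symmetric case) does so for a Toeplitz matrix:

<pre>
import numpy as np

n = 6
c = np.array([5.0, 2.0, 1.0, 0.5, 0.25, 0.125])
T = np.array([[c[abs(i - j)] for j in range(n)] for i in range(n)])
Z = np.diag(np.ones(n - 1), k=-1)
Itilde = np.fliplr(np.eye(n))                                   # the "reverse" identity

Tinv = np.linalg.inv(T)
print(np.linalg.matrix_rank(Tinv - Z.T @ Tinv @ Z, tol=1e-8))   # 2: same displacement rank as T
print(np.allclose(Tinv, Itilde @ Tinv @ Itilde))                # True: T^{-1} is its own natural inverse
</pre>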
  
An interesting example is obtained by choosing $F = \operatorname { diag } \{ f _ { 0 } , \dots , f _ { n - 1 } \}$, $| f _ { i } | < 1$, when the solution $R$ of the displacement equation $R - F R F ^ { * } = G J G ^ { * }$ becomes a "Pick" matrix of the form
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d12023096.png" /></td> </tr></table>
+
\begin{equation*} P = \left( \frac { u _ { i } u _ { j } ^ { * } - v _ { i } v _ { j } ^ { * } } { 1 - f _ { i } f _ { j } ^ { * } } \right) _ { i , j = 0 } ^ { n - 1 }, \end{equation*}
  
where $\{ u_ { i } , v _ { i } \}$ are $1 \times p$ and $1 \times q$ vectors. Pick matrices arise in solving analytic interpolation problems and the displacement theory gives new and efficient computational algorithms for solving a variety of such problems (see, e.g., [[#References|[a2]]] and [[Nevanlinna–Pick interpolation|Nevanlinna–Pick interpolation]]).
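The following sketch (made-up points $f _ { i }$ in the unit disc and made-up row vectors, $p = q = 1$) builds such a Pick matrix and verifies the displacement equation:

<pre>
import numpy as np

n, p, q = 4, 1, 1
f = np.array([0.1, -0.3, 0.5, 0.7])                        # points with |f_i| < 1
rng = np.random.default_rng(2)
u = rng.standard_normal((n, p))                            # 1 x p row vectors u_i
v = rng.standard_normal((n, q))                            # 1 x q row vectors v_i

P = np.array([[(u[i] @ u[j] - v[i] @ v[j]) / (1 - f[i] * f[j])
               for j in range(n)] for i in range(n)])

F = np.diag(f)
G = np.hstack([u, v])
J = np.diag([1.0] * p + [-1.0] * q)
print(np.allclose(P - F @ P @ F.T, G @ J @ G.T))           # True
</pre>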
  
One can handle non-Hermitian structured matrices by using displacement operators of the form $\nabla _ { F, A} R = R - F R A ^ { * }$. When $F$ is diagonal with distinct entries and $A = Z$, $R$ becomes a Vandermonde matrix. A closely related displacement operator, first introduced in [[#References|[a4]]], has the form $F R - R A ^ { * }$. Choosing $F = \operatorname { diag } \{ f _ { i } \}$, $A = \operatorname { diag } \{ a _ { i } \}$ leads to Cauchy-like solutions of the form
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230107.png" /></td> </tr></table>
+
\begin{equation*} C _ { l } = \left( \frac { u _ { i } v _ { j } ^ { * } } { f _ { i } - a _ { j } ^ { * } } \right) , u _ { i } , v _ { i } \in \mathcal C ^ { 1 \times r }. \end{equation*}
  
The name comes from the fact that when $r = 1$ and $u _ { 0 } = 1 = v _ { 0 }$, $C_l$ is a [[Cauchy matrix|Cauchy matrix]]. When $r = 2$, and $u _ { i } = \left( \beta _ { i } \quad 1 \right) $ and $v _ { i } = \left( 1 \quad - k _ { i } \right)$, one gets the so-called Loewner matrix.
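A short check of the Cauchy-like case (illustrative sketch with made-up nodes and vectors, $r = 1$):

<pre>
import numpy as np

f = np.array([0.1, 0.2, 0.3, 0.4])                         # distinct f_i
a = np.array([1.1, 1.2, 1.3, 1.4])                         # distinct a_j, disjoint from the f_i
u = np.array([1.0, 2.0, -1.0, 0.5])
v = np.array([0.7, -0.2, 1.5, 1.0])

C = np.array([[u[i] * v[j] / (f[i] - a[j]) for j in range(4)] for i in range(4)])
F, A = np.diag(f), np.diag(a)
print(np.allclose(F @ C - C @ A, np.outer(u, v)))          # F C - C A^* = u v^*  (real data): True
</pre>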
  
 
It is often convenient to use generating-function language, and to define
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230114.png" /></td> </tr></table>
+
\begin{equation*} R ( z , w ) = \sum _ { i , j = 0 } ^ { \infty } R _ { ij } z ^ { i } w ^ { * j }. \end{equation*}
  
 
(Finite structured matrices can be extended to be semi-infinite in many natural ways, e.g. by extending the generator matrix by adding additional rows.) In this language, one can introduce general displacement equations of the form
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230115.png" /></td> </tr></table>
+
\begin{equation*} d ( z , w ) R ( z , w ) = G ( z ) J G ^ { * } ( w ), \end{equation*}
  
 
with
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230116.png" /></td> </tr></table>
+
\begin{equation*} d ( z , w ) = \sum _ { i , j = 0 } ^ { \infty } d _ { i j } z ^ { i } w ^ { * j }. \end{equation*}
  
Choosing $d ( z , w ) = 1 - z w ^ { * }$ corresponds to matrix displacement equations of the form (in an obvious notation), $R - Z R Z ^ { * } = G J G ^ { * }$, while $d ( z , w ) = ( z - w ^ { * } )$ corresponds to $Z R - R Z ^ { * } = G J G ^ { * }$. There are many other useful choices, but to enable recursive matrix factorizations it is necessary to restrict $d ( z , w )$ to the form $d ( z , w ) = \alpha ( z ) \alpha ^ { * } ( w ) - \beta ( z ) \beta ^ { * } ( w )$. Note, in particular, that for $R = I$ one gets $R ( z , w ) = 1 / ( 1 - z w ^ { * } )$, the Szegö kernel. Other choices lead to the various so-called de Branges and Bergman kernels. See [[#References|[a5]]] and the references therein.
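To see the first of these correspondences concretely, note that multiplying $R ( z , w )$ by $z w ^ { * }$ shifts both summation indices by one, so that

\begin{equation*} ( 1 - z w ^ { * } ) R ( z , w ) = \sum _ { i , j = 0 } ^ { \infty } ( R _ { i j } - R _ { i - 1 , j - 1 } ) z ^ { i } w ^ { * j } \end{equation*}

(with the convention $R _ { i j } = 0$ when $i < 0$ or $j < 0$), which is exactly the generating function of $R - Z R Z ^ { * }$.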
  
A central feature in many of the applications mentioned earlier is the ability to efficiently obtain the so-called triangular LDU-factorization of a matrix and of its inverse (cf. also [[Matrix factorization|Matrix factorization]]). Among purely matrix computations, one may mention that this enables fast determination of the so-called QR-decomposition and the factorization of composite matrices such as $T _ { 1 } T _ { 2 } ^ { - 1 } T _ { 3 }$, $T _ { i }$ being Toeplitz. The LDU-factorization of structured matrices is facilitated by the fact that Schur complements inherit displacement structure. For example, writing
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230127.png" /></td> </tr></table>
+
\begin{equation*} R = \left( \begin{array} { l l } { R _ { 11 } } &amp; { R _ { 12 } } \\ { R _ { 21 } } &amp; { R _ { 22 } } \end{array} \right), F = \left( \begin{array} { l l } { F _ { 1 } } &amp; { 0 } \\ { F _ { 2 } } &amp; { F _ { 3 } } \end{array} \right), \end{equation*}
  
 
it turns out that
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230128.png" /></td> </tr></table>
+
\begin{equation*} \operatorname { rank } ( S - F _ { 3 } S F _ { 3 } ^ { * } ) \leq \operatorname { rank } ( R - F R F ^ { * } ), \end{equation*}
  
where $S = R _ { 22 } - R _ { 21 } R _ { 11 } ^ { - 1 } R _ { 12 }$. By properly defining the $\{ R_{ij} \}$ and $\{ F _ { i } \}$, this allows one to find the displacement structure of various composite matrices. For example, choosing $R _ { 11 } = - T$, $R _ { 12 } = I = R _ { 21 }$, $R _ { 22 } = 0$, and $F = Z \oplus Z$, gives the previously mentioned result on inverses of structured matrices.
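This inheritance, and the particular embedding just described, can be checked numerically; in the sketch below (made-up symmetric Toeplitz data) the Schur complement is $S = T ^ { - 1 }$ and both displacement ranks turn out to equal $2$:

<pre>
import numpy as np

n = 5
c = np.array([5.0, 2.0, 1.0, 0.5, 0.25])
T = np.array([[c[abs(i - j)] for j in range(n)] for i in range(n)])
Z = np.diag(np.ones(n - 1), k=-1)
Idn = np.eye(n)
O = np.zeros((n, n))

R = np.block([[-T, Idn], [Idn, O]])                        # R_11 = -T, R_12 = R_21 = I, R_22 = 0
F = np.block([[Z, O], [O, Z]])                             # F = Z (+) Z, so F_3 = Z

S = O - Idn @ np.linalg.inv(-T) @ Idn                      # Schur complement: S = T^{-1}
lhs = np.linalg.matrix_rank(S - Z @ S @ Z.T, tol=1e-8)
rhs = np.linalg.matrix_rank(R - F @ R @ F.T, tol=1e-8)
print(lhs, rhs, lhs <= rhs)                                # 2 2 True
</pre>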
  
Computations on structured matrices are made efficient by working not with the $n ^ { 2 }$ entries of $R$, but with the $n r$ entries of the generator matrix $G$. The basic triangular factorization algorithm is Gaussian elimination, which, as noted in the first calculation, amounts to finding a sequence of Schur complements. Incorporating structure into the Gaussian elimination procedure was, in retrospect, first done (in the special case $r = 2$) by I. Schur himself in a remarkable 1917 paper dealing with the apparently very different problem of checking when a power series is bounded in the unit disc.
  
The algorithm below is but one generalization of Schur's original recursion. It provides an efficient $O ( n ^ { 2 } )$ procedure for the computation of the triangular factors of a Hermitian positive-definite matrix $R$ satisfying
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230143.png" /></td> </tr></table>
+
\begin{equation*} R - Z R Z ^ { * } = G J G ^ { * } , G \in \mathcal{C} ^ { n \times r }, \end{equation*}
  
with $J = ( I _ { p } \oplus - I _ { q } )$. So, let $R = L D ^ { - 1 } L ^ { * }$ denote the triangular decomposition of $R$, where $D = \operatorname { diag } \{ d _ { 0 } , \dots , d _ { n - 1 } \}$, and the lower triangular factor $L$ is normalized in such a way that the $\{ d _ { i } \}$ appear on its main diagonal. The non-zero part of the consecutive columns of $L$ will be further denoted by $l_i$. Then it holds that the successive Schur complements of $R$ with respect to its leading $( i \times i )$-submatrices, denoted by $R_i$, are also structured and satisfy
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230155.png" /></td> </tr></table>
+
\begin{equation*} R _ { i } - Z _ { i } R _ { i } Z _ { i } ^ { * } = G _ { i } J G _ { i } ^ { * }, \end{equation*}
  
with $( n - i ) \times ( n - i )$ lower-triangular shift matrices $Z_{i}$, and where the generator matrices $G_i$ are obtained by the following recursive construction:
  
start with $G _ { 0 } = G$;
  
repeat for $i \geq 0$:
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230161.png" /></td> </tr></table>
+
\begin{equation*} \left( \begin{array} { c } { 0 } \\ { G _ { i + 1 } } \end{array} \right) = \left\{ G _ { i } + ( Z _ { i } - I ) G _ { i } \frac { J g _ { i } ^ { * } g _ { i } } { g _ { i } J g _ { i } ^ { * } } \right\} \Theta _ { i }, \end{equation*}
  
where $\Theta_i$ is an arbitrary $J$-unitary matrix, and $g_i$ denotes the top row of $G_i$. Then
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230166.png" /></td> </tr></table>
+
<table class="eq" style="width:100%;"> <tr><td style="width:94%;text-align:center;" valign="top"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230166.png"/></td> </tr></table>
  
The degree of freedom in choosing $\Theta_i$ is often very useful. One particular choice leads to the so-called proper form of the generator recursion. Let $\Theta_i$ reduce $g_i$ to the form
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230170.png" /></td> </tr></table>
+
\begin{equation*} g _ { i }  \Theta _ { i } = \left( \begin{array} { l l l } { \delta _ { i } } &amp; { 0 } &amp; { \ldots } &amp; { 0 } \end{array} \right), \end{equation*}
  
 
with a non-zero scalar entry in the leading position. Then
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230171.png" /></td> </tr></table>
+
\begin{equation*} \left( \begin{array} { c } { 0 } \\ { G _ { i + 1 } } \end{array} \right) = Z _ { i } G _ { i } \Theta _ { i } \left( \begin{array} { c c } { 1 } &amp; { 0 } \\ { 0 } &amp; { 0 } \end{array} \right) + G _ { i } \Theta _ { i } \left( \begin{array} { c c } { 0 } &amp; { 0 } \\ { 0 } &amp; { I _ { p + q - 1 } } \end{array} \right), \end{equation*}
  
 
with
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/d/d120/d120230/d120230172.png" /></td> </tr></table>
+
\begin{equation*} l _ { i } = \delta _ { i } ^ { * } G _ { i } \Theta _ { i } \left( \begin{array} { c } { 1 } \\ { 0 } \end{array} \right) , d _ { i } = | \delta _ { i } | ^ { 2 }. \end{equation*}
  
In words, this shows that $G _ { i+1 } $ can be obtained from $G_i$ as follows:
  
1) reduce $g_i$ to proper form;
  
2) multiply $G_i$ by $\Theta_i$ and keep the last columns of $ { G } _ { i } \Theta _ { i }$ unaltered;
  
3) shift down the first column of $ { G } _ { i } \Theta _ { i }$ by one position.
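The recursion translates directly into a short program. The sketch below is a didactic NumPy rendition (the function name and the data are made up; $\Theta _ { i } = I$ throughout, real arithmetic, no pivoting or other stabilization) that factors a symmetric positive-definite Toeplitz matrix from a rank-$2$ generator:

<pre>
import numpy as np

def generalized_schur(G, J):
    """Given a generator of a positive-definite R, i.e. R - Z R Z^* = G J G^*,
    return L and d with R = L diag(d)^{-1} L^*.  Didactic version: Theta_i = I,
    real arithmetic, no stabilization."""
    n, r = G.shape
    L = np.zeros((n, n))
    d = np.zeros(n)
    Gi = G.copy()
    for i in range(n):
        g = Gi[0:1, :]                          # g_i: top row of the current generator
        d[i] = (g @ J @ g.T).item()             # d_i = g_i J g_i^*
        l = (Gi @ J @ g.T).ravel()              # l_i = G_i J g_i^* (non-zero part of column i of L)
        L[i:, i] = l
        # [0; G_{i+1}] = G_i + (Z_i - I) G_i (J g_i^* g_i) / (g_i J g_i^*)
        corr = np.outer(l, g.ravel()) / d[i]
        shifted = np.vstack([np.zeros((1, r)), corr[:-1]])   # Z_i applied to the correction
        Gi = (Gi - corr + shifted)[1:]          # the top row is now zero; drop it
    return L, d

# Example: a symmetric positive-definite Toeplitz matrix (arbitrary values)
n = 5
c = np.array([5.0, 2.0, 1.0, 0.5, 0.25])
T = np.array([[c[abs(i - j)] for j in range(n)] for i in range(n)])

# A standard rank-2 generator for T:  T - Z T Z^T = G J G^T  with p = q = 1
x = c / np.sqrt(c[0])
y = x.copy(); y[0] = 0.0
G = np.column_stack([x, y])
J = np.diag([1.0, -1.0])

L, d = generalized_schur(G, J)
print(np.allclose(T, L @ np.diag(1.0 / d) @ L.T))           # True
</pre>

The proper-form variant replaces the correction step by an explicit $J$-unitary (hyperbolic) rotation $\Theta _ { i }$ acting on the columns of $G _ { i }$, which is what steps 1)–3) above describe.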
  
Extensions of the algorithm to more general structured matrices are possible (see, e.g., [[#References|[a2]]], [[#References|[a3]]]). In addition, the algorithm can be extended to provide simultaneous factorizations of both a matrix $R$ and its inverse, $R ^ { - 1 }$; see, e.g., [[#References|[a3]]].
  
An issue that arises in the study of such fast factorization algorithms is their numerical stability in finite-precision implementations. It was mentioned earlier that the generalized Schur algorithm amounts to combining Gaussian elimination with structure, and it is well-known that Gaussian elimination in its purest form is numerically unstable (meaning that the error in the factorization $R \approx \hat { L } \hat { D } ^ { - 1 } \hat { L } ^ { * }$ can be quite large, where $\hat{L}$ and $\hat{D}$ denote the computed $L$ and $D$, respectively). The instability can often be controlled by resorting to pivoting techniques, i.e., by permuting the order of the rows, and perhaps columns, of the matrices before the Gaussian elimination steps. However, pivoting can destroy matrix structure and can thus lead to a loss in computational efficiency. It was observed in [[#References|[a6]]], though, that for diagonal displacement operators $F$, and more specifically for Cauchy-like matrices, partial pivoting does not destroy matrix structure. This fact was exploited in [[#References|[a7]]] in the context of the generalized Schur algorithm and applied to other structured matrices. This was achieved by showing how to transform different kinds of matrix structure into Cauchy-like structure.
  
 
Partial pivoting by itself is not sufficient to guarantee numerical stability even for slow algorithms. Moreover, the above transformation-and-pivoting technique is only efficient for fixed-order problems, since the transformations have to be repeated afresh whenever the order changes. Another approach to the numerical stability of the generalized Schur algorithm is to examine the steps of the algorithm directly and to stabilize them without resorting to transformations among matrix structures. This was done in [[#References|[a8]]], where it was shown, in particular, how to implement the hyperbolic rotations in a reliable manner. For all practical purposes, the main conclusion of [[#References|[a8]]] is that the generalized Schur algorithm, with certain modifications, is backward stable for a large class of structured matrices.
  
 
====References====
 
====References====
<table><TR><TD valign="top">[a1]</TD> <TD valign="top">  T. Kailath,  S.Y. Kung,  M. Morf,  "Displacement ranks of a matrix"  ''Bull. Amer. Math. Soc.'' , '''1''' :  5  (1979)  pp. 769–773</TD></TR><TR><TD valign="top">[a2]</TD> <TD valign="top">  T. Kailath,  A.H. Sayed,  "Displacement structure: Theory and applications"  ''SIAM Review'' , '''37'''  (1995)  pp. 297–386</TD></TR><TR><TD valign="top">[a3]</TD> <TD valign="top">  T. Kailath,  "Displacement structure and array algorithms"  T. Kailath (ed.)  A.H. Sayed (ed.) , ''Fast Reliable Algorithms for Matrices with Structure'' , SIAM (Soc. Industrial Applied Math.)  (1999)</TD></TR><TR><TD valign="top">[a4]</TD> <TD valign="top">  G. Heinig,  K. Rost,  "Algebraic methods for Toeplitz-like matrices and operators" , Akad.  (1984)</TD></TR><TR><TD valign="top">[a5]</TD> <TD valign="top">  H. Lev-Ari,  T. Kailath,  "Triangular factorization of structured Hermitian matrices"  I. Gohberg (ed.)  et al. (ed.) , ''Schur Methods in Operator Theory and Signal Processing'' , Birkhäuser  (1986)  pp. 301–324</TD></TR><TR><TD valign="top">[a6]</TD> <TD valign="top">  G. Heinig,  "Inversion of generalized Cauchy matrices and other classes of structured matrices"  A. Bojanczyk (ed.)  G. Cybenko (ed.) , ''Linear Algebra for Signal Processing'' , ''IMA volumes in Mathematics and its Applications'' , '''69'''  (1994)  pp. 95–114</TD></TR><TR><TD valign="top">[a7]</TD> <TD valign="top">  I. Gohberg,  T. Kailath,  V. Olshevsky,  "Fast Gaussian elimination with partial pivoting for matrices with displacement structure"  ''Math. Comput.'' , '''64'''  (1995)  pp. 1557–1576</TD></TR><TR><TD valign="top">[a8]</TD> <TD valign="top">  S. Chandrasekaran,  A.H. Sayed,  "Stabilizing the generalized Schur algorithm"  ''SIAM J. Matrix Anal. Appl.'' , '''17'''  (1996)  pp. 950–983</TD></TR></table>
+
<table><tr><td valign="top">[a1]</td> <td valign="top">  T. Kailath,  S.Y. Kung,  M. Morf,  "Displacement ranks of a matrix"  ''Bull. Amer. Math. Soc.'' , '''1''' :  5  (1979)  pp. 769–773</td></tr><tr><td valign="top">[a2]</td> <td valign="top">  T. Kailath,  A.H. Sayed,  "Displacement structure: Theory and applications"  ''SIAM Review'' , '''37'''  (1995)  pp. 297–386</td></tr><tr><td valign="top">[a3]</td> <td valign="top">  T. Kailath,  "Displacement structure and array algorithms"  T. Kailath (ed.)  A.H. Sayed (ed.) , ''Fast Reliable Algorithms for Matrices with Structure'' , SIAM (Soc. Industrial Applied Math.)  (1999)</td></tr><tr><td valign="top">[a4]</td> <td valign="top">  G. Heinig,  K. Rost,  "Algebraic methods for Toeplitz-like matrices and operators" , Akad.  (1984)</td></tr><tr><td valign="top">[a5]</td> <td valign="top">  H. Lev-Ari,  T. Kailath,  "Triangular factorization of structured Hermitian matrices"  I. Gohberg (ed.)  et al. (ed.) , ''Schur Methods in Operator Theory and Signal Processing'' , Birkhäuser  (1986)  pp. 301–324</td></tr><tr><td valign="top">[a6]</td> <td valign="top">  G. Heinig,  "Inversion of generalized Cauchy matrices and other classes of structured matrices"  A. Bojanczyk (ed.)  G. Cybenko (ed.) , ''Linear Algebra for Signal Processing'' , ''IMA volumes in Mathematics and its Applications'' , '''69'''  (1994)  pp. 95–114</td></tr><tr><td valign="top">[a7]</td> <td valign="top">  I. Gohberg,  T. Kailath,  V. Olshevsky,  "Fast Gaussian elimination with partial pivoting for matrices with displacement structure"  ''Math. Comput.'' , '''64'''  (1995)  pp. 1557–1576</td></tr><tr><td valign="top">[a8]</td> <td valign="top">  S. Chandrasekaran,  A.H. Sayed,  "Stabilizing the generalized Schur algorithm"  ''SIAM J. Matrix Anal. Appl.'' , '''17'''  (1996)  pp. 950–983</td></tr></table>

Latest revision as of 20:20, 13 January 2021

Many problems in engineering and applied mathematics ultimately require the solution of $n \times n$ linear systems of equations. For small-size problems, there is often not much else to do except to use one of the already standard methods of solution, such as Gaussian elimination (cf. also Gauss method). However, in many applications, $n$ can be very large and, moreover, the linear equations may have to be solved over and over again, with different problem/model parameters, till a satisfactory solution to the original physical problem is obtained. In such cases, the $O ( n ^ { 3 } )$ burden, i.e. the number of flops required to solve an $n \times n$ linear system of equations, can become prohibitively large. This is one reason why one seeks in various classes of applications to identify special/characteristic matrix structures that may be assumed in order to reduce the computational burden.

The most obvious structures are those that involve explicit patterns among the matrix entries, such as Toeplitz, Hankel, Vandermonde, Cauchy, and Pick matrices. Several fast algorithms have been devised over the years to exploit these special structures. However, even more common than these explicit matrix structures are matrices in which the structure is implicit. For example, in certain least-squares problems one often encounters products of Toeplitz matrices; these products are not generally Toeplitz, but, on the other hand, they are not "unstructured". Similarly, in probabilistic calculations the matrix of interest is often not a Toeplitz matrix, but rather its inverse, which is rarely Toeplitz itself, but of course is not unstructured: its inverse is Toeplitz. It is well-known that $O ( n ^ { 2 } )$ flops suffice to solve linear systems of equations with an $n \times n$ Toeplitz coefficient matrix; a natural question is whether one needs $O ( n ^ { 3 } )$ flops to invert a non-Toeplitz coefficient matrix whose inverse is known to be Toeplitz. When pressed, one's response clearly must be that it is conceivable that $O ( n ^ { 2 } )$ flops will suffice, and this is in fact true.

Such problems suggest the need for a quantitative way of defining and identifying structure in matrices. Over the years, starting with [a1], it was found that an elegant and useful way to do so is the concept of displacement structure. This concept has also been useful for a host of problems apparently far removed from the solution of linear equations, such as the study of constrained and unconstrained rational interpolation, maximum entropy extension, signal detection, digital filter design, non-linear Riccati differential equations, inverse scattering, certain Fredholm integral equations, etc. (see [a3], [a2] and the many references therein).

For motivation, consider an $n \times n$ Hermitian Toeplitz matrix $T = ( c _ { i - j} ) _ { i , j=0 } ^ { n - 1 } $, $c _ { i } = c _ { - i } ^ { * }$. Since such matrices are completely specified by $n$ entries, rather than $n ^ { 2 }$, one would of course expect a reduction in computational effort for handling problems involving such matrices. However, exploiting the Toeplitz structure is apparently more difficult than it may at first seem. To see this, consider the simple case of a real symmetric $3 \times 3$ Toeplitz matrix and apply the first step of the Gaussian elimination procedure to it, namely

\begin{equation*} \left( \begin{array} { c c c } { c _ { 0 } } & { c _ { 1 } } & { c _ { 2 } } \\ { c _ { 1 } } & { c _ { 0 } } & { c _ { 1 } } \\ { c _ { 2 } } & { c _ { 1 } } & { c _ { 0 } } \end{array} \right) - \frac { 1 } { c _ { 0 } } \left( \begin{array} { c } { c _ { 0 } } \\ { c _ { 1 } } \\ { c _ { 2 } } \end{array} \right) \left( \begin{array} { c c c } { c _ { 0 } } & { c _ { 1 } } & { c _ { 2 } } \end{array} \right) = \left( \begin{array} { c c } { 0 } & { 0 } \\ { 0 } & { \Delta } \end{array} \right), \end{equation*}

where the so-called Schur complement matrix $\Delta$ is seen to be

\begin{equation*} \Delta = \frac { 1 } { c _ { 0 } } \left( \begin{array} { c c } { c _ { 0 } ^ { 2 } - c _ { 1 } ^ { 2 } } & { c _ { 1 } c _ { 0 } - c _ { 1 } c _ { 2 } } \\ { c _ { 1 } c _ { 0 } - c _ { 1 } c _ { 2 } } & { c _ { 0 } ^ { 2 } - c _ { 2 } ^ { 2 } } \end{array} \right). \end{equation*}

However, $\Delta$ is no longer Toeplitz, so the special structure is lost in the very first step of the procedure. The fact is that what is preserved is not the Toeplitz structure, but a deeper notion called "displacement structure" .
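
This can be checked numerically. The following minimal sketch (the particular values of $c_0, c_1, c_2$ are assumed purely for illustration) reproduces the closed-form expression for $\Delta$ and confirms that $\Delta$ is not Toeplitz:

<pre>
import numpy as np

# Illustrative (assumed) values; any symmetric positive-definite choice works.
c0, c1, c2 = 4.0, 2.0, 1.0
T = np.array([[c0, c1, c2],
              [c1, c0, c1],
              [c2, c1, c0]])

# First step of Gaussian elimination: subtract the rank-one term built from
# the first column, leaving the 2 x 2 Schur complement.
Delta = T[1:, 1:] - np.outer(T[1:, 0], T[0, 1:]) / c0

# Closed-form expression quoted in the text.
Delta_closed = np.array([[c0**2 - c1**2, c1*c0 - c1*c2],
                         [c1*c0 - c1*c2, c0**2 - c2**2]]) / c0

print(np.allclose(Delta, Delta_closed))   # True
print(Delta[0, 0] == Delta[1, 1])         # False: unequal diagonal entries, so not Toeplitz
</pre>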

There are several forms of displacement structure, the earliest of which is the following [a1]. Consider an $n \times n$ Hermitian matrix $R$ and the $n \times n$ lower-triangular shift matrix $Z$ with ones on the first subdiagonal and zeros elsewhere (i.e., a lower-triangular Jordan block with eigenvalues at $0$). The displacement of $R$ with respect to $Z$, denoted by $\nabla _ { Z } R$, is defined as the difference

\begin{equation*} \nabla _ { Z } R = R - Z R Z ^ { * }. \end{equation*}

The matrix $R$ is said to have displacement structure (or to be of low displacement rank) with respect to $Z$ if the rank of $\nabla _ { Z } R$ is considerably lower than (and independent of) $n$. For example, a Hermitian Toeplitz matrix has displacement rank $2$ with respect to $Z$.
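
To see where the rank-$2$ claim comes from, note that $Z T Z ^ { * }$ reproduces a Toeplitz matrix $T = ( c _ { i - j } )$ shifted one position down the main diagonal, with zeros in the first row and column; since $T$ is Toeplitz, all other entries cancel in the difference:

\begin{equation*} \nabla _ { Z } T = T - Z T Z ^ { * } = \left( \begin{array} { c c c c } { c _ { 0 } } & { c _ { 1 } ^ { * } } & { \cdots } & { c _ { n - 1 } ^ { * } } \\ { c _ { 1 } } & { 0 } & { \cdots } & { 0 } \\ { \vdots } & { \vdots } & { } & { \vdots } \\ { c _ { n - 1 } } & { 0 } & { \cdots } & { 0 } \end{array} \right), \end{equation*}

a matrix of rank at most $2$.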

More generally, let $r \ll n$ denote the rank of $\nabla _ { Z } R$. Then one can write $\nabla _ { Z } R$ as $\nabla _ { Z } R = G J G ^ { * }$, where $G$ is an $( n \times r )$-matrix and $J$ is a "signature" matrix of the form $J = ( I _ { p } \oplus - I _ { q } )$ with $p + q = r$. This representation is highly non-unique, since $G$ can be replaced by $G \Theta$ for any $J$-unitary matrix $\Theta$, i.e. for any $\Theta$ such that $\Theta J \Theta ^ { * } = J$; this flexibility is actually very useful. The matrix $G$ is said to be a generator matrix for $R$ since, along with $\{ Z , J \}$, it completely identifies $R$. If one labels the columns of $G$ as

\begin{equation*} G = \left( \begin{array} { c c c c c c } { x _ { 0 } } & { \cdots } & { x _ { p - 1 } } & { y _ { 0 } } & { \cdots } & { y _ { q - 1 } } \end{array} \right), \end{equation*}

and lets $L ( x )$ denote a lower-triangular Toeplitz matrix whose first column is $x$, then it can be seen that the unique $R$ that solves the displacement equation $\nabla _ { Z } R = G J G ^ { * }$ is given by

\begin{equation*} R = \sum _ { i = 0 } ^ { n - 1 } Z ^ { i } G J G ^ { * } Z ^ { * i } = \sum _ { i = 0 } ^ { p - 1 } L ( x _ { i } ) L ^ { * } ( x _ { i } ) - \sum _ { i = 0 } ^ { q - 1 } L ( y _ { i } ) L ^ { * } ( y _ { i } ). \end{equation*}

Such displacement representations of $R$ as a combination of products of lower- and upper-triangular Toeplitz matrices allow, for example, bilinear forms such as $x ^ { * } R y$ to be rapidly evaluated via convolutions (and hence fast Fourier transforms).
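
As a concrete illustration (one possible choice, since generators are not unique): for a Hermitian Toeplitz matrix $T = ( c _ { i - j } )$ with $c _ { 0 } > 0$, one may take $p = q = 1$, $J = ( 1 \oplus - 1 )$ and

\begin{equation*} x = \frac { 1 } { \sqrt { c _ { 0 } } } \left( \begin{array} { c } { c _ { 0 } } \\ { c _ { 1 } } \\ { \vdots } \\ { c _ { n - 1 } } \end{array} \right) , \quad y = \frac { 1 } { \sqrt { c _ { 0 } } } \left( \begin{array} { c } { 0 } \\ { c _ { 1 } } \\ { \vdots } \\ { c _ { n - 1 } } \end{array} \right), \end{equation*}

so that $\nabla _ { Z } T = x x ^ { * } - y y ^ { * }$ and, by the formula above, $T = L ( x ) L ^ { * } ( x ) - L ( y ) L ^ { * } ( y )$.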

As mentioned above, a general Toeplitz matrix has displacement rank $2$, with in fact $p = 1$ and $q = 1$. But there are interesting non-Toeplitz matrices with $p = 1 = q$, for example the inverse of a Toeplitz matrix. In fact, this is a special case of the following fundamental result: If $R$ is invertible and satisfies $R - Z R Z ^ { * } = G J G ^ { * }$, then there exists an $\widetilde { H }$ such that $R ^ { - 1 } - Z ^ { * } R ^ { - 1 } Z = \widetilde{ H } ^ { * } J \widetilde { H }$. The fact that $Z$ and $ Z ^ { * }$ are interchanged in the latter formula suggests that one can define a so-called "natural" inverse, $R ^ { - \# } = \tilde{I} R ^ { - 1 } \tilde{I}$, where $\tilde{I}$ is the "reverse" identity matrix (with ones on the anti-diagonal). For then one sees (with $H = \tilde{I} \widetilde { H } ^ { * }$) that

\begin{equation*} R ^ { - \# } - Z R ^ { - \# } Z ^ { * } = H J H ^ { * }. \end{equation*}

Therefore, $R$ and $R ^ { - \# }$ have the same displacement rank and (when $R$ is Hermitian) the same displacement inertia (since $J$ is the same). (A real symmetric Toeplitz matrix $T$ has $T ^ { - 1 } = T ^ { - \# }$, so that $T ^ { - 1 }$ has a representation of the form $T ^ { - 1 } = L ( x ) L ^ { * } ( x ) - L ( y ) L ^ { * } ( y )$, which after suitably identifying $\{ x , y \}$ is a special so-called Gohberg–Semencul formula.) The proof of the above fundamental result is very simple (see, e.g., [a3]) and in fact holds with $Z$ replaced by any matrix $F$.
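
These displacement-rank statements are easy to check numerically. A minimal sketch follows (the Toeplitz entries $c _ { k } = 4 \cdot 2 ^ { - k }$ are an assumed illustrative choice):

<pre>
import numpy as np

n = 6
c = 4.0 * 0.5 ** np.arange(n)                 # assumed illustrative entries
T = np.array([[c[abs(i - j)] for j in range(n)] for i in range(n)])

Z = np.diag(np.ones(n - 1), -1)               # lower-triangular shift matrix
Itil = np.fliplr(np.eye(n))                   # "reverse" identity

Tinv = np.linalg.inv(T)
Tnat = Itil @ Tinv @ Itil                     # the "natural" inverse T^{-#}

print(np.linalg.matrix_rank(T - Z @ T @ Z.T))         # 2
print(np.linalg.matrix_rank(Tinv - Z.T @ Tinv @ Z))   # 2 (Z and Z* interchanged)
print(np.linalg.matrix_rank(Tnat - Z @ Tnat @ Z.T))   # 2
print(np.allclose(Tinv, Tnat))                # True: here T^{-1} = T^{-#}
</pre>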

An interesting example is obtained by choosing $F = \operatorname { diag } \{ f _ { 0 } , \dots , f _ { n - 1 } \}$, $| f _ { i } | < 1$, when the solution $R$ of the displacement equation $R - F R F ^ { * } = G J G ^ { * }$ becomes a "Pick" matrix of the form

\begin{equation*} P = \left( \frac { u _ { i } u _ { j } ^ { * } - v _ { i } v _ { j } ^ { * } } { 1 - f _ { i } f _ { j } ^ { * } } \right) _ { i , j = 0 } ^ { n - 1 }, \end{equation*}

where $\{ u_ { i } , v _ { i } \}$ are $1 \times p$ and $1 \times q$ vectors. Pick matrices arise in solving analytic interpolation problems and the displacement theory gives new and efficient computational algorithms for solving a variety of such problems (see, e.g., [a2] and Nevanlinna–Pick interpolation).
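
The defining displacement equation is easy to verify numerically. Here is a minimal sketch for the scalar case $p = q = 1$ (the points $f _ { i }$ and the data $u _ { i } , v _ { i }$ are assumed illustrative values):

<pre>
import numpy as np

f = np.array([0.1, 0.3 - 0.2j, -0.5j, 0.7])   # assumed points with |f_i| < 1
u = np.array([1.0, 2.0, 1.5, 0.5])            # assumed u-data (p = 1)
v = np.array([0.3, 0.9, -0.4, 0.2])           # assumed v-data (q = 1)

# Pick matrix with entries (u_i u_j* - v_i v_j*) / (1 - f_i f_j*).
P = (np.outer(u, u.conj()) - np.outer(v, v.conj())) / (1 - np.outer(f, f.conj()))

F = np.diag(f)
G = np.column_stack([u, v])
J = np.diag([1.0, -1.0])

# P solves the displacement equation R - F R F* = G J G*.
print(np.allclose(P - F @ P @ F.conj().T, G @ J @ G.conj().T))   # True
</pre>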

One can handle non-Hermitian structured matrices by using displacement operators of the form $\nabla _ { F, A} R = R - F R A ^ { * }$. When $F$ is diagonal with distinct entries and $A = Z$, $R$ becomes a Vandermonde matrix. A closely related displacement operator, first introduced in [a4], has the form $F R - R A ^ { * }$. Choosing $F = \operatorname { diag } \{ f _ { i } \}$, $A = \operatorname { diag } \{ a _ { i } \}$ leads to Cauchy-like solutions of the form

\begin{equation*} C _ { l } = \left( \frac { u _ { i } v _ { j } ^ { * } } { f _ { i } - a _ { j } ^ { * } } \right) , u _ { i } , v _ { i } \in \mathcal C ^ { 1 \times r }. \end{equation*}

The name comes from the fact that when $r = 1$ and $u _ { 0 } = 1 = v _ { 0 }$, $C_l$ is a Cauchy matrix. When $r = 2$, and $u _ { i } = \left( \beta _ { i } \quad 1 \right) $ and $v _ { i } = \left( 1 \quad - k _ { i } \right)$, one gets the so-called Loewner matrix.
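
That such matrices indeed satisfy a low-rank displacement equation of the above form can be checked entrywise:

\begin{equation*} ( F C _ { l } - C _ { l } A ^ { * } ) _ { i j } = ( f _ { i } - a _ { j } ^ { * } ) \frac { u _ { i } v _ { j } ^ { * } } { f _ { i } - a _ { j } ^ { * } } = u _ { i } v _ { j } ^ { * }, \end{equation*}

so that $F C _ { l } - C _ { l } A ^ { * } = U V ^ { * }$, where $U$ and $V$ denote the matrices with rows $u _ { i }$ and $v _ { i }$, respectively; in particular, the displacement rank of $C _ { l }$ with respect to this operator is at most $r$.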

It is often convenient to use generating-function language, and to define

\begin{equation*} R ( z , w ) = \sum _ { i , j = 0 } ^ { \infty } R _ { ij } z ^ { i } w ^ { * j }. \end{equation*}

(Finite structured matrices can be extended to be semi-infinite in many natural ways, e.g. by extending the generator matrix by adding additional rows.) In this language, one can introduce general displacement equations of the form

\begin{equation*} d ( z , w ) R ( z , w ) = G ( z ) J G ^ { * } ( w ), \end{equation*}

with

\begin{equation*} d ( z , w ) = \sum _ { i , j = 0 } ^ { \infty } d _ { i j } z ^ { i } w ^ { * j }. \end{equation*}

Choosing $d ( z , w ) = 1 - z w ^ { * }$ corresponds to matrix displacement equations of the form (in an obvious notation), $R - Z R Z ^ { * } = G J G ^ { * }$, while $d ( z , w ) = ( z - w ^ { * } )$ corresponds to $Z R - R Z ^ { * } = G J G ^ { * }$. There are many other useful choices, but to enable recursive matrix factorizations it is necessary to restrict $d ( z , w )$ to the form $d ( z , w ) = \alpha ( z ) \alpha ^ { * } ( w ) - \beta ( z ) \beta ^ { * } ( w )$. Note, in particular, that for $R = I$ one gets $R ( z , w ) = 1 / ( 1 - z w ^ { * } )$, the Szegö kernel. Other choices lead to the various so-called de Branges and Bergman kernels. See [a5] and the references therein.

A central feature in many of the applications mentioned earlier is the ability to efficiently obtain the so-called triangular LDU-factorization of a matrix and of its inverse (cf. also Matrix factorization). Among purely matrix computations, one may mention that this enables fast determination of the so-called QR-decomposition and the factorization of composite matrices such as $T _ { 1 } T _ { 2 } ^ { - 1 } T _ { 3 }$, $T _ { i }$ being Toeplitz. The LDU-factorization of structured matrices is facilitated by the fact that Schur complements inherit displacement structure. For example, writing

\begin{equation*} R = \left( \begin{array} { l l } { R _ { 11 } } & { R _ { 12 } } \\ { R _ { 21 } } & { R _ { 22 } } \end{array} \right), F = \left( \begin{array} { l l } { F _ { 1 } } & { 0 } \\ { F _ { 2 } } & { F _ { 3 } } \end{array} \right), \end{equation*}

it turns out that

\begin{equation*} \operatorname { rank } ( S - F _ { 3 } S F _ { 3 } ^ { * } ) \leq \operatorname { rank } ( R - F R F ^ { * } ), \end{equation*}

where $S = R _ { 22 } - R _ { 21 } R _ { 11 } ^ { - 1 } R _ { 12 }$. By properly defining the $\{ R_{ij} \}$ and $\{ F _ { i } \}$, this allows one to find the displacement structure of various composite matrices. For example, choosing $R _ { 11 } = - T$, $R _ { 12 } = I = R _ { 21 }$, $R _ { 22 } = 0$, and $F = Z \oplus Z$, gives $S = 0 - I ( - T ) ^ { - 1 } I = T ^ { - 1 }$ and hence the previously mentioned result on inverses of structured matrices.

Computations on structured matrices are made efficient by working not with the $n ^ { 2 }$ entries of $R$, but with the $n r$ entries of the generator matrix $G$. The basic triangular factorization algorithm is Gaussian elimination, which, as noted in the first calculation, amounts to finding a sequence of Schur complements. Incorporating structure into the Gaussian elimination procedure was, in retrospect, first done (in the special case $r = 2$) by I. Schur himself in a remarkable 1917 paper dealing with the apparently very different problem of checking when a power series is bounded in the unit disc.

The algorithm below is but one generalization of Schur's original recursion. It provides an efficient $O ( n ^ { 2 } )$ procedure for the computation of the triangular factors of a Hermitian positive-definite matrix $R$ satisfying

\begin{equation*} R - Z R Z ^ { * } = G J G ^ { * } , G \in \mathcal{C} ^ { n \times r }, \end{equation*}

with $J = ( I _ { p } \oplus - I _ { q } )$. So, let $R = L D ^ { - 1 } L ^ { * }$ denote the triangular decomposition of $R$, where $D = \operatorname { diag } \{ d _ { 0 } , \dots , d _ { n - 1 } \}$, and the lower triangular factor $L$ is normalized in such a way that the $\{ d _ { i } \}$ appear on its main diagonal. The non-zero part of the consecutive columns of $L$ will be further denoted by $l_i$. Then it holds that the successive Schur complements of $R$ with respect to its leading $( i \times i )$-submatrices, denoted by $R_i$, are also structured and satisfy

\begin{equation*} R _ { i } - Z _ { i } R _ { i } Z _ { i } ^ { * } = G _ { i } J G _ { i } ^ { * }, \end{equation*}

with $( n - i ) \times ( n - i )$ lower-triangular shift matrices $Z_{i}$, and where the generator matrices $G_i$ are obtained by the following recursive construction:

start with $G _ { 0 } = G$;

repeat for $i \geq 0$:

\begin{equation*} \left( \begin{array} { c } { 0 } \\ { G _ { i + 1 } } \end{array} \right) = \left\{ G _ { i } - ( I - Z _ { i } ) G _ { i } \frac { J g _ { i } ^ { * } g _ { i } } { g _ { i } J g _ { i } ^ { * } } \right\} \Theta _ { i }, \end{equation*}

where $\Theta_i$ is an arbitrary $J$-unitary matrix, and $g_i$ denotes the top row of $G_i$. Then the factorization quantities are recovered from the generators as

\begin{equation*} l _ { i } = G _ { i } J g _ { i } ^ { * } , \quad d _ { i } = g _ { i } J g _ { i } ^ { * }. \end{equation*}

The degree of freedom in choosing $\Theta_i$ is often very useful. One particular choice leads to the so-called proper form of the generator recursion. Let $\Theta_i$ reduce $g_i$ to the form

\begin{equation*} g _ { i } \Theta _ { i } = \left( \begin{array} { c c c c } { \delta _ { i } } & { 0 } & { \ldots } & { 0 } \end{array} \right), \end{equation*}

with a non-zero scalar entry in the leading position. Then

\begin{equation*} \left( \begin{array} { c } { 0 } \\ { G _ { i + 1 } } \end{array} \right) = Z _ { i } G _ { i } \Theta _ { i } \left( \begin{array} { c c } { 1 } & { 0 } \\ { 0 } & { 0 } \end{array} \right) + G _ { i } \Theta _ { i } \left( \begin{array} { c c } { 0 } & { 0 } \\ { 0 } & { I _ { p + q - 1 } } \end{array} \right), \end{equation*}

with

\begin{equation*} l _ { i } = \delta _ { i } ^ { * } G _ { i } \Theta _ { i } \left( \begin{array} { c } { 1 } \\ { 0 } \end{array} \right) , d _ { i } = | \delta _ { i } | ^ { 2 }. \end{equation*}

In words, this shows that $G _ { i+1 } $ can be obtained from $G_i$ as follows:

1) reduce $g_i$ to proper form;

2) multiply $G_i$ by $\Theta_i$ and keep the last $p + q - 1$ columns of $ { G } _ { i } \Theta _ { i }$ unaltered;

3) shift down the first column of $ { G } _ { i } \Theta _ { i }$ by one position.
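
The following is a minimal sketch of these three steps, written here in Python/NumPy and specialized to a real symmetric positive-definite Toeplitz matrix; the function name, the particular generator and the test data are assumptions made for illustration, while the general algorithm accepts arbitrary $\{ Z , G , J \}$ as described above.

<pre>
import numpy as np

def schur_toeplitz(c):
    """Return L and d with R = L diag(d)^{-1} L^T, for R = (c_{|i-j|})."""
    n = len(c)
    c = np.asarray(c, dtype=float)
    # One possible generator of R with respect to Z:
    # R - Z R Z^T = G J G^T with J = diag(1, -1).
    G = np.column_stack([c, c]) / np.sqrt(c[0])
    G[0, 1] = 0.0
    L = np.zeros((n, n))
    d = np.zeros(n)
    for i in range(n):
        a, b = G[0]                        # top row g_i of the current generator
        rho = b / a                        # |rho| < 1 when R is positive definite
        s = 1.0 / np.sqrt(1.0 - rho**2)
        Theta = s * np.array([[1.0, -rho],
                              [-rho, 1.0]])    # J-unitary (hyperbolic) rotation
        Gp = G @ Theta                     # proper form: top row is (delta, 0)
        delta = Gp[0, 0]
        d[i] = delta**2
        L[i:, i] = delta * Gp[:, 0]        # non-zero part l_i of column i of L
        # Shift the first column of G_i Theta_i down by one position, keep the
        # other column, and discard the (now zero) top row.
        col0 = np.concatenate(([0.0], Gp[:-1, 0]))
        G = np.column_stack([col0, Gp[:, 1]])[1:]
    return L, d

# Assumed illustrative data: c_k = 4 * 0.5**k gives a positive-definite R.
n = 6
c = 4.0 * 0.5 ** np.arange(n)
R = np.array([[c[abs(i - j)] for j in range(n)] for i in range(n)])
L, d = schur_toeplitz(c)
print(np.allclose(R, L @ np.diag(1.0 / d) @ L.T))   # True
</pre>

Each pass through the loop operates on the $r = 2$ generator columns rather than on the full matrix, so the complete factorization costs $O ( n ^ { 2 } )$ operations instead of the $O ( n ^ { 3 } )$ required by unstructured Gaussian elimination.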

Extensions of the algorithm to more general structured matrices are possible (see, e.g., [a2], [a3]). In addition, the algorithm can be extended to provide simultaneous factorizations of both a matrix $R$ and its inverse, $R ^ { - 1 }$; see, e.g., [a3].

An issue that arises in the study of such fast factorization algorithms is their numerical stability in finite-precision implementations. It was mentioned earlier that the generalized Schur algorithm amounts to combining Gaussian elimination with structure, and it is well-known that Gaussian elimination in its purest form is numerically unstable (meaning that the error in the factorization, $R - \hat{L} \hat{D} ^ { - 1 } \hat{L} ^ { * }$, can be quite large, where $\hat{L}$ and $\hat{D}$ denote the computed $L$ and $D$, respectively). The instability can often be controlled by resorting to pivoting techniques, i.e., by permuting the order of the rows, and perhaps columns, of the matrices before the Gaussian elimination steps. However, pivoting can destroy matrix structure and can thus lead to a loss in computational efficiency. It was observed in [a6], though, that for diagonal displacement operators $F$, and more specifically for Cauchy-like matrices, partial pivoting does not destroy matrix structure. This fact was exploited in [a7] in the context of the generalized Schur algorithm and applied to other structured matrices. This was achieved by showing how to transform different kinds of matrix structure into Cauchy-like structure.

Partial pivoting by itself is not sufficient to guarantee numerical stability even for slow algorithms. Moreover, the above transformation-and-pivoting technique is only efficient for fixed-order problems, since the transformations have to be repeated afresh whenever the order changes. Another approach to the numerical stability of the generalized Schur algorithm is to examine the steps of the algorithm directly and to stabilize them without resorting to transformations among matrix structures. This was done in [a8], where it was shown, in particular, how to implement the hyperbolic rotations in a reliable manner. For all practical purposes, the main conclusion of [a8] is that the generalized Schur algorithm, with certain modifications, is backward stable for a large class of structured matrices.

References

[a1] T. Kailath, S.Y. Kung, M. Morf, "Displacement ranks of a matrix" Bull. Amer. Math. Soc. , 1 : 5 (1979) pp. 769–773
[a2] T. Kailath, A.H. Sayed, "Displacement structure: Theory and applications" SIAM Review , 37 (1995) pp. 297–386
[a3] T. Kailath, "Displacement structure and array algorithms" T. Kailath (ed.) A.H. Sayed (ed.) , Fast Reliable Algorithms for Matrices with Structure , SIAM (Soc. Industrial Applied Math.) (1999)
[a4] G. Heinig, K. Rost, "Algebraic methods for Toeplitz-like matrices and operators" , Akad. (1984)
[a5] H. Lev-Ari, T. Kailath, "Triangular factorization of structured Hermitian matrices" I. Gohberg (ed.) et al. (ed.) , Schur Methods in Operator Theory and Signal Processing , Birkhäuser (1986) pp. 301–324
[a6] G. Heinig, "Inversion of generalized Cauchy matrices and other classes of structured matrices" A. Bojanczyk (ed.) G. Cybenko (ed.) , Linear Algebra for Signal Processing , IMA volumes in Mathematics and its Applications , 69 (1994) pp. 95–114
[a7] I. Gohberg, T. Kailath, V. Olshevsky, "Fast Gaussian elimination with partial pivoting for matrices with displacement structure" Math. Comput. , 64 (1995) pp. 1557–1576
[a8] S. Chandrasekaran, A.H. Sayed, "Stabilizing the generalized Schur algorithm" SIAM J. Matrix Anal. Appl. , 17 (1996) pp. 950–983
How to Cite This Entry:
Displacement structure. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Displacement_structure&oldid=16958
This article was adapted from an original article by Thomas Kailath, Ali H. Sayed (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.