Gauss method

A method of successive elimination of unknowns in solving a set of linear equations, first introduced by C.F. Gauss [1]. Let there be given a system

$$ \tag{S^{0}} f_{j}(x) - a_{j} = a_{j1} x_{1} + \dots + a_{jn} x_{n} - a_{j} = 0, $$

$$ j = 1, \dots, m, $$

where $a_{ji}, a_{j}$ are elements of an arbitrary field $P$. Without loss of generality one may assume that $a_{11} \neq 0$. The Gauss method operates as follows. One subtracts the first equation of the system, multiplied term by term by $a_{21}/a_{11}$, from the second equation; then one subtracts the first equation multiplied by $a_{31}/a_{11}$ from the third equation; and so on, until finally one subtracts the first equation multiplied by $a_{m1}/a_{11}$ from the $m$-th equation. Let $S^{1}$ be the system of equations thus obtained. If some coefficient in $S^{1}$ is non-zero (possibly after changing the order of the equations and unknowns), $S^{1}$ is treated in the same way as $S^{0}$, etc. If the rank $r$ of the system $S^{0}$ (i.e. the rank of the matrix of its coefficients) is smaller than $m$, then the $r$-th step will yield a system $S^{r}$ in which all coefficients of the unknowns are zero; if $r = m$, the system $S^{r}$ is said to be empty. The system $S^{0}$ is compatible if and only if the system $S^{r}$ is either compatible (i.e. has no non-zero free terms) or empty.
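
For concreteness, the forward-elimination phase just described can be sketched in a few lines of Python (this sketch is not part of the original article; the function name forward_eliminate and the use of floating-point entries in place of a general field $P$ are illustrative assumptions):

<pre>
# Minimal sketch of forward elimination, assuming a non-zero pivot is
# already in place at every step (no row or column interchanges).
def forward_eliminate(A, b):
    """Reduce A x = b to upper-triangular form in place."""
    m = len(A)
    for k in range(m - 1):            # step k eliminates the unknown x_{k+1}
        for j in range(k + 1, m):     # subtract a multiple of row k from row j
            factor = A[j][k] / A[k][k]
            for i in range(k, len(A[j])):
                A[j][i] -= factor * A[k][i]
            b[j] -= factor * b[k]
    return A, b

# Example: x + y = 3, 2x + y = 4 becomes x + y = 3, -y = -2.
A, b = [[1.0, 1.0], [2.0, 1.0]], [3.0, 4.0]
forward_eliminate(A, b)
</pre>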

The process of obtaining one solution of a (compatible) system $S^{0}$ may be described as follows. One takes some solution $(x_{r}^{0}, \dots, x_{n}^{0})$ of the system $S^{r-1}$. By assigning the values $x_{r}^{0}, \dots, x_{n}^{0}$ to the unknowns $x_{r}, \dots, x_{n}$ in some equation of the system $S^{r-2}$ in which the coefficient of $x_{r-1}$ is non-zero (e.g. its first equation), one finds $x_{r-1} = x_{r-1}^{0}$ and obtains a solution $(x_{r-1}^{0}, x_{r}^{0}, \dots, x_{n}^{0})$ of $S^{r-2}$. In other words, the value $x_{r-1}^{0}$ is obtained from $S^{r-2}$ by replacing the unknowns $x_{r}, \dots, x_{n}$ by the values taken for them. The values $x_{r-1}^{0}, \dots, x_{n}^{0}$ are then substituted in $S^{r-3}$, the value $x_{r-2} = x_{r-2}^{0}$ is found, and hence the solution $(x_{r-2}^{0}, \dots, x_{n}^{0})$, etc. The values $x_{1}^{0}, \dots, x_{r-1}^{0}$ thus found, together with the values $x_{r}^{0}, \dots, x_{n}^{0}$ taken, constitute a solution $(x_{1}^{0}, \dots, x_{n}^{0})$ of $S^{0}$ [2].
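
A hedged Python sketch of this back-substitution pass, for the common case of a unique solution ($r = n$, with the upper-triangular system produced by the elimination above; the name back_substitute is an illustrative assumption):

<pre>
def back_substitute(A, b):
    """Solve an upper-triangular system A x = b, last unknown first."""
    n = len(A)
    x = [0.0] * n
    for j in range(n - 1, -1, -1):
        # substitute the already-found values of the later unknowns ...
        s = sum(A[j][i] * x[i] for i in range(j + 1, n))
        # ... and solve the remaining equation for the next unknown
        x[j] = (b[j] - s) / A[j][j]
    return x

# Continuing the example: [[1, 1], [0, -1]], [3, -2] yields x = [1.0, 2.0].
print(back_substitute([[1.0, 1.0], [0.0, -1.0]], [3.0, -2.0]))
</pre>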

This method may be generalized as follows [4]. Let $U$ be some subspace of the vector space $P^{n}$ and let $P^{m}(U)$ be the set of all solutions $(p_{1}, \dots, p_{m})$ of the equation

$$ \tag{*} p_{1} f_{1}(x) + \dots + p_{m} f_{m}(x) = 0, $$

where $x$ runs through $U$. For an arbitrary finite system

$$ p^{i} = (p_{1}^{i}, \dots, p_{m}^{i}), \quad i = 1, \dots, l, $$

of non-zero generating elements of $P^{m}(U)$ it is possible to construct a system

$$ \sum_{j=1}^{m} p_{j}^{i} f_{j}(x) - \sum_{j=1}^{m} p_{j}^{i} a_{j} = 0, \quad i = 1, \dots, l $$

(where $x$ is the unknown), known as the $U$-convolution of $S^{0}$. If $P^{m}(U)$ contains no non-zero elements, one says that the system $S^{0}$ has an empty $U$-convolution. If the system $S^{0}$ is compatible, then its $U$-convolution for any $U$ is compatible or empty. It was found that, for the system $S^{0}$ to be compatible, it is sufficient that the $U$-convolution be empty or compatible for just one $U$. Again, let $U_{1}, \dots, U_{n}$ be the subspaces generated in the space $P^{n}$ by the vectors $e_{1} = (1, 0, \dots, 0), \dots, e_{n} = (0, 0, \dots, 1)$. For $U = U_{i}$, equation (*) reduces to the equation

$$ a_{1i} p_{1} + \dots + a_{mi} p_{m} = 0. $$

Let, for example, $i = 1$. If also $a_{11} \neq 0$, then the vectors $(a_{21}/a_{11}, -1, 0, \dots, 0), \dots, (a_{m1}/a_{11}, 0, 0, \dots, -1)$ may be taken as the non-zero generating elements of the space $P^{m}(U_{1})$, and the $U_{1}$-convolution of the system $S^{0}$ will coincide with the procedure of elimination of the unknown $x_{1}$ in Gauss' method.
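
The following sketch (using numpy, which the article itself does not mention; all names are illustrative) forms exactly these generators of $P^{m}(U_{1})$ and applies them to the augmented system, so that the resulting $U_{1}$-convolution has a zero column for $x_{1}$:

<pre>
import numpy as np

def u1_convolution(A, a):
    """U_1-convolution of the system [A | a], assuming a_{11} != 0."""
    m = A.shape[0]
    # generators p^i = (a_{i1}/a_{11}, 0, ..., -1 in position i, ..., 0), i = 2..m
    P = np.zeros((m - 1, m))
    P[:, 0] = A[1:, 0] / A[0, 0]
    P[np.arange(m - 1), np.arange(1, m)] = -1.0
    # each row of the convolution is sum_j p_j^i (f_j(x) - a_j) = 0
    return P @ A, P @ a

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 1.0],
              [2.0, 2.0, 3.0]])
a = np.array([1.0, 2.0, 3.0])
B, c = u1_convolution(A, a)   # first column of B is zero: x_1 is eliminated
</pre>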

The $U$-convolution of the system $S^{0}$ for $U = U_{i} + U_{k}$ is a procedure for the simultaneous elimination of two unknowns $x_{i}$ and $x_{k}$. For example, let $i = 1$ and $k = 2$. If also

$$ \Delta_{12} = \left| \begin{array}{ll} a_{11} & a_{21} \\ a_{12} & a_{22} \end{array} \right| \neq 0, $$

then the rows of the matrix

$$ \left\| \begin{array}{cccccc} -\Delta_{23} & \Delta_{13} & -\Delta_{12} & 0 & \dots & 0 \\ -\Delta_{24} & \Delta_{14} & 0 & -\Delta_{12} & \dots & 0 \\ \dots & \dots & \dots & \dots & & \dots \\ -\Delta_{2m} & \Delta_{1m} & 0 & 0 & \dots & -\Delta_{12} \end{array} \right\|, $$

where

$$ \Delta_{rs} = \left| \begin{array}{ll} a_{r1} & a_{s1} \\ a_{r2} & a_{s2} \end{array} \right|, $$

may be used to obtain the $(U_{1} + U_{2})$-convolution of the system $S^{0}$. By suitably combining the elimination of individual unknowns with the elimination of certain pairs (or, more generally, tuples) of unknowns, the system $S^{0}$ may be solved by algorithms which generalize the Gauss method.
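
As an illustration of this pair elimination (again a numpy sketch with assumed names, not code from the article), the rows of the matrix above can be generated directly from the determinants $\Delta_{rs}$:

<pre>
import numpy as np

def pair_convolution(A, a):
    """(U_1 + U_2)-convolution of [A | a], assuming Delta_{12} != 0."""
    m = A.shape[0]
    delta = lambda r, s: A[r, 0] * A[s, 1] - A[s, 0] * A[r, 1]  # Delta_{r+1,s+1}
    P = np.zeros((m - 2, m))
    for i in range(m - 2):          # row for equation j = i + 3 (1-based)
        P[i, 0] = -delta(1, i + 2)  # -Delta_{2j}
        P[i, 1] = delta(0, i + 2)   #  Delta_{1j}
        P[i, i + 2] = -delta(0, 1)  # -Delta_{12}
    return P @ A, P @ a             # the columns of x_1 and x_2 vanish
</pre>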

References

[1] C.F. Gauss, "Beiträge zur Theorie der algebraischen Gleichungen", Werke, 3, K. Gesellschaft Wissenschaft. Göttingen (1876) pp. 71–102
[2] A.G. Kurosh, "Higher algebra", MIR (1972) (Translated from Russian)
[3] D.K. Faddeev, V.N. Faddeeva, "Computational methods of linear algebra", Freeman (1963) (Translated from Russian)
[4] S.N. Chernikov, "Lineare Ungleichungen", Deutsch. Verlag Wissenschaft. (1971) (Translated from Russian)

Comments

There are a number of variants of this method, mostly based on practical implementation considerations (like the methods of Crout and Doolittle) or efficiency (like the method of Choleski for symmetric systems).

In the Western literature, the notions of LU-decomposition, forward elimination and back substitution are often associated with Gauss' method (which is also called the Gaussian elimination method). Consider the particular case where the matrix of coefficients $A$ in the system $Ax = a$ is a square matrix ($m = n$). Then LU-decomposition refers to the decomposition of $A$ into a lower- and an upper-triangular matrix $L$ and $U$, i.e. $A = LU$. Forward elimination and back substitution are understood to be the solution of the triangular systems $Ly = a$ and $Ux = y$, respectively. By using the concept of a flop (as introduced by C.B. Moler), that is, the amount of work involved in performing a floating-point addition, a floating-point multiplication and a little subscripting, it can be shown that LU-decomposition, forward elimination and back substitution require $m^{3}/3$, $m^{2}/2$ and $m^{2}/2$ flops, respectively.
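
The decomposition and the two triangular solves can be sketched as follows (a minimal Doolittle-style LU in Python/numpy, without pivoting and assuming non-singular leading principal submatrices; the names lu and solve are illustrative):

<pre>
import numpy as np

def lu(A):
    """A = L U with unit lower-triangular L; about m^3/3 flops."""
    m = A.shape[0]
    L, U = np.eye(m), A.astype(float).copy()
    for k in range(m - 1):
        for j in range(k + 1, m):
            L[j, k] = U[j, k] / U[k, k]       # multiplier, stored in L
            U[j, k:] -= L[j, k] * U[k, k:]    # elimination step
    return L, U

def solve(A, a):
    L, U = lu(A)
    m = len(a)
    y = np.zeros(m)
    for j in range(m):                        # forward elimination: L y = a
        y[j] = a[j] - L[j, :j] @ y[:j]        # about m^2/2 flops
    x = np.zeros(m)
    for j in range(m - 1, -1, -1):            # back substitution: U x = y
        x[j] = (y[j] - U[j, j+1:] @ x[j+1:]) / U[j, j]   # about m^2/2 flops
    return x
</pre>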

For numerical purposes it is not sufficient that the pivots (the coefficients $a_{11}$, etc.) be merely non-zero; they should be chosen as well as possible. If the absolutely largest coefficient $a_{j1}$ of the current column is used as a pivot, this is called partial pivoting; if the absolutely largest element in the entire matrix (or, at later stages, submatrix) is used as a pivot, this is called complete pivoting. Partial pivoting, which makes LU-decomposition possible even when the matrix $A$ has singular leading principal submatrices, amounts to interchanging rows; it requires $O(m^{2})$ comparisons. With such a strategy the method can be shown to be numerically stable (see Stability of a computational algorithm; Stability of a computational process).
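
A hedged sketch of the partial-pivoting variant (row interchanges only; the permutation is returned so that A[perm] = L @ U; all names are assumptions of this sketch):

<pre>
import numpy as np

def lu_partial_pivot(A):
    """LU with partial pivoting: P A = L U, with P encoded by `perm`."""
    m = A.shape[0]
    U = A.astype(float).copy()
    L, perm = np.eye(m), np.arange(m)
    for k in range(m - 1):
        p = k + np.argmax(np.abs(U[k:, k]))   # largest |entry| in column k
        if p != k:                            # interchange rows k and p
            U[[k, p], k:] = U[[p, k], k:]
            L[[k, p], :k] = L[[p, k], :k]
            perm[[k, p]] = perm[[p, k]]
        for j in range(k + 1, m):
            L[j, k] = U[j, k] / U[k, k]
            U[j, k:] -= L[j, k] * U[k, k:]
    return perm, L, U   # pivot search over all steps: O(m^2) comparisons
</pre>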

An excellent book on numerical linear algebra is [a1]. The problem of numerical stability in Gaussian elimination is discussed in [a6].

Fortran routines can be found in [a4]; for an older Algol version see [a5].

References

[a1] G.H. Golub, C.F. van Loan, "Matrix computations", North Oxford Acad. (1983)
[a2] J.H. Wilkinson, "The algebraic eigenvalue problem", Clarendon Press (1965)
[a3] G. Strang, "Linear algebra and its applications", Acad. Press (1976)
[a4] J.J. Dongarra, J.R. Bunch, C.B. Moler, G.W. Stewart, "LINPACK users guide", SIAM (1978)
[a5] J.H. Wilkinson, C. Reinsch, "Handbook for automatic computation", 2. Linear algebra, Springer (1971)
[a6] P.A. Businger, "Monitoring the numerical stability of Gaussian elimination", Numer. Math., 16 (1971) pp. 360–361
How to Cite This Entry:
Gauss method. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Gauss_method&oldid=47050
This article was adapted from an original article by S.N. Chernikov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.