
Linear regression



The linear regression of one random vector $\mathbf Y = (Y^{(1)}, \dots, Y^{(m)})'$ on another $\mathbf X = (X^{(1)}, \dots, X^{(p)})'$ is an $m$-dimensional vector form, linear in $\mathbf x$, taken to be the conditional mean (given $\mathbf X = \mathbf x$) of the random vector $\mathbf Y$. The corresponding equations

$$\tag{*} y^{(k)}(\mathbf x, \mathbf b) = \mathsf E\bigl(Y^{(k)} \mid \mathbf X = \mathbf x\bigr) = \sum_{j=0}^{p} b_{kj} x^{(j)}, \qquad x^{(0)} \equiv 1, \quad k = 1, \dots, m,$$

are called the linear regression equations of $\mathbf Y$ on $\mathbf X$, and the parameters $b_{kj}$ are called the regression coefficients (see also Regression). Here $\mathbf X$ is an observable variable (not necessarily random) on which the mean of the resulting function (response) $\mathbf Y(\mathbf X)$ under investigation depends.
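
For orientation, consider the simplest case $m = p = 1$ (a standard consequence of (*), spelled out here only as an illustration): if $\mathsf E(Y \mid X = x) = b_0 + b_1 x$, then taking expectations and covariances of both sides gives $\mathsf E\, Y = b_0 + b_1\, \mathsf E\, X$ and $\operatorname{cov}(X, Y) = b_1\, \mathsf D\, X$, so that

$$b_1 = \frac{\operatorname{cov}(X, Y)}{\mathsf D\, X}, \qquad b_0 = \mathsf E\, Y - b_1\, \mathsf E\, X.$$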

In addition, the linear regression of $Y^{(k)}$ on $\mathbf X$ is frequently also understood to be the "best" (in a well-defined sense) linear approximation of $Y^{(k)}$ by means of $\mathbf X$, or even the result of the best (in a well-defined sense) smoothing of a system of experimental points ("observations") $(Y_i^{(k)}, \mathbf X_i)$, $i = 1, \dots, n$, by a hyperplane in the space $(Y^{(k)}, \mathbf X)$, in situations where the interpretation of these points as a sample from a corresponding general population need not be admissible. With such a definition one has to distinguish different versions of linear regression, depending on the choice of the method of measuring the errors of the linear approximation of $Y^{(k)}$ by means of $\mathbf X$ (or on the actual choice of a criterion for the quality of the smoothing). The most widespread criteria for the quality of the approximation of $Y^{(k)}$ by linear combinations of $\mathbf X$ (linear smoothing of the points $(Y_i^{(k)}, \mathbf X_i)$) are:

$$Q_1(\mathbf b) = \mathsf E \left\{ \omega^2(\mathbf X) \cdot \left( Y^{(k)}(\mathbf X) - \sum_{j=0}^{p} b_{kj} X^{(j)} \right)^2 \right\},$$

$$\widetilde Q_1(\mathbf b) = \sum_{i=1}^{n} \omega_i^2 \left( Y_i^{(k)} - \sum_{j=0}^{p} b_{kj} X_i^{(j)} \right)^2,$$

$$Q_2(\mathbf b) = \mathsf E \left\{ \omega(\mathbf X) \left| Y^{(k)}(\mathbf X) - \sum_{j=0}^{p} b_{kj} X^{(j)} \right| \right\},$$

$$\widetilde Q_2(\mathbf b) = \sum_{i=1}^{n} \omega_i \left| Y_i^{(k)} - \sum_{j=0}^{p} b_{kj} X_i^{(j)} \right|,$$

$$Q_3(\mathbf b) = \mathsf E \left\{ \omega^2(\mathbf X) \cdot \rho^2 \left( Y^{(k)}(\mathbf X), \sum_{j=0}^{p} b_{kj} X^{(j)} \right) \right\},$$

$$\widetilde Q_3(\mathbf b) = \sum_{i=1}^{n} \omega_i^2 \cdot \rho^2 \left( Y_i^{(k)}, \sum_{j=0}^{p} b_{kj} X_i^{(j)} \right).$$

In these relations the choice of the "weights" $\omega(\mathbf X)$ or $\omega_i$ depends on the nature of the actual scheme under investigation. For example, if the $Y^{(k)}(\mathbf X)$ are interpreted as random variables with known variances $\mathsf D\, Y^{(k)}(\mathbf X)$ (or with known estimates of them), then $\omega^2(\mathbf X) = [\mathsf D\, Y^{(k)}(\mathbf X)]^{-1}$. In the last two criteria the "discrepancies" of the approximation or the smoothing are measured by the distances $\rho(\cdot, \cdot)$ from $Y^{(k)}(\mathbf X)$ or $Y_i^{(k)}$ to the required regression hyperplane. If the coefficients $b_{kj}$ are determined by minimizing $Q_1(\mathbf b)$ or $\widetilde Q_1(\mathbf b)$, the linear regression is called least-squares (or $L_2$-) regression; if the criteria $Q_2(\mathbf b)$ and $\widetilde Q_2(\mathbf b)$ are used, it is called minimum-absolute-deviations (or $L_1$-) regression; if the criteria $Q_3(\mathbf b)$ and $\widetilde Q_3(\mathbf b)$ are used, it is called minimum $\rho$-distance regression.
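
As an illustrative sketch only (the numerical recipe, the function name weighted_least_squares and the synthetic data below are not part of the original article), the coefficients minimizing $\widetilde Q_1(\mathbf b)$ for a single component $Y^{(k)}$ can be obtained by rescaling the data by the weights and solving an ordinary least-squares problem, for example with NumPy:

import numpy as np

def weighted_least_squares(X, y, w):
    """Minimize Q~_1(b) = sum_i w_i^2 (y_i - sum_j b_j x_i^(j))^2 over b."""
    n = len(y)
    # Prepend the constant regressor x^(0) = 1, as in equation (*).
    A = np.column_stack([np.ones(n), np.asarray(X)])
    # Minimizing sum_i (w_i * residual_i)^2 is an ordinary least-squares
    # problem for the rescaled data (w_i * A_i, w_i * y_i).
    b, *_ = np.linalg.lstsq(A * w[:, None], np.asarray(y) * w, rcond=None)
    return b  # (b_0, b_1, ..., b_p)

# Hypothetical usage: with equal weights this reduces to ordinary least squares.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)
print(weighted_least_squares(X, y, np.ones(200)))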

In certain cases, linear regression in the classical sense (*) is the same as linear regression defined by means of functionals of the type $Q_i$. Thus, if the vector $(\mathbf X', Y^{(k)})$ has a multi-dimensional normal distribution, then the regression of $Y^{(k)}$ on $\mathbf X$ in the sense of (*) is linear and coincides with the least-squares (minimum mean-square) linear regression (for $\omega(\mathbf X) \equiv 1$).
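
To make the normal case explicit (a standard formula consistent with the statement above, added here for illustration): if $(\mathbf X', Y^{(k)})$ is jointly normal and the covariance matrix $\Sigma_{\mathbf X \mathbf X}$ of $\mathbf X$ is non-singular, then

$$\mathsf E\bigl(Y^{(k)} \mid \mathbf X = \mathbf x\bigr) = \mathsf E\, Y^{(k)} + \Sigma_{Y \mathbf X}\, \Sigma_{\mathbf X \mathbf X}^{-1} \bigl(\mathbf x - \mathsf E\, \mathbf X\bigr),$$

where $\Sigma_{Y \mathbf X}$ is the row vector of covariances of $Y^{(k)}$ with the components of $\mathbf X$; this affine function of $\mathbf x$ is precisely the right-hand side of (*) and also minimizes $Q_1$ with $\omega(\mathbf X) \equiv 1$.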

How to Cite This Entry:
Linear regression. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Linear_regression&oldid=55042
This article was adapted from an original article by S.A. Aivazyan (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.