The matrix $B$ of regression coefficients (cf. [[Regression coefficient|Regression coefficient]]) $\beta_{ji}$, $j = 1, \dots, m$, $i = 1, \dots, r$, in a multi-dimensional linear [[Regression|regression]] model,

$$ X = BZ + \epsilon. \tag{*} $$

Here $X$ is a matrix with elements $X_{jk}$, $j = 1, \dots, m$, $k = 1, \dots, n$, where $X_{jk}$, $k = 1, \dots, n$, are observations of the $j$-th component of the original $m$-dimensional random variable, $Z$ is a matrix of known regression variables $z_{ik}$, $i = 1, \dots, r$, $k = 1, \dots, n$, and $\epsilon$ is the matrix of errors $\epsilon_{jk}$, $j = 1, \dots, m$, $k = 1, \dots, n$, with $\mathsf{E}\, \epsilon_{jk} = 0$. The elements $\beta_{ji}$ of the regression matrix $B$ are unknown and have to be estimated. The model (*) is a generalization to the $m$-dimensional case of the general linear model of [[Regression analysis|regression analysis]].
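As a concrete illustration (not part of the original article), the following sketch simulates model (*) with NumPy and recovers $B$ by least squares, using the standard estimator $\widehat{B} = XZ^{T}(ZZ^{T})^{-1}$; the dimensions follow the article, while the seed, noise level and choice of estimator are illustrative assumptions.

```python
# Minimal numerical sketch of model (*): X = B Z + eps.
# Dimension names m, r, n follow the article; everything else
# (seed, noise level, the least-squares estimator) is illustrative.
import numpy as np

rng = np.random.default_rng(0)

m, r, n = 2, 3, 100                    # components, regressors, observations
B_true = rng.normal(size=(m, r))       # the unknown regression matrix B
Z = rng.normal(size=(r, n))            # known regression variables z_{ik}
eps = 0.1 * rng.normal(size=(m, n))    # errors with E[eps_{jk}] = 0
X = B_true @ Z + eps                   # observed matrix X_{jk}

# Least-squares estimate: minimizes ||X - B Z||_F^2 over B,
# giving B_hat = X Z^T (Z Z^T)^{-1}.
B_hat = X @ Z.T @ np.linalg.inv(Z @ Z.T)
print(np.round(B_hat - B_true, 3))     # entries close to zero
```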
====References====
<table><TR><TD valign="top">[1]</TD> <TD valign="top"> M.G. Kendall, A. Stuart, "The advanced theory of statistics" , '''3. Design and analysis, and time series''' , Griffin (1983)</TD></TR></table>
====Comments====
In econometrics, for example, a frequently used model is that one has $m$ variables $y_1, \dots, y_m$ to be explained (endogenous variables) in terms of $n$ explanatory variables $x_1, \dots, x_n$ (exogenous variables) by means of a linear relationship $y = Ax$. Given $N$ sets of measurements (with errors), $(y_t, x_t)$, the matrix of relation coefficients $A$ is to be estimated. The model is therefore

$$ y_t = A x_t + \epsilon_t. $$

Under the assumption that the $\epsilon_t$ have zero mean and are independently and identically distributed with a [[Normal distribution|normal distribution]], this is the so-called standard linear multiple regression model or, briefly, the linear model or standard linear model. The least-squares method yields the optimal estimator

$$ \widehat{A} = M_{yx} M_{xx}^{-1}, $$

where $M_{xx} = N^{-1} \sum_{t=1}^{N} x_t x_t^{T}$ and $M_{yx} = N^{-1} \sum_{t=1}^{N} y_t x_t^{T}$. In the case of a single endogenous variable, $y = a^{T} x$, this can be conveniently written as

$$ \widehat{a} = (X^{T} X)^{-1} X^{T} Y, $$

where $Y$ is the column vector of observations $(y_1, \dots, y_N)^{T}$ and $X$ is the $(N \times n)$ observation matrix consisting of the rows $x_t^{T}$, $t = 1, \dots, N$. Numerous variants and generalizations are considered in [[#References|[a1]]], [[#References|[a2]]]; cf. also [[Regression analysis|Regression analysis]].
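To make the two formulas above concrete, here is a short sketch (again assuming NumPy; all variable names are hypothetical) that computes the moment-matrix estimator for a single endogenous variable and checks that it coincides with $\widehat{a} = (X^{T}X)^{-1}X^{T}Y$:

```python
# Sketch: the moment form a_hat = M_yx M_xx^{-1} agrees with the
# closed form (X^T X)^{-1} X^T Y for a single endogenous variable.
import numpy as np

rng = np.random.default_rng(1)

N, n = 200, 3                          # measurements, exogenous variables
a_true = np.array([1.0, -2.0, 0.5])    # true coefficient vector a
X = rng.normal(size=(N, n))            # observation matrix, rows x_t^T
Y = X @ a_true + 0.1 * rng.normal(size=N)

M_xx = X.T @ X / N                     # N^{-1} sum_t x_t x_t^T
M_yx = Y @ X / N                       # N^{-1} sum_t y_t x_t^T
a_hat_moments = M_yx @ np.linalg.inv(M_xx)

a_hat_ols = np.linalg.inv(X.T @ X) @ X.T @ Y

assert np.allclose(a_hat_moments, a_hat_ols)
print(np.round(a_hat_ols, 3))          # close to a_true
```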
====References====
<table><TR><TD valign="top">[a1]</TD> <TD valign="top"> E. Malinvaud, "Statistical methods of econometrics" , North-Holland (1970) (Translated from French)</TD></TR><TR><TD valign="top">[a2]</TD> <TD valign="top"> H. Theil, "Principles of econometrics" , North-Holland (1971)</TD></TR></table>
Regression matrix. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Regression_matrix&oldid=48475