The matrix $B$ of regression coefficients (cf. Regression coefficient) $\beta_{ji}$, $j = 1 \dots m$, $i = 1 \dots r$, in a multi-dimensional linear regression model

$$ \tag{*} X = B Z + \epsilon . $$

Here $X$ is a matrix with elements $X_{jk}$, $j = 1 \dots m$, $k = 1 \dots n$, where $X_{jk}$, $k = 1 \dots n$, are observations of the $j$-th component of the original $m$-dimensional random variable, $Z$ is a matrix of known regression variables $z_{ik}$, $i = 1 \dots r$, $k = 1 \dots n$, and $\epsilon$ is the matrix of errors $\epsilon_{jk}$, $j = 1 \dots m$, $k = 1 \dots n$, with ${\mathsf E} \epsilon_{jk} = 0$. The elements $\beta_{ji}$ of the regression matrix $B$ are unknown and have to be estimated. The model (*) is a generalization to the $m$-dimensional case of the general linear model of regression analysis.
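As an illustration (not part of the original article), the following is a minimal numerical sketch of model (*), assuming NumPy. The least-squares formula $\widehat{B} = X Z^{T} ( Z Z^{T} )^{-1}$ used here is the matrix form of the least squares method discussed in the comments below.

```python
import numpy as np

rng = np.random.default_rng(0)

m, r, n = 3, 2, 500                  # X is (m x n), B is (m x r), Z is (r x n)
B_true = rng.normal(size=(m, r))     # hypothetical true regression matrix
Z = rng.normal(size=(r, n))          # known regression variables
eps = 0.1 * rng.normal(size=(m, n))  # errors with E[eps_jk] = 0
X = B_true @ Z + eps                 # model (*): X = B Z + eps

# Least-squares estimate of the regression matrix:
# B_hat = X Z^T (Z Z^T)^{-1}
B_hat = X @ Z.T @ np.linalg.inv(Z @ Z.T)
print(np.round(B_hat - B_true, 3))   # residual shrinks as n grows
```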
References
[1] M.G. Kendall, A. Stuart, "The advanced theory of statistics", 3. Design and analysis, and time series, Griffin (1983)
Comments
In econometrics, for example, a frequently used model is that one has $m$ variables $y_{1} \dots y_{m}$ to be explained (endogenous variables) in terms of $n$ explanatory variables $x_{1} \dots x_{n}$ (exogenous variables) by means of a linear relationship $y = A x$. Given $N$ sets of measurements (with errors), $( y_{t} , x_{t} )$, the matrix of relation coefficients $A$ is to be estimated. The model is therefore

$$ y_{t} = A x_{t} + \epsilon_{t} . $$
Under the assumption that the $\epsilon_{t}$ have zero mean and are independently and identically normally distributed, this is the so-called standard linear multiple regression model or, briefly, the linear model or standard linear model. The least squares method yields the optimal estimator

$$ \widehat{A} = M_{yx} M_{xx} ^ {-1} , $$

where $M_{xx} = N^{-1} \sum_{t=1}^{N} x_{t} x_{t}^{T}$ and $M_{yx} = N^{-1} \sum_{t=1}^{N} y_{t} x_{t}^{T}$. In the case of a single endogenous variable, $y = a^{T} x$, this can be conveniently written as

$$ \widehat{a} = ( X^{T} X ) ^ {-1} X^{T} Y , $$

where $Y$ is the column vector of observations $( y_{1} \dots y_{N} )^{T}$ and $X$ is the $( N \times n )$ observation matrix consisting of the rows $x_{t}^{T}$, $t = 1 \dots N$. (In this notation $M_{xx} = N^{-1} X^{T} X$ and $M_{yx} = N^{-1} Y^{T} X$, so the two formulas agree.) Numerous variants and generalizations are considered in [a1], [a2]; cf. also Regression analysis.
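A short NumPy sketch (added here as an illustration, not from the source) of the estimator $\widehat{A} = M_{yx} M_{xx}^{-1}$, checking that for a single endogenous variable it reduces to $\widehat{a} = ( X^{T} X )^{-1} X^{T} Y$; all dimensions and data below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

m, n_exog, N = 2, 4, 1000                 # endogenous dim, exogenous dim, sample size
A_true = rng.normal(size=(m, n_exog))     # hypothetical true relation coefficients
x = rng.normal(size=(N, n_exog))          # observation matrix: rows are x_t^T
eps = 0.1 * rng.normal(size=(N, m))
y = x @ A_true.T + eps                    # y_t = A x_t + eps_t, rows are y_t^T

# Moment matrices M_xx = N^{-1} sum x_t x_t^T, M_yx = N^{-1} sum y_t x_t^T
M_xx = (x.T @ x) / N
M_yx = (y.T @ x) / N
A_hat = M_yx @ np.linalg.inv(M_xx)        # A_hat = M_yx M_xx^{-1}

# Single endogenous variable: a_hat = (X^T X)^{-1} X^T Y
Y0 = y[:, 0]                              # observations of the first endogenous variable
a_hat = np.linalg.inv(x.T @ x) @ x.T @ Y0
print(np.allclose(a_hat, A_hat[0]))       # True: the two formulas agree
```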
References
[a1] E. Malinvaud, "Statistical methods of econometrics", North-Holland (1970) (Translated from French)
[a2] H. Theil, "Principles of econometrics", North-Holland (1971)
Regression matrix. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Regression_matrix&oldid=48475