# Regression matrix

The matrix $ B $ of regression coefficients (cf. Regression coefficient) $ \beta _ {ji} $, $ j = 1 \dots m $, $ i = 1 \dots r $, in a multi-dimensional linear regression model,

$$ \tag{* } X = B Z + \epsilon . $$

Here $ X $ is a matrix with elements $ X _ {jk} $, $ j = 1 \dots m $, $ k = 1 \dots n $, where $ X _ {jk} $, $ k = 1 \dots n $, are observations of the $ j $-th component of the original $ m $-dimensional random variable, $ Z $ is a matrix of known regression variables $ z _ {ik} $, $ i = 1 \dots r $, $ k = 1 \dots n $, and $ \epsilon $ is the matrix of errors $ \epsilon _ {jk} $, $ j = 1 \dots m $, $ k = 1 \dots n $, with $ {\mathsf E} \epsilon _ {jk} = 0 $. The elements $ \beta _ {ji} $ of the regression matrix $ B $ are unknown and have to be estimated. The model (*) is a generalization to the $ m $-dimensional case of the general linear model of regression analysis.
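The shapes in the model (*) can be made concrete with a small simulation. The following is a minimal NumPy sketch and not part of the original entry: the dimensions, the random seed, the noise level and the names `B_true` and `B_hat` are illustrative assumptions. The least-squares estimate $ X Z ^ {T} ( Z Z ^ {T} ) ^ {-1} $ is used here as one common way to estimate $ B $; it assumes that $ Z Z ^ {T} $ is invertible.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, for reproducibility only

# Hypothetical dimensions: m response components, r regressors, n observations.
m, r, n = 2, 3, 500

B_true = rng.normal(size=(m, r))        # regression matrix (known here only because we simulate)
Z = rng.normal(size=(r, n))             # known regression variables z_ik
eps = 0.1 * rng.normal(size=(m, n))     # errors with zero mean
X = B_true @ Z + eps                    # model (*): X = B Z + eps

# Least-squares estimate B_hat = X Z^T (Z Z^T)^{-1}, obtained by solving
# (Z Z^T) B_hat^T = Z X^T rather than forming an explicit inverse.
B_hat = np.linalg.solve(Z @ Z.T, Z @ X.T).T

print(B_hat.shape)                      # same shape as B, i.e. (m, r)
print(np.max(np.abs(B_hat - B_true)))   # largest coefficient error
```

Solving the linear system instead of inverting $ Z Z ^ {T} $ is a numerical convenience; both give the same estimate when $ Z Z ^ {T} $ is well conditioned.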

#### References

[1] M.G. Kendall, A. Stuart, "The advanced theory of statistics", 3. Design and analysis, and time series, Griffin (1983)

#### Comments

In econometrics, for example, a frequently used model has $ m $ variables $ y _ {1} \dots y _ {m} $ to be explained (endogenous variables) in terms of $ n $ explanatory variables $ x _ {1} \dots x _ {n} $ (exogenous variables) by means of a linear relationship $ y = Ax $. Given $ N $ sets of measurements (with errors), $ ( y _ {t} , x _ {t} ) $, $ t = 1 \dots N $, the matrix of relation coefficients $ A $ is to be estimated. The model is therefore

$$ y _ {t} = A x _ {t} + \epsilon _ {t} . $$

If the $ \epsilon _ {t} $ are assumed to have zero mean and to be independently and identically distributed with a normal distribution, this is the so-called standard linear multiple regression model or, briefly, the linear model or standard linear model. The least squares method yields the optimal estimator:

$$ \widehat{A} = M _ {yx} M _ {xx} ^ {-1} , $$

where $ M _ {xx} = N ^ {-1} ( \sum _ {t=1} ^ {N} x _ {t} x _ {t} ^ {T} ) $, $ M _ {yx} = N ^ {-1} ( \sum _ {t=1} ^ {N} y _ {t} x _ {t} ^ {T} ) $. In the case of a single endogenous variable, $ y = a ^ {T} x $, this can be conveniently written as

$$ \widehat{a} = ( X ^ {T} X) ^ {-1} X ^ {T} Y , $$

where $ Y $ is the column vector of observations $ ( y _ {1} \dots y _ {N} ) ^ {T} $ and $ X $ is the $ ( N \times n ) $ observation matrix consisting of the rows $ x _ {t} ^ {T} $, $ t = 1 \dots N $. Numerous variants and generalizations are considered [a1], [a2]; cf. also Regression analysis.
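The two estimators above can be checked numerically. The sketch below is an illustration on simulated data, under assumed dimensions; the names `A_true`, `xs`, `ys` and the noise scale are hypothetical and do not come from the entry. It forms $ M _ {xx} $ and $ M _ {yx} $ from $ N $ observation pairs, computes $ \widehat{A} = M _ {yx} M _ {xx} ^ {-1} $, and compares one row of $ \widehat{A} $ with the single-equation formula $ ( X ^ {T} X) ^ {-1} X ^ {T} Y $.

```python
import numpy as np

rng = np.random.default_rng(1)          # fixed seed, for reproducibility only

# Hypothetical sizes: m endogenous variables, n exogenous variables, N measurements.
m, n, N = 2, 3, 400

A_true = rng.normal(size=(m, n))        # coefficient matrix used to simulate the data
xs = rng.normal(size=(N, n))            # row t holds x_t^T
ys = xs @ A_true.T + 0.1 * rng.normal(size=(N, m))  # row t holds (A x_t + eps_t)^T

# Moment matrices M_xx = N^{-1} sum_t x_t x_t^T and M_yx = N^{-1} sum_t y_t x_t^T.
M_xx = xs.T @ xs / N                    # shape (n, n)
M_yx = ys.T @ xs / N                    # shape (m, n)

# A_hat = M_yx M_xx^{-1}; M_xx is symmetric, so solve M_xx A_hat^T = M_yx^T.
A_hat = np.linalg.solve(M_xx, M_yx.T).T

# Single endogenous variable: take the first component of y as Y.
X = xs                                  # (N x n) observation matrix with rows x_t^T
Y = ys[:, 0]                            # column of observations y_1, ..., y_N
a_hat = np.linalg.solve(X.T @ X, X.T @ Y)   # (X^T X)^{-1} X^T Y

# The single-equation estimate should match the first row of A_hat up to rounding.
print(np.allclose(a_hat, A_hat[0]))
```

The agreement follows because the factors $ N ^ {-1} $ in $ M _ {yx} $ and $ M _ {xx} ^ {-1} $ cancel, so each row of $ \widehat{A} $ is the transposed single-equation estimate for the corresponding endogenous variable.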

#### References

[a1] E. Malinvaud, "Statistical methods of econometrics", North-Holland (1970) (Translated from French)

[a2] H. Theil, "Principles of econometrics", North-Holland (1971)

**How to Cite This Entry:**

Regression matrix.

*Encyclopedia of Mathematics.* URL: http://encyclopediaofmath.org/index.php?title=Regression_matrix&oldid=51014