Difference between revisions of "Regression matrix"

Latest revision as of 18:22, 18 December 2020

The matrix $B$ of regression coefficients (cf. Regression coefficient) $\beta_{ji}$, $j = 1, \dots, m$, $i = 1, \dots, r$, in a multi-dimensional linear regression model,

$$ \tag{*} X = B Z + \epsilon . $$

Here $X$ is an $m \times n$ matrix with elements $X_{jk}$, $j = 1, \dots, m$, $k = 1, \dots, n$, where $X_{jk}$, $k = 1, \dots, n$, are the observations of the $j$-th component of the original $m$-dimensional random variable; $Z$ is an $r \times n$ matrix of known regression variables $z_{ik}$, $i = 1, \dots, r$, $k = 1, \dots, n$; and $\epsilon$ is the $m \times n$ matrix of errors $\epsilon_{jk}$, $j = 1, \dots, m$, $k = 1, \dots, n$, with ${\mathsf E} \epsilon_{jk} = 0$. The elements $\beta_{ji}$ of the regression matrix $B$ are unknown and have to be estimated. The model (*) is a generalization to the $m$-dimensional case of the general linear model of regression analysis.
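As an illustration of how $B$ might be estimated in the model (*), the following Python sketch simulates data and computes the ordinary least-squares estimate $\widehat{B} = X Z^T (Z Z^T)^{-1}$. This estimator is not stated in the text above; it is the standard least-squares solution and assumes that $Z Z^T$ is non-singular. The dimensions, the noise level and the names `B_true` and `B_hat` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

m, r, n = 2, 3, 500                     # illustrative dimensions
B_true = rng.normal(size=(m, r))        # regression matrix (simulated, normally unknown)
Z = rng.normal(size=(r, n))             # known regression variables z_ik
eps = 0.1 * rng.normal(size=(m, n))     # zero-mean errors eps_jk
X = B_true @ Z + eps                    # observations generated by model (*)

# Least-squares estimate B_hat = X Z^T (Z Z^T)^{-1}.
# Rather than forming the inverse, solve (Z Z^T) B_hat^T = Z X^T.
B_hat = np.linalg.solve(Z @ Z.T, Z @ X.T).T

print(np.round(B_hat - B_true, 3))      # estimation error, small for large n
```

Solving the linear system instead of inverting $Z Z^T$ explicitly is a numerical design choice; `np.linalg.lstsq(Z.T, X.T, rcond=None)` would give the same estimate (transposed) and is generally more robust when $Z Z^T$ is ill-conditioned.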

References

[1]	M.G. Kendall, A. Stuart, "The advanced theory of statistics", 3. Design and analysis, and time series, Griffin (1983)

Comments

In econometrics, for example, a frequently used model has $m$ variables $y_1, \dots, y_m$ to be explained (endogenous variables) in terms of $n$ explanatory variables $x_1, \dots, x_n$ (exogenous variables) by means of a linear relationship $y = Ax$. Given $N$ sets of measurements (with errors), $(y_t, x_t)$, $t = 1, \dots, N$, the matrix of relation coefficients $A$ is to be estimated. The model is therefore

$$ y _ {t} = A x _ {t} + \epsilon _ {t} . $$

Under the assumption that the $\epsilon_t$ have zero mean and are independent and identically distributed with a normal distribution, this is the so-called standard linear multiple regression model or, briefly, linear model or standard linear model. The least squares method yields the optimal estimator:

$$ \widehat{A} = M _ {yx} M _ {xx} ^ {-1} , $$

where $M_{xx} = N^{-1} \sum_{t=1}^{N} x_t x_t^T$ and $M_{yx} = N^{-1} \sum_{t=1}^{N} y_t x_t^T$. In the case of a single endogenous variable, $y = a^T x$, this can be conveniently written as
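The estimator $\widehat{A} = M_{yx} M_{xx}^{-1}$ can be sketched numerically as follows. The simulated data, the dimensions and the variable names are assumptions made for illustration only; the array `x` stores $x_t^T$ in row $t$ and `y` stores $y_t^T$ in row $t$, and $M_{xx}$ is assumed to be non-singular.

```python
import numpy as np

rng = np.random.default_rng(1)

m, n, N = 2, 3, 1000                        # illustrative sizes
A_true = rng.normal(size=(m, n))            # coefficient matrix (simulated)
x = rng.normal(size=(N, n))                 # row t holds x_t^T
y = x @ A_true.T + 0.1 * rng.normal(size=(N, m))   # row t holds y_t^T = (A x_t + eps_t)^T

M_xx = x.T @ x / N                          # (1/N) * sum_t x_t x_t^T, shape (n, n)
M_yx = y.T @ x / N                          # (1/N) * sum_t y_t x_t^T, shape (m, n)

# A_hat = M_yx M_xx^{-1}; since M_xx is symmetric, solve M_xx A_hat^T = M_yx^T.
A_hat = np.linalg.solve(M_xx, M_yx.T).T

print(np.round(A_hat - A_true, 3))          # estimation error, small for large N
```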

$$ \widehat{a} = ( X ^ {T} X) ^ {-1} X ^ {T} Y , $$

where $Y$ is the column vector of observations $(y_1, \dots, y_N)^T$ and $X$ is the $(N \times n)$ observation matrix consisting of the rows $x_t^T$, $t = 1, \dots, N$. Numerous variants and generalizations are considered [a1], [a2]; cf. also Regression analysis.
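The single-variable formula can be checked against the general one. A short derivation, assuming $m = 1$, so that each $y_t$ is a scalar, $M_{yx}$ is a $1 \times n$ row vector and $\widehat{A} = \widehat{a}^T$: since the rows of $X$ are $x_t^T$,

$$ X^T X = \sum_{t=1}^{N} x_t x_t^T = N M_{xx} , \qquad X^T Y = \sum_{t=1}^{N} x_t y_t = N M_{yx}^T , $$

and therefore, using the symmetry of $M_{xx}$,

$$ \widehat{a} = \widehat{A}^T = ( M_{yx} M_{xx}^{-1} )^T = M_{xx}^{-1} M_{yx}^T = ( X^T X )^{-1} X^T Y . $$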

References

[a1]	E. Malinvaud, "Statistical methods of econometrics", North-Holland (1970) (Translated from French)
[a2]	H. Theil, "Principles of econometrics", North-Holland (1971)

How to Cite This Entry:
Regression matrix. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Regression_matrix&oldid=11825

This article was adapted from an original article by A.V. Prokhorov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.
