Regression matrix

From Encyclopedia of Mathematics
Latest revision as of 18:22, 18 December 2020


The matrix $B$ of regression coefficients (cf. Regression coefficient) $\beta_{ji}$, $j = 1, \dots, m$, $i = 1, \dots, r$, in a multi-dimensional linear regression model,

$$ \tag{* } X = B Z + \epsilon . $$

Here $X$ is a matrix with elements $X_{jk}$, $j = 1, \dots, m$, $k = 1, \dots, n$, where $X_{jk}$, $k = 1, \dots, n$, are observations of the $j$-th component of the original $m$-dimensional random variable; $Z$ is a matrix of known regression variables $z_{ik}$, $i = 1, \dots, r$, $k = 1, \dots, n$; and $\epsilon$ is the matrix of errors $\epsilon_{jk}$, $j = 1, \dots, m$, $k = 1, \dots, n$, with ${\mathsf E} \epsilon_{jk} = 0$. The elements $\beta_{ji}$ of the regression matrix $B$ are unknown and have to be estimated. The model (*) is a generalization to the $m$-dimensional case of the general linear model of regression analysis.
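The shapes in the model (*) can be checked numerically. The following is a minimal sketch, assuming NumPy; the sizes `m`, `r`, `n`, the random seed, the normal error distribution and the simulated matrix `B` are illustrative choices and not part of the article, which only requires the errors to have zero expectation.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, arbitrary choice

m, r, n = 2, 3, 50                  # hypothetical sizes: components, regressors, observations

B = rng.normal(size=(m, r))         # regression matrix (unknown in practice; drawn here to simulate data)
Z = rng.normal(size=(r, n))         # known regression variables z_ik
eps = rng.normal(scale=0.1, size=(m, n))  # errors, drawn with zero mean (normality is an assumption)

X = B @ Z + eps                     # the model (*): X = B Z + eps

# Entry-wise form of (*): X_jk = sum_i beta_ji * z_ik + eps_jk
j, k = 1, 7
entry = sum(B[j, i] * Z[i, k] for i in range(r)) + eps[j, k]

print(X.shape)
print(np.isclose(X[j, k], entry))
```

With these conventions $B$ is $m \times r$, $Z$ is $r \times n$, and both $X$ and $\epsilon$ are $m \times n$.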

References

[1] M.G. Kendall, A. Stuart, "The advanced theory of statistics", 3. Design and analysis, and time series, Griffin (1983)

Comments

In econometrics, for example, a frequently used model has $m$ variables $y_1, \dots, y_m$ to be explained (endogenous variables) in terms of $n$ explanatory variables $x_1, \dots, x_n$ (exogenous variables) by means of a linear relationship $y = Ax$. Given $N$ sets of measurements (with errors), $(y_t, x_t)$, $t = 1, \dots, N$, the matrix of relation coefficients $A$ is to be estimated. The model is therefore

$$ y _ {t} = A x _ {t} + \epsilon _ {t} . $$

Under the assumption that the $\epsilon_t$ have zero mean and are independently and identically distributed with normal distribution, this is the so-called standard linear multiple regression model or, briefly, the linear model or standard linear model. The least squares method yields the optimal estimator:

$$ \widehat{A} = M _ {yx} M _ {xx} ^ {-1} , $$

where $ M _ {xx} = N ^ {-1} ( \sum _ {t=1} ^ {N} x _ {t} x _ {t} ^ {T} ) $, $ M _ {yx} = N ^ {-1} ( \sum _ {t=1} ^ {N} y _ {t} x _ {t} ^ {T} ) $. In the case of a single endogenous variable, $ y = a ^ {T} x $, this can be conveniently written as

$$ \widehat{a} = ( X ^ {T} X) ^ {-1} X ^ {T} Y , $$

where $Y$ is the column vector of observations $(y_1, \dots, y_N)^{T}$ and $X$ is the $(N \times n)$ observation matrix consisting of the rows $x_t^{T}$, $t = 1, \dots, N$. Numerous variants and generalizations are considered in [a1], [a2]; cf. also Regression analysis.
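The two estimator formulas above can be compared on simulated data. The following is a minimal sketch, assuming NumPy; the sizes, the seed, the noise scale and the matrix `A_true` are hypothetical values used only to generate data. It forms $M_{xx}$ and $M_{yx}$, computes $\widehat{A} = M_{yx} M_{xx}^{-1}$ by solving a linear system rather than inverting $M_{xx}$ explicitly, and checks that the first row of $\widehat{A}$ agrees with $\widehat{a} = (X^{T} X)^{-1} X^{T} Y$ for the first endogenous variable.

```python
import numpy as np

rng = np.random.default_rng(1)      # fixed seed, arbitrary choice

m, n, N = 2, 3, 200                 # endogenous variables, exogenous variables, measurements
A_true = rng.normal(size=(m, n))    # hypothetical coefficients, used only to simulate data

x = rng.normal(size=(N, n))                 # row t holds x_t^T
eps = rng.normal(scale=0.05, size=(N, m))   # i.i.d. normal errors with zero mean
y = x @ A_true.T + eps                      # row t holds (A x_t + eps_t)^T

# Moment matrices as defined in the text
M_xx = x.T @ x / N                  # (n, n): N^{-1} sum_t x_t x_t^T
M_yx = y.T @ x / N                  # (m, n): N^{-1} sum_t y_t x_t^T

# A_hat = M_yx M_xx^{-1}; since M_xx is symmetric, solve M_xx A_hat^T = M_yx^T
A_hat = np.linalg.solve(M_xx, M_yx.T).T

# Single endogenous variable: a_hat = (X^T X)^{-1} X^T Y
X_obs = x                           # (N, n) observation matrix with rows x_t^T
Y_obs = y[:, 0]                     # observations of the first endogenous variable
a_hat = np.linalg.solve(X_obs.T @ X_obs, X_obs.T @ Y_obs)

# Cross-check against NumPy's generic least squares solver
A_lstsq = np.linalg.lstsq(x, y, rcond=None)[0].T

print(np.allclose(A_hat[0], a_hat))
print(np.allclose(A_hat, A_lstsq))
```

The factor $N^{-1}$ cancels in $M_{yx} M_{xx}^{-1}$, which is why the single-variable formula can be written without it.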

References

[a1] E. Malinvaud, "Statistical methods of econometrics", North-Holland (1970) (Translated from French)
[a2] H. Theil, "Principles of econometrics", North-Holland (1971)
How to Cite This Entry:
Regression matrix. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Regression_matrix&oldid=11825
This article was adapted from an original article by A.V. Prokhorov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article