''BLUE''

Let
  
$$ \tag{a1 }
Y = X \beta + \epsilon
$$
  
be a [[Linear regression|linear regression]] model, where $Y$ is a random column vector of $n$ "measurements", $X \in \mathbf R ^{n \times p}$ is a known non-random "plan" matrix, $\beta \in \mathbf R ^{p \times 1}$ is an unknown vector of parameters, and $\epsilon$ is a random "error", or "noise", vector with mean ${\mathsf E} \epsilon = 0$ and a possibly unknown non-singular covariance matrix $V = \mathop{\rm Var} ( \epsilon )$. A model with linear restrictions on $\beta$ can obviously be reduced to (a1). Without loss of generality, $\mathop{\rm rank} ( X ) = p$.
  
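For illustration, here is a minimal numerical sketch of model (a1); the sizes $n = 50$, $p = 2$ and the particular $X$, $\beta$, $V$ below are arbitrary choices, not part of the general model.

<pre>
# A minimal sketch of model (a1); the particular n, p, X, beta, V are
# illustrative assumptions, not part of the general model.
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 2
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])  # "plan" matrix, rank(X) = p
beta = np.array([1.0, 0.5])                    # unknown parameter vector
V = np.diag(1.0 + 0.1 * np.arange(n))          # non-singular covariance Var(eps)
eps = rng.multivariate_normal(np.zeros(n), V)  # "noise" with E(eps) = 0
Y = X @ beta + eps                             # vector of "measurements"
</pre>
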
Let $K \in \mathbf R ^{k \times p}$; a linear unbiased estimator (LUE) of $K \beta$ is a [[Statistical estimator|statistical estimator]] of the form $MY$ for some non-random matrix $M \in \mathbf R ^{k \times n}$ such that ${\mathsf E} MY = K \beta$ for all $\beta \in \mathbf R ^{p \times 1}$, i.e., such that $MX = K$. A linear [[Unbiased estimator|unbiased estimator]] $M_{*} Y$ of $K \beta$ is called a best linear unbiased estimator (BLUE) of $K \beta$ if $\mathop{\rm Var} ( M_{*} Y ) \leq \mathop{\rm Var} ( MY )$ for all linear unbiased estimators $MY$ of $K \beta$, i.e., if $\mathop{\rm Var} ( a M_{*} Y ) \leq \mathop{\rm Var} ( a MY )$ for all linear unbiased estimators $MY$ of $K \beta$ and all $a \in \mathbf R ^{1 \times k}$.
  
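For example, with $K = I_{p}$ the ordinary least-squares matrix $M = ( X^{T} X )^{-1} X^{T}$ satisfies $MX = K$, so $MY$ is a linear unbiased estimator of $\beta$. A quick numerical check of this condition, continuing the sketch above:

<pre>
# Check the LUE condition MX = K for the ordinary least-squares choice
# M = (X^T X)^{-1} X^T with K = I_p (continuing the sketch above).
import numpy as np

K = np.eye(p)
M = np.linalg.solve(X.T @ X, X.T)  # (X^T X)^{-1} X^T, without forming an explicit inverse
assert np.allclose(M @ X, K)       # MX = K, hence E(MY) = K beta for every beta
</pre>
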
Since it is assumed that $\mathop{\rm rank} ( X ) = p$, there exists a unique best linear unbiased estimator of $K \beta$ for any $K$. It is given by the formula $K {\widehat \beta}$, where ${\widehat \beta} = {\widehat \beta}_{V} = ( X^{T} V^{-1} X )^{-1} X^{T} V^{-1} Y$, which by the Gauss–Markov theorem (cf. [[Least squares, method of|Least squares, method of]]) coincides with the least-squares estimator of $\beta$, defined as $\mathop{\rm arg\,min}_\beta ( Y - X \beta )^{T} V^{-1} ( Y - X \beta )$; as usual, ${}^{T}$ stands for transposition.
  
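Continuing the sketch above, a direct computation of ${\widehat \beta}_{V}$ by this formula (here $V$ is taken as known; linear systems are solved rather than inverses formed explicitly):

<pre>
# BLUE of beta when V is known (generalized least squares):
# beta_hat_V = (X^T V^{-1} X)^{-1} X^T V^{-1} Y, continuing the sketch above.
import numpy as np

Vinv_X = np.linalg.solve(V, X)   # V^{-1} X
Vinv_Y = np.linalg.solve(V, Y)   # V^{-1} Y
beta_hat_V = np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_Y)

# For comparison: ordinary least squares, which is the BLUE only when
# V is proportional to the identity matrix.
beta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
</pre>
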
Because $V = \mathop{\rm Var} ( \epsilon )$ is normally not known, Yu.A. Rozanov [[#References|[a2]]] suggested using a "pseudo-best" estimator ${\widehat \beta}_{W}$ in place of ${\widehat \beta}_{V}$, with an appropriately chosen $W$. This idea has been further developed by A.M. Samarov [[#References|[a3]]] and I.F. Pinelis [[#References|[a4]]]. In particular, Pinelis has obtained duality theorems for the minimax risk and equations for the minimax solutions $V$, assumed to belong to an arbitrary known convex set ${\mathcal V}$ of positive-definite $( n \times n )$-matrices, with respect to the general quadratic risk function of the form
  
$$
R ( V,W ) = {\mathsf E} _ {V} ( {\widehat \beta } _ {W} - \beta ) ^ {T} S ( {\widehat \beta } _ {W} - \beta ) ,
$$
  
$$
V \in {\mathcal V}, \quad W \in {\mathcal V},
$$
  
where $S$ is any non-negative-definite $( p \times p )$-matrix and ${\mathsf E}_{V}$ stands for the expectation under the assumption that $\mathop{\rm Var} ( \epsilon ) = V$. Asymptotic versions of these results have also been given by Pinelis, for the case when the "noise" is a second-order stationary stochastic process with an unknown spectral density belonging to an arbitrary, but known, convex class of spectral densities, and by Samarov, for contamination classes.
  
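Writing ${\widehat \beta}_{W} - \beta = A \epsilon$ with $A = ( X^{T} W^{-1} X )^{-1} X^{T} W^{-1}$, the risk takes the closed form $R ( V,W ) = \mathop{\rm tr} ( S A V A^{T} )$. A sketch evaluating it, continuing the example above; the choices $S = I_{p}$ and $W = I_{n}$ are illustrative assumptions:

<pre>
# Closed form of the quadratic risk: beta_hat_W - beta = A eps with
# A = (X^T W^{-1} X)^{-1} X^T W^{-1}, so R(V, W) = tr(S A V A^T).
# The choices S = I_p and W = I_n below are illustrative assumptions.
import numpy as np

def quadratic_risk(V, W, X, S):
    # W is assumed symmetric positive-definite, so (W^{-1} X)^T = X^T W^{-1}.
    W_inv_X = np.linalg.solve(W, X)                # W^{-1} X
    A = np.linalg.solve(X.T @ W_inv_X, W_inv_X.T)  # (X^T W^{-1} X)^{-1} X^T W^{-1}
    return np.trace(S @ A @ V @ A.T)

S = np.eye(p)
W = np.eye(n)                        # working guess for Var(eps)
print(quadratic_risk(V, W, X, S))    # risk of the "pseudo-best" estimator beta_hat_W
print(quadratic_risk(V, V, X, S))    # risk at W = V (the BLUE); never larger
</pre>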
 
====References====

<table><TR><TD valign="top">[a1]</TD> <TD valign="top">  C.R. Rao,  "Linear statistical inference and its applications" , Wiley  (1965)</TD></TR><TR><TD valign="top">[a2]</TD> <TD valign="top">  Yu.A. Rozanov,  "On a new class of estimates" , ''Multivariate Analysis'' , '''2''' , Acad. Press  (1969)  pp. 437–441</TD></TR><TR><TD valign="top">[a3]</TD> <TD valign="top">  A.M. Samarov,  "Robust spectral regression"  ''Ann. Math. Stat.'' , '''15'''  (1987)  pp. 99–111</TD></TR><TR><TD valign="top">[a4]</TD> <TD valign="top">  I.F. Pinelis,  "On the minimax estimation of regression"  ''Th. Probab. Appl.'' , '''35'''  (1990)  pp. 500–512</TD></TR></table>
