Unbiased estimator

A statistical estimator whose expectation is that of the quantity to be estimated. Suppose that in the realization of a random variable $ X $ taking values in a probability space $ ( \mathfrak X , \mathfrak B , {\mathsf P} _ \theta ) $, $ \theta \in \Theta $, a function $ f : \Theta \rightarrow \Omega $ has to be estimated, mapping the parameter set $ \Theta $ into a certain set $ \Omega $, and that as an estimator of $ f ( \theta ) $ a statistic $ T = T ( X) $ is chosen. If $ T $ is such that

$$ {\mathsf E} _ \theta \{ T \} = \ \int\limits _ {\mathfrak X } T ( x) d {\mathsf P} _ \theta ( x) = f ( \theta ) $$

holds for $ \theta \in \Theta $, then $ T $ is called an unbiased estimator of $ f ( \theta ) $. An unbiased estimator is frequently called free of systematic errors.

Example 1.

Let $ X _ {1}, \dots, X _ {n} $ be random variables having the same expectation $ \theta $, that is,

$$ {\mathsf E} \{ X _ {1} \} = \dots = {\mathsf E} \{ X _ {n} \} = \theta . $$

In that case the statistic

$$ T = c _ {1} X _ {1} + \dots + c _ {n} X _ {n} ,\ \ c _ {1} + \dots + c _ {n} = 1 , $$

is an unbiased estimator of $ \theta $. In particular, the arithmetic mean of the observations, $ \overline{X} = ( X _ {1} + \dots + X _ {n} ) / n $, is an unbiased estimator of $ \theta $. In this example $ f ( \theta ) \equiv \theta $.
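
The unbiasedness of such weighted averages is easy to check by simulation. The following Python sketch is only an illustration (it is not part of the original article); the sample size, the value of $ \theta $, the weights $ c _ {i} $ and the choice of an exponential law for the observations are all arbitrary.

<pre>
import numpy as np

rng = np.random.default_rng(0)
n, theta, reps = 5, 2.0, 200_000

# any weights c_1, ..., c_n summing to 1 give an unbiased estimator of theta
c = np.array([0.4, 0.3, 0.1, 0.1, 0.1])

# X_1, ..., X_n with common expectation theta (here: exponential with mean theta)
X = rng.exponential(scale=theta, size=(reps, n))

T = X @ c                      # T = c_1 X_1 + ... + c_n X_n
print(T.mean())                # close to theta = 2.0
print(X.mean(axis=1).mean())   # the arithmetic mean, also close to 2.0
</pre>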

Example 2.

Let $ X _ {1}, \dots, X _ {n} $ be independent random variables having the same probability law with distribution function $ F ( x) $, that is,

$$ {\mathsf P} \{ X _ {i} < x \}  = F ( x) ,\  | x | < \infty ,\ \ i = 1 , \dots, n . $$

In this case the empirical distribution function $ F _ {n} ( x) $ constructed from the observations $ X _ {1}, \dots, X _ {n} $ is an unbiased estimator of $ F ( x) $, that is, $ {\mathsf E} \{ F _ {n} ( x) \} = F ( x) $, $ | x | < \infty $.
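
The same kind of check works for the empirical distribution function at a fixed point $ x $. In the sketch below (an illustration only), the observations are uniform on $ [ 0 , 1 ] $, so that $ F ( x) = x $ there; the Monte Carlo mean of $ F _ {n} ( x) $ is compared with $ F ( x) $.

<pre>
import numpy as np

rng = np.random.default_rng(1)
n, reps, x = 20, 100_000, 0.3         # sample size, replications, fixed point x

# X_1, ..., X_n i.i.d. uniform on [0, 1], so F(x) = x for 0 <= x <= 1
X = rng.uniform(0.0, 1.0, size=(reps, n))

Fn_x = (X < x).mean(axis=1)           # empirical distribution function at x
print(Fn_x.mean())                    # close to F(x) = 0.3
</pre>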

Example 3.

Let $ T = T ( X) $ be an unbiased estimator of a parameter $ \theta $, that is, $ {\mathsf E} \{ T \} = \theta $, and assume that $ f ( \theta ) = a \theta + b $ is a linear function. In that case the statistic $ a T + b $ is an unbiased estimator of $ f ( \theta ) $.

The next example shows that there are cases in which unbiased estimators exist and are even unique, but they may turn out to be useless.

Example 4.

Let $ X $ be a random variable subject to the geometric distribution with parameter of success $ \theta $, that is, for any natural number $ k $,

$$ {\mathsf P} \{ X = k \mid \theta \} = \theta ( 1 - \theta ) ^ {k- 1} ,\ 0 \leq \theta \leq 1 . $$

If $ T = T ( X) $ is an unbiased estimator of the parameter $ \theta $, it must satisfy the unbiasedness equation $ {\mathsf E} \{ T \} = \theta $, that is,

$$ \sum _ { k= 1} ^ \infty T ( k) \theta ( 1 - \theta ) ^ {k- 1} = \theta . $$

The unique solution of this equation is

$$ T ( X) = \ \left \{ \begin{array}{ll} 1 & \textrm{ if } X = 1 , \\ 0 & \textrm{ if } X \geq 2 . \\ \end{array} \right .$$

Evidently, $ T $ is good only when $ \theta $ is very close to 1 or 0, otherwise $ T $ carries no useful information on $ \theta $.
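
A small numerical illustration (not from the article) makes the degeneracy visible: each realization of $ T $ is 0 or 1, yet its average over many realizations is close to $ \theta $. The value of $ \theta $ below is arbitrary.

<pre>
import numpy as np

rng = np.random.default_rng(2)
theta, reps = 0.35, 200_000

# geometric law on {1, 2, ...}: P{X = k} = theta * (1 - theta)**(k - 1)
X = rng.geometric(theta, size=reps)

T = (X == 1).astype(float)    # the unique unbiased estimator from the text
print(T.mean())               # close to theta = 0.35
</pre>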

Example 5.

Suppose that a random variable $ X $ has the binomial law with parameters $ n $ and $ \theta $, that is, for any $ k = 0 , \dots, n $,

$$ {\mathsf P} \{ X = k \mid n , \theta \} = \ \left ( \begin{array}{c} n \\ k \end{array} \right ) \theta ^ {k} ( 1 - \theta ) ^ {n- k },\ 0 < \theta < 1 . $$

It is known that the best unbiased estimator of the parameter $ \theta $ (in the sense of minimum quadratic risk) is the statistic $ T = X / n $. Nevertheless, if $ \theta $ is irrational, $ {\mathsf P} \{ T = \theta \} = 0 $. This example reflects a general property of random variables: generally speaking, a random variable need not take values that agree with its expectation. Finally, there are cases in which no unbiased estimator exists at all. Thus, if under the conditions of Example 5 one takes as the function to be estimated $ f ( \theta ) = 1 / \theta $, then (see Example 7) there is no unbiased estimator $ T ( X) $ for $ 1 / \theta $.

The preceding examples demonstrate that the concept of an unbiased estimator does not by itself protect an experimenter from all the complications that arise in the construction of statistical estimators: an unbiased estimator may turn out to be very good, but it may also be totally useless; it may fail to be unique or may not exist at all. Moreover, an unbiased estimator, like every point estimator, has the following deficiency: it only gives an approximate value for the true value of the quantity to be estimated, and this quantity was not known before the experiment and remains unknown after it has been performed. So, in the problem of constructing statistical point estimators there is no serious justification for requiring in every case that the resulting estimator be unbiased, apart from the fact that the study of unbiased estimators leads to a comparatively simple theory. For example, the Rao–Cramér inequality has a simple form for unbiased estimators. Namely, if $ T = T ( X) $ is an unbiased estimator for a function $ f ( \theta ) $, then under fairly broad regularity conditions on the family $ \{ {\mathsf P} _ \theta \} $ and the function $ f ( \theta ) $, the Rao–Cramér inequality implies that

$$ \tag{1 } {\mathsf D} \{ T \}  = {\mathsf E} \{ | T - f ( \theta ) | ^ {2} \}  \geq  \frac{[ f ^ { \prime } ( \theta ) ] ^ {2} }{I ( \theta ) } , $$

where $ I ( \theta ) $ is the Fisher amount of information for $ \theta $. Thus, there is a lower bound for the variance of an unbiased estimator of $ f ( \theta ) $, namely, $ [ f ^ { \prime } ( \theta ) ] ^ {2} / I ( \theta ) $. In particular, if $ f ( \theta ) \equiv \theta $, then it follows from (1) that

$$ {\mathsf D} \{ T \} \geq \frac{1}{I ( \theta ) } . $$

A statistical estimator for which equality is attained in the Rao–Cramér inequality is called efficient (cf. Efficient estimator). Thus, the statistic $ T = X / n $ in Example 5 is an efficient unbiased estimator of the parameter $ \theta $ of the binomial law, since

$$ {\mathsf D} \{ T \} = \frac{1}{n} \theta ( 1 - \theta ) $$

and

$$ I ( \theta )  = {\mathsf E} \left \{ \left [ \frac{\partial}{\partial \theta } \mathop{\rm log} \left ( \theta ^ {X} ( 1 - \theta ) ^ {n - X } \right ) \right ] ^ {2} \right \}  = \frac{n}{\theta ( 1 - \theta ) } , $$

that is, $ T = X / n $ is the best point estimator of $ \theta $ in the sense of minimum quadratic risk in the class of all unbiased estimators.
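
The equality of $ {\mathsf D} \{ T \} $ and $ 1 / I ( \theta ) $ for $ T = X / n $ can be confirmed numerically; the Python sketch below (an illustration with arbitrarily chosen $ n $ and $ \theta $) compares the sample mean and sample variance of $ X / n $ with $ \theta $ and with the bound $ \theta ( 1 - \theta ) / n $.

<pre>
import numpy as np

rng = np.random.default_rng(3)
n, theta, reps = 30, 0.4, 200_000

X = rng.binomial(n, theta, size=reps)
T = X / n

print(T.mean())                    # close to theta = 0.4 (unbiasedness)
print(T.var())                     # close to theta * (1 - theta) / n
print(theta * (1 - theta) / n)     # the Rao-Cramer lower bound 1 / I(theta)
</pre>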

Naturally, an experimenter is interested in the case when the class of unbiased estimators is rich enough to allow the choice of the best unbiased estimator in some sense. In this context an important role is played by the Rao–Blackwell–Kolmogorov theorem, which allows one to construct an unbiased estimator of minimal variance. This theorem asserts that if the family $ \{ {\mathsf P} _ \theta \} $ has a sufficient statistic $ \psi = \psi ( X) $ and $ T = T ( X) $ is an arbitrary unbiased estimator of a function $ f ( \theta ) $, then the statistic $ T ^ {*} = {\mathsf E} _ \theta \{ T \mid \psi \} $ obtained by averaging $ T $ over the fixed sufficient statistic $ \psi $ has a risk not exceeding that of $ T $ relative to any convex loss function for all $ \theta \in \Theta $. If the family $ \{ {\mathsf P} _ \theta \} $ is complete, the statistic $ T ^ {*} $ is uniquely determined. That is, the Rao–Blackwell–Kolmogorov theorem implies that unbiased estimators must be looked for in terms of sufficient statistics, if they exist. The practical value of the Rao–Blackwell–Kolmogorov theorem lies in the fact that it gives a recipe for constructing best unbiased estimators, namely: One has to construct an arbitrary unbiased estimator and then average it over a sufficient statistic.
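
The recipe can be made concrete with a standard illustration that is not part of the original article. For a sample $ X _ {1}, \dots, X _ {n} $ from a Poisson law with parameter $ \theta $, the crude statistic $ T = 1 $ if $ X _ {1} = 0 $ and $ T = 0 $ otherwise is unbiased for $ f ( \theta ) = e ^ {- \theta } $; averaging it over the sufficient statistic $ \psi = X _ {1} + \dots + X _ {n} $ gives $ T ^ {*} = {\mathsf E} \{ T \mid \psi \} = ( 1 - 1 / n ) ^ \psi $, which is again unbiased but has a much smaller variance. The Python sketch below checks both claims by simulation (the values of $ n $ and $ \theta $ are arbitrary).

<pre>
import numpy as np

rng = np.random.default_rng(4)
n, theta, reps = 10, 1.3, 200_000

X = rng.poisson(theta, size=(reps, n))
psi = X.sum(axis=1)                      # sufficient statistic

T      = (X[:, 0] == 0).astype(float)    # crude unbiased estimator of exp(-theta)
T_star = (1.0 - 1.0 / n) ** psi          # its conditional expectation given psi

print(np.exp(-theta))                    # target value exp(-theta)
print(T.mean(), T_star.mean())           # both close to exp(-theta)
print(T.var(), T_star.var())             # the variance of T_star is much smaller
</pre>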

Example 6.

Suppose that a random variable $ X $ has the Pascal distribution (a negative binomial distribution) with parameters $ r $ and $ \theta $ ($ r \geq 2 $, $ 0 \leq \theta \leq 1 $); that is,

$$ {\mathsf P} \{ X = k \mid  r , \theta \}  = \ \left ( \begin{array}{c} k - 1 \\ r - 1 \end{array} \right ) \theta ^ {r} ( 1 - \theta ) ^ {k - r } ,\ \ k = r , r + 1 ,\dots . $$

In this case the statistic $ T = ( r - 1 ) / ( X - 1 ) $ is an unbiased estimator of $ \theta $. Since $ T $ is expressed in terms of the sufficient statistic $ X $ and the system of functions $ 1 , x , x ^ {2}, \dots $ is complete on $ [ 0 , 1 ] $, $ T $ is the only unbiased estimator and, consequently, the best estimator of $ \theta $.
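
For a numerical check (again only an illustration), note that NumPy's negative binomial generator returns the number of failures before the $ r $-th success, so the trial number $ X $ of the $ r $-th success is obtained by adding $ r $. With that adjustment the Monte Carlo mean of $ ( r - 1 ) / ( X - 1 ) $ is close to $ \theta $; the values of $ r $ and $ \theta $ below are arbitrary.

<pre>
import numpy as np

rng = np.random.default_rng(5)
r, theta, reps = 4, 0.3, 200_000

# numpy counts failures before the r-th success; X below is the trial number
# of the r-th success, so that X takes the values r, r + 1, ...
X = r + rng.negative_binomial(r, theta, size=reps)

T = (r - 1) / (X - 1)
print(T.mean())          # close to theta = 0.3
</pre>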

Example 7.

Let $ X $ be a random variable having the binomial law with parameters $ n $ and $ \theta $. The generating function $ Q( z) $ of this law can be expressed by the formula

$$ Q ( z) = {\mathsf E} \{ z ^ {X} \} = \ ( z \theta + q ) ^ {n} ,\ \ q = 1 - \theta , $$

which implies that for any integer $ k = 1, \dots, n $, the $ k $-th derivative

$$ Q ^ {( k)} ( z)  = n ( n - 1 ) \dots ( n - k + 1 ) ( z \theta + q ) ^ {n - k } \theta ^ {k}  = n ^ {[ k]} ( z \theta + q ) ^ {n - k } \theta ^ {k} . $$

On the other hand,

$$ Q ^ {( k)} ( 1) = \ {\mathsf E} [ X ( X - 1 ) \dots ( X - k + 1 ) ] = {\mathsf E} \{ X ^ {[ k]} \} . $$

Hence,

$$ {\mathsf E} \left \{ \frac{1}{n ^ {[ k]} } X ^ {[ k]} \right \} = \theta ^ {k} , $$

that is, the statistic

$$ \tag{2 } T _ {k} ( X) = \ \frac{1}{n ^ {[ k] }} X ^ {[ k]} $$

is an unbiased estimator of $ \theta ^ {k} $, and since $ T _ {k} ( X) $ is expressed in terms of the sufficient statistic $ X $ and the system of functions $ 1 , x , x ^ {2}, \dots $ is complete on $ [ 0 , 1 ] $, it follows that $ T _ {k} ( X) $ is the only, hence the best, unbiased estimator of $ \theta ^ {k} $.
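
A short Python illustration of formula (2), with arbitrary $ n $, $ \theta $ and $ k $ (it is not part of the article): the Monte Carlo mean of $ X ^ {[ k]} / n ^ {[ k]} $ is close to $ \theta ^ {k} $, whereas the plug-in estimator $ ( X / n ) ^ {k} $ is visibly biased.

<pre>
import numpy as np

rng = np.random.default_rng(6)
n, theta, k, reps = 10, 0.4, 3, 200_000

X = rng.binomial(n, theta, size=reps)

# falling factorials X^{[k]} and n^{[k]}
X_fall = np.ones(reps)
n_fall = 1.0
for j in range(k):
    X_fall *= X - j
    n_fall *= n - j

T_k = X_fall / n_fall            # the statistic (2)
print(theta ** k)                # target value theta^k
print(T_k.mean())                # close to theta^k (unbiased)
print(((X / n) ** k).mean())     # the plug-in estimator is noticeably biased
</pre>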

In connection with this example the following question arises: What functions $ f ( \theta ) $ of the parameter $ \theta $ admit an unbiased estimator? A.N. Kolmogorov [1] has shown that this only happens for polynomials of degree $ m \leq n $. Thus, if

$$ f ( \theta ) = a _ {0} + a _ {1} \theta + \dots + a _ {m} \theta ^ {m} ,\ \ 1 \leq m \leq n , $$

then it follows from (2) that the statistic

$$ T = a _ {0} + \sum _ { k= 1} ^ { m } a _ {k} T _ {k} ( X) $$

is the only unbiased estimator of $ f ( \theta ) $. This result implies, in particular, that there is no unbiased estimator of $ f ( \theta ) = 1 / \theta $.

Example 8.

Let $ X $ be a random variable subject to the Poisson law with parameter $ \theta $; that is, for any integer $ k = 0 , 1, \dots $

$$ {\mathsf P} \{ X = k \mid \theta \} = \ \frac{\theta ^ {k} }{k!} e ^ {- \theta } ,\ \ \theta > 0 . $$

Since $ {\mathsf E} \{ X \} = \theta $, the observation of $ X $ by itself is an unbiased estimator of its mathematical expectation $ \theta $. In turn, an unbiased estimator of, say, $ f ( \theta ) = \theta ^ {2} $ is $ X ( X - 1 ) $. More generally, the statistic

$$ X ^ {[ r]} = X ( X - 1 ) \dots ( X - r + 1 ) ,\ r = 1 , 2, \dots $$

is an unbiased estimator of $ f ( \theta ) = \theta ^ {r} $. This fact implies, in particular, that the statistic

$$ T ( X)  = 1 + \sum _ { r= 1} ^  \infty  ( - 1 )  ^ {r} X  ^ {[ r]} $$

is an unbiased estimator of the function $ f ( \theta ) = ( 1 + \theta ) ^ {- 1} $, $ 0 < \theta < 1 $. Quite generally, if $ f ( \theta ) $ admits an unbiased estimator, then the unbiasedness equation $ {\mathsf E} \{ T ( X) \} = f ( \theta ) $ must hold for it, which is equivalent to

$$ \sum _ { k= 0} ^ \infty T ( k) \frac{\theta ^ {k} }{k!} e ^ {- \theta } = f ( \theta ) . $$

From this one deduces that an unbiased estimator exists for any function $ f ( \theta ) $ that admits a power series expansion in its domain of definition $ \Theta \subset \mathbf R _ {1} ^ {+} $.
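
The two estimators mentioned in this example can be checked by simulation; the sketch below (an illustration only, with an arbitrary $ \theta $ between 0 and 1) verifies that the falling factorials $ X ^ {[ r]} $ have means close to $ \theta ^ {r} $ and that the alternating sum $ T ( X) $, which is in fact a finite sum because $ X ^ {[ r]} = 0 $ for $ r > X $, has mean roughly $ ( 1 + \theta ) ^ {- 1} $ (the agreement is only rough because this statistic has a very large variance).

<pre>
import numpy as np

rng = np.random.default_rng(7)
theta, reps = 0.3, 500_000

X = rng.poisson(theta, size=reps)

def falling(x, r):
    """Falling factorial x^{[r]} = x (x - 1) ... (x - r + 1)."""
    out = np.ones_like(x, dtype=float)
    for j in range(r):
        out *= x - j
    return out

print((X * (X - 1)).mean(), theta ** 2)     # unbiased estimator of theta^2
print(falling(X, 3).mean(), theta ** 3)     # unbiased estimator of theta^3

# the alternating sum 1 + sum_r (-1)^r X^{[r]} (finite: X^{[r]} = 0 for r > X)
T = sum((-1.0) ** r * falling(X, r) for r in range(X.max() + 1))
print(T.mean(), 1.0 / (1.0 + theta))        # roughly (1 + theta)^{-1}
</pre>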

Example 9.

Suppose that the independent random variables $ X _ {1}, \dots, X _ {n} $ have the same Poisson law with parameter $ \theta $, $ \theta > 0 $. The generating function of this law, which can be expressed by the formula

$$ g _ {z} ( \theta ) = \mathop{\rm exp} \{ \theta ( z - 1 ) \} , $$

is an entire analytic function and hence has a unique unbiased estimator. In this case a sufficient statistic is $ X = X _ {1} + \dots + X _ {n} $, which has the Poisson law with parameter $ n \theta $. If $ T ( X) $ is an unbiased estimator of $ g _ {z} ( \theta ) $, then it must satisfy the unbiasedness equation

$$ {\mathsf E} _  \theta  \{ T ( X) \}  = g _ {z} ( \theta )  = \ e ^ {\theta ( z- 1) } , $$

which implies that

$$ T ( X)  = \sum _ {k= 0} ^ {X} \left ( \begin{array}{c} X \\ k \end{array} \right ) \left ( \frac{1}{n} \right ) ^ {k} \left ( 1 - \frac{1}{n} \right ) ^ {X - k} z ^ {k}  = \left ( 1 + \frac{z - 1 }{n} \right ) ^ {X} ; $$

that is, an unbiased estimator of the generating function of the Poisson law is the generating function of the binomial law with parameters $ X $ and $ 1 / n $.
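
As a numerical illustration of Example 9 (a sketch only; $ n $, $ \theta $ and $ z $ below are arbitrary), note that the generating function of the binomial law with parameters $ X $ and $ 1 / n $, evaluated at $ z $, equals $ ( 1 + ( z - 1 ) / n ) ^ {X} $; its Monte Carlo mean over realizations of $ X = X _ {1} + \dots + X _ {n} $ is close to $ \mathop{\rm exp} \{ \theta ( z - 1 ) \} $.

<pre>
import numpy as np

rng = np.random.default_rng(8)
n, theta, z, reps = 5, 1.7, 0.6, 200_000

# the sufficient statistic X = X_1 + ... + X_n has the Poisson law with parameter n*theta
X = rng.poisson(n * theta, size=reps)

T = (1.0 + (z - 1.0) / n) ** X     # generating function of Bin(X, 1/n) at the point z
print(np.exp(theta * (z - 1.0)))   # target value g_z(theta)
print(T.mean())                    # close to the target
</pre>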

Examples 6–9 demonstrate that in certain cases, which occur quite frequently in practice, the problem of constructing best estimators is easily solvable, provided that one restricts attention to the class of unbiased estimators. Kolmogorov [1] has considered the problem of constructing unbiased estimators, in particular, for the distribution function of a normal law with unknown parameters. A more general definition of an unbiased estimator is due to E. Lehmann [2], according to whom a statistical estimator $ T = T ( X) $ of a parameter $ \theta $ is called unbiased relative to a loss function $ L ( \theta , T ) $ if

$$ {\mathsf E} _ \theta \{ L ( \theta ^ \prime , T( X) ) \} \geq {\mathsf E} _ \theta \{ L ( \theta , T ( X) ) \} \ \ \textrm{ for all } \ \theta , \theta ^ \prime \in \Theta . $$

There is also a modification of this definition (see [3]). Yu.V. Linnik and his students (see [4]) have established that under fairly wide assumptions the best unbiased estimator is independent of the loss function.

References

[1] A.N. Kolmogorov, "Unbiased estimates" Izv. Akad. Nauk SSSR Ser. Mat. , 14 : 4 (1950) pp. 303–326 (In Russian)
[2] E.L. Lehmann, "Testing statistical hypotheses" , Wiley (1959)
[3] L.B. Klebanov, "A general definition of unbiasedness" Theor. Probab. Appl. , 21 : 3 (1976) pp. 571–585; Teor. Veroyatnost. i Primenen. , 21 : 3 (1976) pp. 584–598
[4] L.B. Klebanov, Yu.V. Linnik, A.L. Rukhin, "Unbiased estimation and matrix loss functions" Soviet Math. Dokl. , 12 : 5 (1971) pp. 1526–1528; Dokl. Akad. Nauk SSSR , 200 : 5 (1971) pp. 1024–1025
[5] S. Zacks, "The theory of statistical inference" , Wiley (1971)