# Unbiased estimator

A statistical estimator whose expectation equals the quantity to be estimated. Suppose that, from a realization of a random variable $X$ taking values in a probability space $( \mathfrak X , \mathfrak B , {\mathsf P} _ \theta )$, $\theta \in \Theta$, one has to estimate a function $f : \Theta \rightarrow \Omega$ mapping the parameter set $\Theta$ into a certain set $\Omega$, and that a statistic $T = T ( X)$ is chosen as an estimator of $f ( \theta )$. If $T$ is such that

$${\mathsf E} _ \theta \{ T \} = \ \int\limits _ {\mathfrak X } T ( x) d {\mathsf P} _ \theta ( x) = f ( \theta )$$

holds for all $\theta \in \Theta$, then $T$ is called an unbiased estimator of $f ( \theta )$. An unbiased estimator is frequently said to be free of systematic errors.

### Example 1.

Let $X _ {1} , \dots , X _ {n}$ be random variables having the same expectation $\theta$, that is,

$${\mathsf E} \{ X _ {1} \} = \dots = {\mathsf E} \{ X _ {n} \} = \theta .$$

In that case the statistic

$$T = c _ {1} X _ {1} + \dots + c _ {n} X _ {n} ,\ \ c _ {1} + \dots + c _ {n} = 1 ,$$

is an unbiased estimator of $\theta$. In particular, the arithmetic mean of the observations, $\overline{X}\; = ( X _ {1} + \dots + X _ {n} ) / n$, is an unbiased estimator of $\theta$. In this example $f ( \theta ) \equiv \theta$.
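
The claim follows from linearity of expectation and can be checked exactly on a small case. The sketch below is illustrative only: the three discrete laws, the common mean $\theta = 2$, the weights and the function name `expectation_of_linear_statistic` are assumptions chosen for the example, and independence is assumed only to make the enumeration of joint outcomes simple.

```python
from fractions import Fraction
from itertools import product

# Hypothetical discrete laws, each given as {value: probability}; all have mean 2.
laws = [
    {1: Fraction(1, 2), 3: Fraction(1, 2)},
    {0: Fraction(1, 3), 2: Fraction(1, 3), 4: Fraction(1, 3)},
    {2: Fraction(1)},
]
weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # c_1 + c_2 + c_3 = 1


def expectation_of_linear_statistic(laws, weights):
    """Exact E{c_1 X_1 + ... + c_n X_n} by enumerating all joint outcomes."""
    total = Fraction(0)
    for outcome in product(*(law.items() for law in laws)):
        prob = Fraction(1)
        value = Fraction(0)
        for c, (x, p) in zip(weights, outcome):
            prob *= p       # joint probability under the independence assumption
            value += c * x  # value of the weighted statistic on this outcome
        total += prob * value
    return total


expected_T = expectation_of_linear_statistic(laws, weights)
expected_mean = expectation_of_linear_statistic(laws, [Fraction(1, 3)] * 3)
print(expected_T, expected_mean)
```

Both expectations equal the common mean, whatever weights summing to 1 are used.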

### Example 2.

Let $X _ {1} , \dots , X _ {n}$ be independent random variables having the same probability law with distribution function $F ( x)$, that is,

$${\mathsf P} \{ X _ {i} < x \} = F ( x) ,\ | x | < \infty ,\ \ i = 1 , \dots , n .$$

In this case the empirical distribution function $F _ {n} ( x)$ constructed from the observations $X _ {1} , \dots , X _ {n}$ is an unbiased estimator of $F ( x)$, that is, ${\mathsf E} \{ F _ {n} ( x) \} = F ( x)$, $| x | < \infty$.
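
A small exact check may make this concrete. The sketch assumes, purely for illustration, that each $X_i$ is the outcome of a fair die and that $n = 3$; the helper names are hypothetical. It uses the strict inequality $X_i < x$, as in the definition of $F(x)$ above.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # a fair die: an assumed law for each X_i
n = 3                # assumed sample size


def F(x):
    """Distribution function F(x) = P{X_i < x} (strict inequality, as in the text)."""
    return Fraction(sum(1 for v in faces if v < x), 6)


def F_n(sample, x):
    """Empirical distribution function: fraction of observations below x."""
    return Fraction(sum(1 for v in sample if v < x), len(sample))


def expected_F_n(x):
    """Exact E{F_n(x)}, averaging over all 6**n equally likely samples."""
    samples = list(product(faces, repeat=n))
    return sum(F_n(s, x) for s in samples) / len(samples)


for x in (0, 1, 2.5, 4, 7):
    print(x, expected_F_n(x), F(x))
```

For every $x$ the two printed values coincide, since $F_n(x)$ is an average of indicator variables each having expectation $F(x)$.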

### Example 3.

Let $T = T ( X)$ be an unbiased estimator of a parameter $\theta$, that is, ${\mathsf E} \{ T \} = \theta$, and assume that $f ( \theta ) = a \theta + b$ is a linear function. In that case the statistic $a T + b$ is an unbiased estimator of $f ( \theta )$.

The next example shows that there are cases in which unbiased estimators exist and are even unique, but they may turn out to be useless.

### Example 4.

Let $X$ be a random variable subject to the geometric distribution with parameter of success $\theta$, that is, for any natural number $k$,

$${\mathsf P} \{ X = k \mid \theta \} = \theta ( 1 - \theta ) ^ {k-1} ,\ 0 \leq \theta \leq 1 .$$

If $T = T ( X)$ is an unbiased estimator of the parameter $\theta$, it must satisfy the unbiasedness equation ${\mathsf E} \{ T \} = \theta$, that is,

$$\sum _ {k=1} ^ \infty T ( k) \theta ( 1 - \theta ) ^ {k-1} = \theta .$$

The unique solution of this equation is

$$T ( X) = \ \left \{ \begin{array}{ll} 1 & \textrm{ if } X = 1 , \\ 0 & \textrm{ if } X \geq 2 . \\ \end{array} \right .$$

Evidently, $T$ is good only when $\theta$ is very close to 1 or 0; otherwise $T$ carries no useful information on $\theta$.
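
Uniqueness can be seen by a short power-series argument, sketched here under the assumption that the unbiasedness equation is required for all $0 < \theta < 1$. Dividing the equation by $\theta$ and writing $q = 1 - \theta$ gives

$$\sum _ {k=1} ^ \infty T ( k) q ^ {k-1} = 1 ,\ \ 0 < q < 1 .$$

A power series that equals the constant 1 on an interval has constant term 1 and all other coefficients 0, so $T ( 1) = 1$ and $T ( k) = 0$ for $k \geq 2$.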

### Example 5.

Suppose that a random variable $X$ has the binomial law with parameters $n$ and $\theta$, that is, for any $k = 0 , 1 , \dots , n$,

$${\mathsf P} \{ X = k \mid n , \theta \} = \ \left ( \begin{array}{c} n \\ k \end{array} \right ) \theta ^ {k} ( 1 - \theta ) ^ {n-k} ,\ 0 < \theta < 1 .$$

It is known that the best unbiased estimator of the parameter $\theta$ (in the sense of minimum quadratic risk) is the statistic $T = X / n$. Nevertheless, if $\theta$ is irrational, then ${\mathsf P} \{ T = \theta \} = 0$. This example reflects a general property of random variables: a random variable need not take any value equal to its expectation. Finally, there are cases in which unbiased estimators do not exist at all. Thus, if under the conditions of Example 5 the function to be estimated is $f ( \theta ) = 1 / \theta$, then (see Example 7) there is no unbiased estimator $T ( X)$ of $1 / \theta$.

The preceding examples demonstrate that the concept of an unbiased estimator does not, by its nature, spare an experimenter the complications that arise in the construction of statistical estimators: an unbiased estimator may turn out to be very good or totally useless, it may fail to be unique, and it may not exist at all. Moreover, an unbiased estimator, like every point estimator, has the following deficiency: it gives only an approximate value for the true value of the quantity to be estimated; this quantity was not known before the experiment and remains unknown after it has been performed. So, in the problem of constructing statistical point estimators there is no serious justification for requiring unbiasedness in all cases, other than that the study of unbiased estimators leads to a comparatively simple theory. For example, the Rao–Cramér inequality has a simple form for unbiased estimators. Namely, if $T = T ( X)$ is an unbiased estimator for a function $f ( \theta )$, then under fairly broad conditions of regularity on the family $\{ {\mathsf P} _ \theta \}$ and the function $f ( \theta )$, the Rao–Cramér inequality implies that

$$\tag{1} {\mathsf D} \{ T \} = \ {\mathsf E} \{ | T - f ( \theta ) | ^ {2} \} \geq \frac{[ f ^ { \prime } ( \theta ) ] ^ {2} }{I ( \theta ) } ,$$

where $I ( \theta )$ is the Fisher information for $\theta$. Thus, there is a lower bound for the variance of an unbiased estimator of $f ( \theta )$, namely, $[ f ^ { \prime } ( \theta ) ] ^ {2} / I ( \theta )$. In particular, if $f ( \theta ) \equiv \theta$, then it follows from (1) that

$${\mathsf D} \{ T \} \geq \frac{1}{I ( \theta ) } .$$

A statistical estimator for which equality is attained in the Rao–Cramér inequality is called efficient (cf. Efficient estimator). Thus, the statistic $T = X / n$ in Example 5 is an efficient unbiased estimator of the parameter $\theta$ of the binomial law, since

$${\mathsf D} \{ T \} = \frac{1}{n} \theta ( 1 - \theta )$$

and

$$I ( \theta ) = {\mathsf E} \left \{ \left [ \frac \partial {\partial \theta } \mathop{\rm log} \left ( \theta ^ {X} ( 1 - \theta ) ^ {n-X} \right ) \right ] ^ {2} \right \} = \ \frac{n}{\theta ( 1 - \theta ) } ,$$

that is, $T = X / n$ is the best point estimator of $\theta$ in the sense of minimum quadratic risk in the class of all unbiased estimators.
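
The equality ${\mathsf D} \{ T \} = 1 / I ( \theta )$ for the binomial law can be verified exactly with rational arithmetic. The following is a minimal sketch; the values $n = 10$, $\theta = 3/10$ and the function names are assumptions made for the illustration.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, theta):
    """P{X = k} for X ~ Binomial(n, theta)."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def variance_of_T(n, theta):
    """Exact variance of T = X / n."""
    mean = sum(Fraction(k, n) * binomial_pmf(k, n, theta) for k in range(n + 1))
    second = sum(Fraction(k, n) ** 2 * binomial_pmf(k, n, theta) for k in range(n + 1))
    return second - mean**2


def fisher_information(n, theta):
    """Exact E{(d/dtheta log[theta^X (1 - theta)^(n - X)])^2}."""
    def score(k):
        # Derivative of k*log(theta) + (n - k)*log(1 - theta) with respect to theta.
        return Fraction(k) / theta - Fraction(n - k) / (1 - theta)
    return sum(score(k) ** 2 * binomial_pmf(k, n, theta) for k in range(n + 1))


n, theta = 10, Fraction(3, 10)  # assumed illustrative values
var_T = variance_of_T(n, theta)
info = fisher_information(n, theta)
print(var_T, 1 / info)
```

Both printed values equal $\theta ( 1 - \theta ) / n$, so the bound is attained.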

Naturally, an experimenter is interested in the case when the class of unbiased estimators is rich enough to allow the choice of the best unbiased estimator in some sense. In this context an important role is played by the Rao–Blackwell–Kolmogorov theorem, which allows one to construct an unbiased estimator of minimal variance. This theorem asserts that if the family $\{ {\mathsf P} _ \theta \}$ has a sufficient statistic $\psi = \psi ( X)$ and $T = T ( X)$ is an arbitrary unbiased estimator of a function $f ( \theta )$, then the statistic $T ^ {*} = {\mathsf E} _ \theta \{ T \mid \psi \}$, obtained by averaging $T$ over the fixed sufficient statistic $\psi$, is again an unbiased estimator of $f ( \theta )$ and has a risk not exceeding that of $T$ relative to any convex loss function for all $\theta \in \Theta$. If, in addition, the sufficient statistic $\psi$ is complete, the statistic $T ^ {*}$ is uniquely determined. That is, the Rao–Blackwell–Kolmogorov theorem implies that best unbiased estimators should be sought among functions of sufficient statistics, when these exist. The practical value of the Rao–Blackwell–Kolmogorov theorem lies in the fact that it gives a recipe for constructing best unbiased estimators: construct an arbitrary unbiased estimator and then average it over a sufficient statistic.
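
The recipe can be illustrated on a standard textbook case, which is an assumption of this sketch and not taken from the article: for independent Bernoulli observations $X_1, \dots, X_n$ with success probability $\theta$, the crude unbiased estimator $T = X_1$ is averaged over the sufficient statistic $\psi = X_1 + \dots + X_n$. The function name is hypothetical, and the conditional expectation is computed exactly by enumeration.

```python
from fractions import Fraction
from itertools import product


def conditional_mean_of_first(n, theta):
    """E{X_1 | X_1 + ... + X_n = s} for i.i.d. Bernoulli(theta), for each s."""
    numer = {s: Fraction(0) for s in range(n + 1)}
    denom = {s: Fraction(0) for s in range(n + 1)}
    for sample in product((0, 1), repeat=n):
        s = sum(sample)
        prob = theta**s * (1 - theta) ** (n - s)
        numer[s] += sample[0] * prob  # contributes only when X_1 = 1
        denom[s] += prob              # accumulates P{psi = s}
    return {s: numer[s] / denom[s] for s in range(n + 1)}


n = 4  # assumed sample size
t_star = conditional_mean_of_first(n, Fraction(1, 3))
print(t_star)
```

The result is $T^{*} = \psi / n$, the sample mean, and it is the same for every value of $\theta$, as sufficiency requires. Its variance $\theta ( 1 - \theta ) / n$ does not exceed the variance $\theta ( 1 - \theta )$ of the initial estimator $X_1$.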

### Example 6.

Suppose that a random variable $X$ has the Pascal distribution (a negative binomial distribution) with parameters $r$ and $\theta$ ($r \geq 2$, $0 \leq \theta \leq 1$), where $X$ counts the number of trials up to and including the $r$-th success; that is,

$${\mathsf P} \{ X = k \mid r , \theta \} = \ \left ( \begin{array}{c} k - 1 \\ r - 1 \end{array} \right ) \theta ^ {r} ( 1 - \theta ) ^ {k-r} ,\ \ k = r , r + 1 ,\dots .$$

In this case the statistic $T = ( r - 1 ) / ( X - 1 )$ is an unbiased estimator of $\theta$. Since $T$ is expressed in terms of the sufficient statistic $X$ and the system of functions $1 , x , x ^ {2} , \dots$ is complete on $[ 0 , 1 ]$, $T$ is the only unbiased estimator and, consequently, the best estimator of $\theta$.
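
Unbiasedness of $T = ( r - 1 ) / ( X - 1 )$ can be checked numerically. The sketch below assumes the parametrization in which $X$ is the number of the trial on which the $r$-th success occurs, so that ${\mathsf P} \{ X = k \} = \binom{k-1}{r-1} \theta^{r} ( 1 - \theta )^{k-r}$ for $k \geq r$. The function names, the truncation point `k_max` and the test values of $r$ and $\theta$ are illustrative assumptions, and the infinite sum is only approximated.

```python
from math import comb


def pascal_pmf(k, r, theta):
    """P{X = k}: the r-th success occurs on trial k (assumed parametrization)."""
    return comb(k - 1, r - 1) * theta**r * (1 - theta) ** (k - r)


def expectation_of_T(r, theta, k_max=2000):
    """Truncated sum approximating E{(r - 1) / (X - 1)}."""
    return sum(
        (r - 1) / (k - 1) * pascal_pmf(k, r, theta) for k in range(r, k_max + 1)
    )


print(expectation_of_T(3, 0.3))
print(expectation_of_T(2, 0.5))
```

Up to truncation and rounding error, the printed values agree with the chosen $\theta$.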

### Example 7.

Let $X$ be a random variable having the binomial law with parameters $n$ and $\theta$. The generating function $Q( z)$ of this law can be expressed by the formula

$$Q ( z) = {\mathsf E} \{ z ^ {X} \} = \ ( z \theta + q ) ^ {n} ,\ \ q = 1 - \theta ,$$

which implies that for any integer $k = 1 , \dots , n$, the $k$-th derivative is

$$Q ^ {(k)} ( z) = \ n ( n - 1 ) \dots ( n - k + 1 ) ( z \theta + q ) ^ {n - k } \theta ^ {k} = \ n ^ {[k]} ( z \theta + q ) ^ {n - k } \theta ^ {k} .$$

On the other hand,

$$Q ^ {(k)} ( 1) = \ {\mathsf E} \{ X ( X - 1 ) \dots ( X - k + 1 ) \} = {\mathsf E} \{ X ^ {[k]} \} .$$

Hence,

$${\mathsf E} \left \{ \frac{1}{n ^ {[k]} } X ^ {[k]} \right \} = \theta ^ {k} ,$$

that is, the statistic

$$\tag{2} T _ {k} ( X) = \ \frac{1}{n ^ {[k]} } X ^ {[k]}$$

is an unbiased estimator of $\theta ^ {k}$, and since $T _ {k} ( X)$ is expressed in terms of the sufficient statistic $X$ and the system of functions $1 , x , x ^ {2} , \dots$ is complete on $[ 0 , 1 ]$, it follows that $T _ {k} ( X)$ is the only, hence the best, unbiased estimator of $\theta ^ {k}$.

In connection with this example the following question arises: What functions $f ( \theta )$ of the parameter $\theta$ admit an unbiased estimator? A.N. Kolmogorov has shown that this happens only for polynomials of degree $m \leq n$. Thus, if

$$f ( \theta ) = a _ {0} + a _ {1} \theta + \dots + a _ {m} \theta ^ {m} ,\ \ 1 \leq m \leq n ,$$

then it follows from (2) that the statistic

$$T = a _ {0} + \sum _ {k=1} ^ {m} a _ {k} T _ {k} ( X)$$

is the only unbiased estimator of $f ( \theta )$. This result implies, in particular, that there is no unbiased estimator of $f ( \theta ) = 1 / \theta$.
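
The identity ${\mathsf E} \{ X^{[k]} / n^{[k]} \} = \theta^{k}$ behind formula (2) can be confirmed exactly for a small binomial law. In this sketch the values $n = 6$ and $\theta = 2/5$ and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb


def falling_factorial(x, k):
    """x^{[k]} = x (x - 1) ... (x - k + 1), with x^{[0]} = 1."""
    result = 1
    for j in range(k):
        result *= x - j
    return result


def expectation_of_T_k(n, k, theta):
    """Exact E{X^{[k]} / n^{[k]}} for X ~ Binomial(n, theta)."""
    return sum(
        Fraction(falling_factorial(x, k), falling_factorial(n, k))
        * comb(n, x) * theta**x * (1 - theta) ** (n - x)
        for x in range(n + 1)
    )


n, theta = 6, Fraction(2, 5)  # assumed illustrative values
for k in range(1, n + 1):
    print(k, expectation_of_T_k(n, k, theta), theta**k)
```

For a polynomial such as $f ( \theta ) = \theta ( 1 - \theta ) = \theta - \theta^{2}$ the same construction gives the estimator $T_1 - T_2 = X ( n - X ) / ( n ( n - 1 ) )$.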

### Example 8.

Let $X$ be a random variable subject to the Poisson law with parameter $\theta$; that is, for any integer $k = 0 , 1 , \dots$,

$${\mathsf P} \{ X = k \mid \theta \} = \ \frac{\theta ^ {k} }{k!} e ^ {- \theta } ,\ \ \theta > 0 .$$

Since ${\mathsf E} \{ X \} = \theta$, the observation of $X$ by itself is an unbiased estimator of its mathematical expectation $\theta$. In turn, an unbiased estimator of, say, $f ( \theta ) = \theta ^ {2}$ is $X ( X - 1 )$. More generally, the statistic

$$X ^ {[r]} = X ( X - 1 ) \dots ( X - r + 1 ) ,\ r = 1 , 2 , \dots ,$$

is an unbiased estimator of $f ( \theta ) = \theta ^ {r}$. This fact implies, in particular, that the statistic

$$T ( X) = 1 + \sum _ {r=1} ^ \infty ( - 1 ) ^ {r} X ^ {[r]}$$

is an unbiased estimator of the function $f ( \theta ) = ( 1 + \theta ) ^ {-1}$, $0 < \theta < 1$. Quite generally, if $f ( \theta )$ admits an unbiased estimator, then the unbiasedness equation ${\mathsf E} \{ T ( X) \} = f ( \theta )$ must hold for it, which is equivalent to

$$\sum _ {k=0} ^ \infty T ( k) \frac{\theta ^ {k} }{k!} e ^ {- \theta } = f ( \theta ) .$$

From this one deduces that an unbiased estimator exists for any function $f ( \theta )$ that admits a power series expansion in its domain of definition $\Theta \subset \mathbf R _ {1} ^ {+}$.
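
Both statements of this example, the factorial-moment identity ${\mathsf E} \{ X^{[r]} \} = \theta^{r}$ and the series estimator of $( 1 + \theta )^{-1}$, can be checked numerically. The sketch below truncates the Poisson sums at an assumed cut-off `k_max`; the function names and test values are illustrative. Note that for an integer $x$ the terms $x^{[r]}$ with $r > x$ vanish, so the series defining $T ( x)$ is a finite sum.

```python
from math import exp, factorial


def falling_factorial(x, r):
    """x^{[r]} = x (x - 1) ... (x - r + 1), with x^{[0]} = 1."""
    result = 1
    for j in range(r):
        result *= x - j
    return result


def poisson_expectation(statistic, theta, k_max=60):
    """Truncated sum approximating E{statistic(X)} for X ~ Poisson(theta)."""
    return sum(
        statistic(k) * theta**k / factorial(k) * exp(-theta)
        for k in range(k_max + 1)
    )


def alternating_series_estimator(x):
    """T(x) = 1 + sum over r >= 1 of (-1)^r x^{[r]}; terms with r > x are zero."""
    return sum((-1) ** r * falling_factorial(x, r) for r in range(x + 1))


print(poisson_expectation(lambda k: falling_factorial(k, 3), 2.0))  # compare with 2.0 ** 3
print(poisson_expectation(alternating_series_estimator, 0.5))       # compare with 1 / 1.5
```

The estimator takes the values $1, 0, 1, -2, 9, \dots$ at $x = 0, 1, 2, 3, 4, \dots$, which illustrates again that an unbiased estimator need not take values resembling the quantity it estimates.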

### Example 9.

Suppose that the independent random variables $X _ {1} , \dots , X _ {n}$ have the same Poisson law with parameter $\theta$, $\theta > 0$. The generating function of this law, which can be expressed by the formula

$$g _ {z} ( \theta ) = \mathop{\rm exp} \{ \theta ( z - 1 ) \} ,$$

is an entire analytic function and hence has a unique unbiased estimator. In this case a sufficient statistic is $X = X _ {1} + {} \dots + X _ {n}$, which has the Poisson law with parameter $n \theta$. If $T ( X)$ is an unbiased estimator of $g _ {z} ( \theta )$, then it must satisfy the unbiasedness equation

$${\mathsf E} _ \theta \{ T ( X) \} = g _ {z} ( \theta ) = \ e ^ {\theta ( z- 1) } ,$$

which implies that

$$T ( X) = \ \left ( \frac{z}{n} + 1 - \frac{1}{n} \right ) ^ {X} = \ \sum _ {k=0} ^ {X} \left ( \begin{array}{c} X \\ k \end{array} \right ) \left ( \frac{1}{n} \right ) ^ {k} \left ( 1 - \frac{1}{n} \right ) ^ {X-k} z ^ {k} ;$$

that is, an unbiased estimator of the generating function of the Poisson law is the generating function of the binomial law with parameters $X$ and $1 / n$.
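
This can be checked numerically by writing the binomial generating function in closed form, $( z / n + 1 - 1 / n )^{X}$, and averaging it over the Poisson law of $X$ with parameter $n \theta$. The sketch below approximates the infinite sum by truncation; the function name, the cut-off `k_max` and the values of $z$, $n$ and $\theta$ are assumptions made for the illustration.

```python
from math import exp, factorial


def expectation_of_estimator(z, n, theta, k_max=150):
    """Truncated sum approximating E{(z/n + 1 - 1/n)^X}, X ~ Poisson(n * theta)."""
    lam = n * theta            # parameter of the sufficient statistic X
    base = z / n + 1 - 1 / n   # binomial generating function is base ** X
    return sum(
        base**k * lam**k / factorial(k) * exp(-lam) for k in range(k_max + 1)
    )


z, n, theta = 0.4, 5, 1.2      # assumed illustrative values
print(expectation_of_estimator(z, n, theta), exp(theta * (z - 1)))
```

The two printed numbers agree up to rounding, in line with the identity ${\mathsf E} \{ s^{X} \} = e^{n \theta ( s - 1 )}$ evaluated at $s = 1 + ( z - 1 ) / n$.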

Examples 6–9 demonstrate that in certain cases, which occur quite frequently in practice, the problem of constructing best estimators is easily solvable, provided that one restricts attention to the class of unbiased estimators. Kolmogorov has considered the problem of constructing unbiased estimators, in particular, for the distribution function of a normal law with unknown parameters. A more general definition of an unbiased estimator is due to E. Lehmann, according to whom a statistical estimator $T = T ( X)$ of a parameter $\theta$ is called unbiased relative to a loss function $L ( \theta , T )$ if

$${\mathsf E} _ \theta \{ L ( \theta ^ \prime , T( X) ) \} \geq {\mathsf E} _ \theta \{ L ( \theta , T ( X) ) \} \ \ \textrm{ for all } \ \theta , \theta ^ \prime \in \Theta .$$

There is also a modification of this definition. Yu.V. Linnik and his students have established that under fairly wide assumptions the best unbiased estimator is independent of the loss function.
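
As an illustration of Lehmann's definition, consider the quadratic loss $L ( \theta , T ) = ( \theta - T ) ^ {2}$; this choice of loss, and the assumption that $T$ has finite variance, are made here only for the example. Then

$${\mathsf E} _ \theta \{ ( \theta ^ \prime - T ) ^ {2} \} = {\mathsf D} _ \theta \{ T \} + ( {\mathsf E} _ \theta \{ T \} - \theta ^ \prime ) ^ {2} ,$$

so Lehmann's condition reads $( {\mathsf E} _ \theta \{ T \} - \theta ^ \prime ) ^ {2} \geq ( {\mathsf E} _ \theta \{ T \} - \theta ) ^ {2}$ for all $\theta , \theta ^ \prime \in \Theta$. It holds whenever ${\mathsf E} _ \theta \{ T \} = \theta$; conversely, if the value ${\mathsf E} _ \theta \{ T \}$ itself belongs to $\Theta$, taking $\theta ^ \prime = {\mathsf E} _ \theta \{ T \}$ forces ${\mathsf E} _ \theta \{ T \} = \theta$. In this case the definition reduces to ordinary unbiasedness.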

How to Cite This Entry:
Unbiased estimator. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Unbiased_estimator&oldid=49645
This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.