Rao-Cramér inequality

Cramér–Rao inequality, Fréchet inequality, information inequality

An inequality in mathematical statistics that establishes a lower bound for the risk corresponding to a quadratic loss function in the problem of estimating an unknown parameter.

Suppose that the probability distribution of a random vector $ X = (X_{1},\ldots,X_{n}) $, with values in the $ n $-dimensional Euclidean space $ \mathbb{R}^{n} $, is defined by a density function $ p(x|\theta) $, where $ x = (x_{1},\ldots,x_{n})^{\intercal} $ and $ \theta \in \Theta \subseteq \mathbb{R} $. Suppose that a statistic $ T = T(X) $ such that
$$ \mathsf{E}_{\theta}[T] = \theta + b(\theta) $$
is used as an estimator for the unknown scalar parameter $ \theta $, where $ b(\theta) $ is a differentiable function called the bias of $ T $. Then, under certain regularity conditions on the family $ (p(x|\theta))_{\theta \in \Theta} $, one of which is that the Fisher information
$$ I(\theta) \stackrel{\text{df}}{=} \mathsf{E} \! \left[ \left\{ \frac{\partial \ln(p(X|\theta))}{\partial \theta} \right\}^{2} \right] $$
is not zero, the so-called Cramér–Rao inequality
$$ \mathsf{E}_{\theta} \! \left[ |T - \theta|^{2} \right] \geq \frac{[1 + b'(\theta)]^{2}}{I(\theta)} + b^{2}(\theta) \qquad (1) $$
holds. This inequality gives a lower bound for the mean squared error $ \mathsf{E}_{\theta} \! \left[ |T - \theta|^{2} \right] $ of all estimators $ T $ for the unknown parameter $ \theta $ that have the same bias function $ b(\theta) $.
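
To illustrate (1) with a simple worked example (the estimator here is introduced only for illustration), suppose that $ X_{1},\ldots,X_{n} $ are independent $ N(\theta,1) $ random variables and that one uses $ T = c \bar{X} $, where $ \bar{X} = (X_{1}+\cdots+X_{n})/n $ and $ c $ is a fixed constant. Then $ \mathsf{E}_{\theta}[T] = c\theta $, so $ b(\theta) = (c-1)\theta $, $ 1 + b'(\theta) = c $, and $ I(\theta) = n $, so that (1) reads
$$ \mathsf{E}_{\theta}\!\left[ |T - \theta|^{2} \right] \geq \frac{c^{2}}{n} + (c-1)^{2}\theta^{2}. $$
A direct computation of the mean squared error gives $ \mathsf{D}\,T + b^{2}(\theta) = c^{2}/n + (c-1)^{2}\theta^{2} $, so in this example the bound is attained.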

In particular, if $ T $ is an unbiased estimator for $ \theta $, i.e., if $ \mathsf{E}_{\theta}[T] = \theta $, then (1) implies that
$$ \mathsf{D}\,T = \mathsf{E}_{\theta} \! \left[ |T - \theta|^{2} \right] \geq \frac{1}{I(\theta)}. \qquad (2) $$
Thus, in this case, the Cramér–Rao inequality provides a lower bound for the variance of the unbiased estimators $ T $ for $ \theta $, equal to $ \dfrac{1}{I(\theta)} $, and also demonstrates that the existence of consistent estimators is connected with unrestricted growth of the Fisher information $ I(\theta) $ as $ n \to \infty $. If equality is attained in (2) for a certain unbiased estimator $ T $, then $ T $ is optimal in the class of all unbiased estimators with regard to minimum quadratic risk; in this case, it is called an efficient estimator. For example, if $ X_{1},\ldots,X_{n} $ are independent random variables subject to the same normal law $ N(\theta,1) $, then the sample mean $ T = (X_{1} + \cdots + X_{n})/n $ is an efficient estimator of the unknown mean $ \theta $.
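
Indeed, in this example the attainment of (2) can be verified directly: since $ \partial \ln(p(X|\theta))/\partial \theta = \sum_{i=1}^{n}(X_{i}-\theta) $,
$$ I(\theta) = \mathsf{E}_{\theta}\!\left[ \left\{ \sum_{i=1}^{n}(X_{i}-\theta) \right\}^{2} \right] = n, \qquad \mathsf{D}\,T = \mathsf{D}\!\left( \frac{X_{1}+\cdots+X_{n}}{n} \right) = \frac{1}{n} = \frac{1}{I(\theta)}, $$
so equality holds in (2) and the sample mean is efficient.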

In general, equality in (2) is attained if and only if $ (p(x|\theta))_{\theta \in \Theta} $ is an exponential family, i.e., if the probability density of $ X $ can be represented in the form
$$ p(x|\theta) = c(x)\, e^{u(\theta) \phi(x) - v(\theta)}, $$
in which case the sufficient statistic $ T = \phi(X) $ is an efficient estimator of its expectation $ \dfrac{v'(\theta)}{u'(\theta)} $. If no efficient estimator exists, then the lower bound for the variance of the unbiased estimators can be refined, since the Cramér–Rao inequality does not necessarily give the greatest lower bound. For example, if $ X_{1},\ldots,X_{n} $ are independent random variables with the same normal distribution $ N(a^{1/3},1) $, then the greatest lower bound for the variance of unbiased estimators of $ a $ is equal to
$$ \frac{9 a^{4/3}}{n} + \frac{18 a^{2/3}}{n^{2}} + \frac{6}{n^{3}}, $$
while
$$ \frac{1}{I(a)} = \frac{9 a^{4/3}}{n}. $$
In general, absence of equality in (2) does not mean that the estimator found is not optimal, as it may well be the only unbiased estimator.
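
The expression $ v'(\theta)/u'(\theta) $ for the expectation of $ \phi(X) $ follows by differentiating the identity $ \int p(x|\theta)\,dx = 1 $ with respect to $ \theta $ under the same regularity conditions (assuming also that $ u'(\theta) \neq 0 $):
$$ 0 = \frac{\partial}{\partial\theta} \int c(x)\, e^{u(\theta)\phi(x) - v(\theta)}\,dx = \int \left[ u'(\theta)\phi(x) - v'(\theta) \right] p(x|\theta)\,dx = u'(\theta)\,\mathsf{E}_{\theta}[\phi(X)] - v'(\theta). $$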

There are various generalizations of the Cramér–Rao inequality, for example to the case of a vector parameter or to the estimation of a function of the parameter. Refinements of the lower bound in (2) play an important role in such cases.
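
In their most common forms (see, e.g., [a1]), these generalizations state that, under similar regularity conditions, an unbiased estimator $ T $ of a differentiable function $ g(\theta) $ of a scalar parameter satisfies
$$ \mathsf{D}\,T \geq \frac{[g'(\theta)]^{2}}{I(\theta)}, $$
while for a vector parameter $ \theta \in \mathbb{R}^{k} $ with non-degenerate Fisher information matrix
$$ I(\theta) = \mathsf{E}_{\theta}\!\left[ \nabla_{\theta} \ln(p(X|\theta)) \left( \nabla_{\theta} \ln(p(X|\theta)) \right)^{\intercal} \right], $$
an unbiased estimator $ T $ of $ \theta $ satisfies
$$ \operatorname{Cov}_{\theta}(T) \geq I(\theta)^{-1}, $$
where the matrix inequality means that the difference of the two sides is non-negative definite.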

The inequality (1) was independently obtained by M. Fréchet, C.R. Rao and H. Cramér.

References

[1] H. Cramér, “Mathematical methods of statistics”, Princeton Univ. Press (1946).
[2] B.L. van der Waerden, “Mathematische Statistik”, Springer (1957).
[3] L.N. Bol’shev, “A refinement of the Cramér–Rao inequality”, Theory Probab. Appl., 6 (1961), pp. 295–301; Teor. Veryatnost. Primenen., 6: 3 (1961), pp. 319–326.
[4a] A. Bhattacharyya, “On some analogues of the amount of information and their uses in statistical estimation, Chapt. I”, Sankhyā, 8: 1 (1946), pp. 1–14.
[4b] A. Bhattacharyya, “On some analogues of the amount of information and their uses in statistical estimation, Chapt. II-III”, Sankhyā, 8: 3 (1947), pp. 201–218.
[4c] A. Bhattacharyya, “On some analogues of the amount of information and their uses in statistical estimation, Chapt. IV”, Sankhyā, 8: 4 (1948), pp. 315–328.

References

[a1] E.L. Lehmann, “Theory of point estimation”, Wiley (1983).
How to Cite This Entry:
Rao-Cramér inequality. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Rao-Cram%C3%A9r_inequality&oldid=22965
This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics, ISBN 1402006098.