Rao-Cramér inequality
Cramér–Rao inequality, Fréchet inequality, information inequality
An inequality in mathematical statistics that establishes a lower bound for the risk corresponding to a quadratic loss function in the problem of estimating an unknown parameter.
Suppose that the probability distribution of a random vector $ X = (X_{1},\ldots,X_{n}) $ with values in the $ n $-dimensional Euclidean space $ \mathbb{R}^{n} $ is defined by a density function $ p(x |\theta) $, where $ x = (x_{1},\ldots,x_{n})^{\intercal} $ and $ \theta \in \Theta \subseteq \mathbb{R} $. Suppose that a statistic $ T = T(X) $ such that $$ {\mathsf{E}_{\theta}}[T] = \theta + b(\theta) $$ is used as an estimator for the unknown scalar parameter $ \theta $, where $ b(\theta) $ is a differentiable function, called the bias of $ T $. Then, under certain regularity conditions on the family $ (p(x|\theta))_{\theta \in \Theta} $, one of which is that the Fisher information $$ I(\theta) \stackrel{\text{df}}{=} \mathsf{E}_{\theta} \! \left[ \left\{ \frac{\partial \ln(p(X|\theta))}{\partial \theta} \right\}^{2} \right] $$ is not zero, the so-called Cramér–Rao inequality $$ \mathsf{E}_{\theta} \! \left[ |T - \theta|^{2} \right] \geq \frac{[1 + b'(\theta)]^{2}}{I(\theta)} + {b^{2}}(\theta) \qquad (1) $$ holds. This inequality gives a lower bound for the mean-square error $ \mathsf{E}_{\theta} \! \left[ |T - \theta|^{2} \right] $ of all estimators $ T $ for the unknown parameter $ \theta $ that have the same bias function $ b(\theta) $.
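As a numerical sanity check of (1), the following Python sketch considers a hypothetical shrinkage estimator $ T = c \bar{X} $ built from $ n $ independent $ N(\theta,1) $ observations with arithmetic mean $ \bar{X} $. The estimator, the constant `c` and the function names `rhs_of_inequality_1` and `simulate_mse` are assumptions introduced here for illustration. In this setting $ b(\theta) = (c - 1)\theta $ and $ I(\theta) = n $, so the right-hand side of (1) equals $ c^{2}/n + (c - 1)^{2} \theta^{2} $, and the mean-square error of this particular $ T $ should match it up to simulation noise.

```python
import random


def rhs_of_inequality_1(theta, n, c):
    """Right-hand side of (1) for T = c * mean(X), with X_i ~ N(theta, 1)."""
    fisher_information = n        # I(theta) = n for n i.i.d. N(theta, 1) draws
    bias = (c - 1.0) * theta      # b(theta) = E[T] - theta
    bias_derivative = c - 1.0     # b'(theta)
    return (1.0 + bias_derivative) ** 2 / fisher_information + bias ** 2


def simulate_mse(theta, n, c, trials=20000, seed=0):
    """Monte Carlo estimate of E|T - theta|^2 for T = c * mean(X)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(theta, 1.0) for _ in range(n)) / n
        total += (c * sample_mean - theta) ** 2
    return total / trials


if __name__ == "__main__":
    theta, n, c = 2.0, 10, 0.9    # illustrative values only
    bound = rhs_of_inequality_1(theta, n, c)
    mse = simulate_mse(theta, n, c)
    print(f"bound = {bound:.4f}, simulated MSE = {mse:.4f}")
```

Setting `c = 1.0` recovers the unbiased case, in which the bound reduces to $ 1/n $.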
In particular, if $ T $ is an unbiased estimator for $ \theta $, i.e., if $ {\mathsf{E}_{\theta}}[T] = \theta $, then (1) implies that $$ \mathsf{D} T = \mathsf{E}_{\theta} \! \left[ |T - \theta|^{2} \right] \geq \frac{1}{I(\theta)}. \quad (2) $$ Thus, in this case, the Cramér–Rao inequality provides a lower bound for the variance of the unbiased estimators $ T $ for $ \theta $, equal to $ \dfrac{1}{I(\theta)} $, and also demonstrates that the existence of consistent estimators is connected with unrestricted growth of the Fisher information $ I(\theta) $ as $ n \to \infty $. If equality is attained in (2) for a certain unbiased estimator $ T $, then $ T $ is optimal in the class of all unbiased estimators with regard to minimum quadratic risk; it is called an efficient estimator. For example, if $ X_{1},\ldots,X_{n} $ are independent random variables subject to the same normal law $ N(\theta,1) $, then $ T = (X_{1} + \cdots + X_{n}) / n $ is an efficient estimator of the unknown mean $ \theta $.
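The efficiency of the arithmetic mean in the normal example can be verified by a short direct calculation, sketched here under the assumption (valid for the normal family) that differentiation with respect to $ \theta $ and integration may be interchanged. For independent $ X_{1},\ldots,X_{n} $ with law $ N(\theta,1) $, $$ \ln p(x|\theta) = -\frac{n}{2} \ln(2\pi) - \frac{1}{2} \sum_{i=1}^{n} (x_{i} - \theta)^{2}, \qquad \frac{\partial \ln p(x|\theta)}{\partial \theta} = \sum_{i=1}^{n} (x_{i} - \theta), $$ so that, by independence, $$ I(\theta) = \mathsf{E}_{\theta} \! \left[ \left\{ \sum_{i=1}^{n} (X_{i} - \theta) \right\}^{2} \right] = \sum_{i=1}^{n} \mathsf{E}_{\theta} \! \left[ (X_{i} - \theta)^{2} \right] = n. $$ The arithmetic mean $ \bar{X} = (X_{1} + \cdots + X_{n}) / n $ is unbiased for $ \theta $ and has variance $ 1/n = 1/I(\theta) $, so equality holds in (2).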
In general, equality in (2) is attained if and only if $ (p(x|\theta))_{\theta \in \Theta} $ is an exponential family, i.e., if the probability density of $ X $ can be represented in the form $$ p(x|\theta) = c(x) e^{u(\theta) \phi(x) - v(\theta)}, $$ in which case the sufficient statistic $ T = \phi(X) $ is an efficient estimator of its expectation $ \dfrac{v'(\theta)}{u'(\theta)} $. If no efficient estimator exists, the lower bound of the variances of the unbiased estimators can be refined, as the Cramér–Rao inequality does not necessarily give the greatest lower bound. For example, if $ X_{1},\ldots,X_{n} $ are independent random variables with the same normal distribution $ N(a^{1/3},1) $, then the greatest lower bound to the variance of unbiased estimators of $ a $ is equal to $$ \frac{9 a^{4/3}}{n} + \frac{18 a^{2/3}}{n^{2}} + \frac{6}{n^{3}}, $$ while $$ \frac{1}{I(a)} = \frac{9 a^{4/3}}{n}. $$ In general, absence of equality in (2) does not mean that the estimator that has been found is not optimal, as it may well be the only unbiased estimator.
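The gap between the Cramér–Rao bound and the attainable variance in the normal example can be illustrated numerically. The sketch below is written in terms of the mean $ m = a^{1/3} $ of the observations, so that the estimated quantity is $ a = m^{3} $, and it uses the estimator $ \bar{X}^{3} - 3 \bar{X} / n $, which is unbiased for $ m^{3} $. The estimator, the function names and the values of `m` and `n` are assumptions chosen here for illustration. In this parametrization the Cramér–Rao bound is $ 9 m^{4} / n $, while the exact variance of the estimator is $ 9 m^{4} / n + 18 m^{2} / n^{2} + 6 / n^{3} $.

```python
import random


def bound_for_cube(m, n):
    """Cramér-Rao bound for unbiased estimators of m**3 from n draws of N(m, 1)."""
    return 9.0 * m ** 4 / n


def exact_variance(m, n):
    """Exact variance of the unbiased estimator xbar**3 - 3*xbar/n of m**3."""
    return 9.0 * m ** 4 / n + 18.0 * m ** 2 / n ** 2 + 6.0 / n ** 3


def simulate(m, n, trials=200000, seed=1):
    """Monte Carlo mean and variance of the estimator.

    The sample mean of n draws from N(m, 1) has law N(m, 1/n),
    so it is sampled directly instead of averaging n draws.
    """
    rng = random.Random(seed)
    sd = n ** -0.5
    values = []
    for _ in range(trials):
        xbar = rng.gauss(m, sd)
        values.append(xbar ** 3 - 3.0 * xbar / n)
    mean = sum(values) / trials
    var = sum((v - mean) ** 2 for v in values) / (trials - 1)
    return mean, var


if __name__ == "__main__":
    m, n = 1.0, 4                 # illustrative values only
    mean, var = simulate(m, n)
    print(f"target m^3      = {m ** 3:.4f}, simulated mean = {mean:.4f}")
    print(f"Cramér-Rao bound = {bound_for_cube(m, n):.5f}")
    print(f"exact variance   = {exact_variance(m, n):.5f}, simulated = {var:.5f}")
```

For fixed $ m \neq 0 $ the extra terms are of order $ n^{-2} $ and $ n^{-3} $, so the relative gap between the two quantities shrinks as $ n $ grows.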
There are different generalizations of the Cramér–Rao inequality to the case of a vector parameter, or to that of estimating a function of the parameter. Refinements of the lower bound in (2) play an important role in such cases.
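As an illustration, two commonly stated forms of these generalizations are sketched here under regularity conditions analogous to those above; the symbols $ g $, $ k $ and $ I_{ij} $ are notation introduced for this sketch. If $ T $ is an unbiased estimator of a differentiable function $ g(\theta) $ of a scalar parameter, then $$ \mathsf{D} T \geq \frac{[g'(\theta)]^{2}}{I(\theta)}, $$ which, for $ g(\theta) = \theta + b(\theta) $, gives the first term on the right-hand side of (1). If $ \theta = (\theta_{1},\ldots,\theta_{k})^{\intercal} \in \mathbb{R}^{k} $ and $ T $ is an unbiased estimator of $ \theta $ with covariance matrix $ \operatorname{Cov}_{\theta}(T) $, then $$ \operatorname{Cov}_{\theta}(T) - I(\theta)^{-1} \ \text{is positive semi-definite}, \qquad I_{ij}(\theta) = \mathsf{E}_{\theta} \! \left[ \frac{\partial \ln p(X|\theta)}{\partial \theta_{i}} \, \frac{\partial \ln p(X|\theta)}{\partial \theta_{j}} \right], $$ provided that the Fisher information matrix $ I(\theta) = (I_{ij}(\theta)) $ is invertible.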
The inequality (1) was independently obtained by M. Fréchet, C.R. Rao and H. Cramér.
Rao-Cramér inequality. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Rao-Cram%C3%A9r_inequality&oldid=23510