Sufficient statistic
for a family of probability distributions $\{\mathsf P_\theta : \theta \in \Theta\}$ or for a parameter $\theta \in \Theta$
A statistic (a vector random variable) $T$ such that for any event $A$ there exists a version of the conditional probability $\mathsf P(A \mid T)$ which is independent of $\theta$. This is equivalent to the requirement that the conditional distribution, given $T$, of any other statistic $Y$ is independent of $\theta$.
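For illustration, the defining property can be checked directly in a simple discrete model. The following minimal Python sketch (an illustration, not part of the article's argument; the Bernoulli scheme used here is treated in detail below, and the values of $n$, $t$, $p$ are arbitrary) verifies that, given the sum of $n$ Bernoulli trials, the conditional distribution of the sample is uniform over the $\binom{n}{t}$ arrangements and hence independent of $p$:

```python
# Check of the defining property in the Bernoulli scheme (illustrative values):
# given T = x_1 + ... + x_n = t, the conditional probability of each sample
# equals 1 / C(n, t), whatever the parameter p.
from itertools import product
from math import comb, isclose

n, t = 4, 2
for p in (0.2, 0.5, 0.8):
    # total probability of the event {T = t}
    total = sum(p**sum(s) * (1 - p)**(n - sum(s))
                for s in product((0, 1), repeat=n) if sum(s) == t)
    for s in product((0, 1), repeat=n):
        if sum(s) == t:
            prob_s = p**sum(s) * (1 - p)**(n - sum(s))
            # conditional probability of the sample s given T = t
            assert isclose(prob_s / total, 1 / comb(n, t))  # independent of p
```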
The knowledge of the sufficient statistic $T$ yields exhaustive material for statistical inferences about the parameter $\theta$, since no complementary statistical data can add anything to the information about the parameter contained in the distribution of $T$. This property is expressed mathematically by one of the results of statistical decision theory: the set of decision rules based on a sufficient statistic forms an essentially complete class. The transition from the initial family of distributions to the family of distributions of the sufficient statistic is known as reduction of the statistical problem. The meaning of the reduction is a decrease (sometimes a very significant one) in the dimension of the observation space.
In practice, a sufficient statistic is found from the following factorization theorem. Let the family $\{\mathsf P_\theta : \theta \in \Theta\}$ be dominated by a $\sigma$-finite measure $\mu$ and let

$$ p_\theta(\omega) = \frac{d\mathsf P_\theta}{d\mu}(\omega) $$

be the density of $\mathsf P_\theta$ with respect to the measure $\mu$. A statistic $T$ is sufficient for the family $\{\mathsf P_\theta : \theta \in \Theta\}$ if and only if

$$ p_\theta(\omega) = g_\theta\bigl(T(\omega)\bigr)\, h(\omega), \tag{*} $$

where $g_\theta$ and $h$ are non-negative measurable functions ($h$ is independent of $\theta$). For discrete distributions the counting measure may be taken as $\mu$, and $p_\theta(\omega)$ in relation (*) then has the meaning of the probability of the elementary event $\{\omega\}$.
E.g., let $X_1, \ldots, X_n$ be a sequence of independent random variables which assume the value one with an unknown probability $p$ and the value zero with probability $q = 1 - p$ (a Bernoulli scheme). Then

$$ p_p(x_1, \ldots, x_n) = p^{\sum_{i=1}^n x_i}\, q^{\,n - \sum_{i=1}^n x_i}. $$

Equation (*) is satisfied by

$$ T = \sum_{i=1}^n X_i, \qquad g_p(t) = p^t q^{\,n-t}, \qquad h \equiv 1. $$

Thus, the empirical frequency

$$ \hat p = \frac{T}{n} = \frac{1}{n} \sum_{i=1}^n X_i $$

is a sufficient statistic for the unknown probability $p$ in the Bernoulli scheme.
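The factorization can also be checked numerically. A minimal Python sketch (illustrative; the two samples are arbitrary, chosen only to share the same sum) verifies that the likelihood $p^{\sum x_i} q^{\,n - \sum x_i}$ assigns identical probabilities, for every $p$, to samples with equal $T = \sum x_i$:

```python
# Bernoulli likelihood depends on the sample only through T = sum(x_i):
# samples with the same sum get the same probability for every p.
def bernoulli_likelihood(sample, p):
    """Probability of the elementary event (x_1, ..., x_n) under parameter p."""
    t, n = sum(sample), len(sample)
    return p**t * (1 - p)**(n - t)

x = (1, 0, 1, 0, 0)  # n = 5, T = 2
y = (0, 0, 1, 1, 0)  # a different sample with the same T = 2

for p in (0.1, 0.3, 0.5, 0.9):
    assert bernoulli_likelihood(x, p) == bernoulli_likelihood(y, p)
```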
Let $X_1, \ldots, X_n$ be a sequence of independent, normally distributed variables with unknown mean $a$ and unknown variance $\sigma^2$. The joint density of the distributions of $X_1, \ldots, X_n$ with respect to Lebesgue measure is given by the expression

$$ p_{a,\sigma^2}(x_1, \ldots, x_n) = (2\pi\sigma^2)^{-n/2} \exp\Bigl\{ -\frac{1}{2\sigma^2} \sum_{i=1}^n x_i^2 + \frac{a}{\sigma^2} \sum_{i=1}^n x_i - \frac{n a^2}{2\sigma^2} \Bigr\}, $$

which depends on $x_1, \ldots, x_n$ only by means of the variables

$$ \sum_{i=1}^n x_i, \qquad \sum_{i=1}^n x_i^2. $$

For this reason the vector statistic

$$ T = \Bigl( \sum_{i=1}^n X_i,\; \sum_{i=1}^n X_i^2 \Bigr) $$

is a sufficient statistic for the two-dimensional parameter $(a, \sigma^2)$. The pair consisting of the sample mean

$$ \overline X = \frac{1}{n} \sum_{i=1}^n X_i $$

and the sample variance

$$ s^2 = \frac{1}{n} \sum_{i=1}^n (X_i - \overline X)^2 $$

will also be a sufficient statistic, since the variables

$$ \sum_{i=1}^n X_i = n \overline X, \qquad \sum_{i=1}^n X_i^2 = n \bigl( s^2 + \overline X^2 \bigr) $$

can be expressed in terms of $\overline X$ and $s^2$.
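The same check works in the normal model. In the Python sketch below (illustrative; the second sample is constructed, with arbitrarily chosen numbers, to share $\sum x_i = 6$ and $\sum x_i^2 = 14$ with the first), two distinct samples with equal values of the sufficient statistic receive the same joint density for every $(a, \sigma^2)$:

```python
# Normal joint density depends on the data only through sum(x_i), sum(x_i^2).
import math

def normal_joint_density(sample, a, var):
    n = len(sample)
    s1 = sum(sample)                 # sum of observations
    s2 = sum(v * v for v in sample)  # sum of squares
    return (2 * math.pi * var) ** (-n / 2) * math.exp(
        -s2 / (2 * var) + a * s1 / var - n * a * a / (2 * var))

x = [1.0, 2.0, 3.0]            # s1 = 6, s2 = 14
r = 1 / math.sqrt(3)
y = [2 + r, 2 + r, 2 - 2 * r]  # a different sample with the same s1 and s2

for a, var in [(0.0, 1.0), (1.5, 2.0), (-1.0, 0.5)]:
    assert math.isclose(normal_joint_density(x, a, var),
                        normal_joint_density(y, a, var))
```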
Many sufficient statistics may exist for a given family of distributions. In particular, the totality of all observations (in the examples discussed above, the vector $(X_1, \ldots, X_n)$) is a trivial sufficient statistic. However, of main interest are statistics which permit a real reduction of the statistical problem. A sufficient statistic is known as minimal or necessary if it is a function of any other sufficient statistic. A necessary sufficient statistic realizes the utmost possible reduction of the statistical problem. In the examples discussed above the sufficient statistics obtained are also necessary.
An important application of the concept of sufficiency is the method of improvement of unbiased estimators, based on the Rao–Blackwell–Kolmogorov theorem: If $T$ is a sufficient statistic for the family $\{\mathsf P_\theta : \theta \in \Theta\}$, and if $Y$ is an arbitrary statistic assuming values in the vector space $\mathbf R^k$, then the inequality

$$ \mathsf E_\theta\, g(Y^*) \leq \mathsf E_\theta\, g(Y), $$

where $Y^* = \mathsf E(Y \mid T)$ is the conditional expectation of the statistic $Y$ with respect to $T$ (which is in fact independent of $\theta$ by virtue of the sufficiency of $T$), holds for any real continuous convex function $g$ on $\mathbf R^k$. Often the loss function $g$ is taken to be a positive-definite quadratic form on $\mathbf R^k$. Since $\mathsf E_\theta Y^* = \mathsf E_\theta Y$, applying the inequality to $Y - f(\theta)$, where $Y$ is an unbiased estimator of $f(\theta)$, shows that the transition to $Y^*$ preserves unbiasedness and does not increase the risk.
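A Monte-Carlo sketch of the improvement (illustrative Python; the values of $p$, $n$ and the number of trials are arbitrary): in the Bernoulli scheme, the crude unbiased estimator $Y = X_1$ of $p$ is replaced by $Y^* = \mathsf E(X_1 \mid T) = T/n$, and the quadratic risk drops from $p(1-p)$ to $p(1-p)/n$:

```python
# Rao-Blackwell improvement of Y = X_1 by conditioning on T = sum(X_i).
import random

random.seed(0)
p, n, trials = 0.3, 10, 100_000

risk_y = risk_y_star = 0.0
for _ in range(trials):
    sample = [1 if random.random() < p else 0 for _ in range(n)]
    y = sample[0]              # unbiased but crude estimator of p
    y_star = sum(sample) / n   # E(X_1 | T) = T/n, by symmetry
    risk_y += (y - p) ** 2
    risk_y_star += (y_star - p) ** 2

print(risk_y / trials)       # close to p(1-p)   = 0.21
print(risk_y_star / trials)  # close to p(1-p)/n = 0.021
```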
A statistic $T$ is said to be a complete statistic if it follows from $\mathsf E_\theta f(T) = 0$, $\theta \in \Theta$, that $f(T) = 0$ almost surely with respect to every $\mathsf P_\theta$, $\theta \in \Theta$. A corollary of the Rao–Blackwell–Kolmogorov theorem states that if a complete sufficient statistic $T$ exists, then every unbiased estimator which is a function of $T$ is the best unbiased estimator, uniformly in $\theta \in \Theta$, of its expectation. The examples above describe such a situation. Thus, the empirical frequency $\hat p$ is the uniformly best unbiased estimator of the probability $p$ in the Bernoulli scheme, while the sample mean $\overline X$ and the corrected sample variance

$$ \frac{n}{n-1}\, s^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \overline X)^2 $$

are the uniformly best unbiased estimators of the parameters $a$ and $\sigma^2$ of the normal distribution (note that $s^2$ itself is biased: $\mathsf E_{a,\sigma^2}\, s^2 = \frac{n-1}{n}\, \sigma^2$).
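A simulation sketch of the last point (illustrative Python; the values of $a$, $\sigma^2$, $n$ are hypothetical): the uncorrected $s^2$ underestimates $\sigma^2$ by the factor $(n-1)/n$, while the corrected variance is unbiased:

```python
# Bias of s^2 = (1/n) sum (X_i - mean)^2 versus the corrected variance.
import random
import statistics

random.seed(1)
a, sigma2, n, trials = 2.0, 4.0, 5, 200_000

mean_s2 = mean_corrected = 0.0
for _ in range(trials):
    xs = [random.gauss(a, sigma2 ** 0.5) for _ in range(n)]
    s2 = statistics.pvariance(xs)       # divides by n
    mean_s2 += s2
    mean_corrected += s2 * n / (n - 1)  # equals statistics.variance(xs)

print(mean_s2 / trials)         # close to (n-1)/n * sigma^2 = 3.2
print(mean_corrected / trials)  # close to sigma^2           = 4.0
```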
On the theoretical level it may be more convenient to deal with sufficient $\sigma$-algebras rather than with sufficient statistics. If $\{\mathsf P_\theta : \theta \in \Theta\}$ is a family of distributions on a measurable space $(\Omega, \mathcal A)$, then a sub-$\sigma$-algebra $\mathcal B \subset \mathcal A$ is said to be sufficient for $\{\mathsf P_\theta\}$ if for any event $A \in \mathcal A$ there exists a version of the conditional probability $\mathsf P(A \mid \mathcal B)$ which is independent of $\theta$. A statistic $T$ is sufficient if and only if the sub-$\sigma$-algebra $\sigma(T)$ generated by it is sufficient.
References
[1] P.R. Halmos, L.J. Savage, "Application of the Radon–Nikodym theorem to the theory of sufficient statistics", Ann. Math. Stat., 20 (1949), pp. 225–241.

[2] A.N. Kolmogorov, "Unbiased estimators", Izv. Akad. Nauk SSSR Ser. Mat., 14 : 4 (1950), pp. 303–326 (in Russian; English translation in: Selected Works, Vol. 2: Probability Theory and Mathematical Statistics, Kluwer, 1992, pp. 369–394).

[3] C.R. Rao, "Linear statistical inference and its applications", Wiley (1973).
Comments
References
[a1] E.L. Lehmann, "Testing statistical hypotheses", Wiley (1986).

[a2] C.R. Rao, "Characterization problems in mathematical statistics", Wiley (1973), Chapt. 8 (translated from Russian).
Sufficient statistic. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Sufficient_statistic&oldid=17205