Exponential family of probability distributions
A certain model (i.e., a set of probability distributions on the same measurable space) in statistics which is widely used and studied for two reasons:
i) many classical models are actually exponential families;
ii) most of the classical methods of estimation of parameters and testing work successfully when the model is an exponential family.
The definitions found in the literature can be rather inelegant or lack rigour. A mathematically satisfactory definition is obtained by first defining a significant particular case, namely the natural exponential family, and then using it to define general exponential families.
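Given a finite-dimensional real linear space $E$, denote by $E^*$ the space of linear forms $\theta$ from $E$ to $\mathbb{R}$. One writes $\langle\theta,x\rangle$ instead of $\theta(x)$. Let $\mu$ be a positive measure on $E$ (equipped with Borel sets), and assume that $\mu$ is not concentrated on an affine hyperplane of $E$. Denote by
$$L_\mu(\theta)=\int_E\exp\langle\theta,x\rangle\,\mu(dx)$$
its Laplace transform and by $D(\mu)$ the subset of $E^*$ on which $L_\mu(\theta)$ is finite. It is easily seen that $D(\mu)$ is convex. Assume that the interior $\Theta(\mu)$ of $D(\mu)$ is not empty. The set of probability measures (cf. also Probability measure) on $E$:
$$F=F(\mu)=\{P(\theta,\mu):\theta\in\Theta(\mu)\},$$
where
$$P(\theta,\mu)(dx)=\frac{1}{L_\mu(\theta)}\exp\langle\theta,x\rangle\,\mu(dx),$$
is called the natural exponential family (abbreviated NEF) generated by $\mu$. The mapping
$$\Theta(\mu)\to F(\mu),\quad\theta\mapsto P(\theta,\mu),$$
is called the canonical parametrization of $F(\mu)$. A simple example of a natural exponential family is given by the family of binomial distributions $B(n,p)$, $0<p<1$, with fixed parameter $n$, generated by the measure
$$\mu(dx)=\sum_{k=0}^{n}\binom{n}{k}\delta_k(dx),$$
where $\delta_k$ is the Dirac measure (cf. Measure) on $k$ (cf. also Binomial distribution). Here, with $p=e^\theta/(1+e^\theta)$ and $q=1-p$ one has
$$P(\theta,\mu)(dx)=\sum_{k=0}^{n}\binom{n}{k}p^kq^{n-k}\delta_k(dx).$$
Note that the canonical parametrization by $\theta$ generally differs from a more familiar parametrization if the natural exponential family is a classical family. This is illustrated by the above example, where the parametrization by $p$ is traditional.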
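The binomial example can be checked numerically. The following is a minimal sketch, assuming the generating measure $\mu=\sum_{k=0}^{n}\binom{n}{k}\delta_k$ and the relation $p=e^\theta/(1+e^\theta)$; the function names are illustrative and are not taken from the references.

```python
from math import comb, exp, log

def laplace_transform(theta, n):
    """L_mu(theta) for the assumed measure mu = sum_k C(n, k) * delta_k."""
    return sum(comb(n, k) * exp(theta * k) for k in range(n + 1))

def nef_pmf(theta, n):
    """Weights of P(theta, mu) on the points 0, 1, ..., n."""
    L = laplace_transform(theta, n)
    return [comb(n, k) * exp(theta * k) / L for k in range(n + 1)]

def binomial_pmf(p, n):
    """Classical binomial weights, parametrized by p."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def p_from_canonical(theta):
    """Traditional parameter p as a function of the canonical parameter theta."""
    return exp(theta) / (1 + exp(theta))

def canonical_from_p(p):
    """Canonical parameter theta = log(p / (1 - p)), for 0 < p < 1."""
    return log(p / (1 - p))
```

For any real $\theta$, `nef_pmf(theta, n)` and `binomial_pmf(p_from_canonical(theta), n)` should agree up to rounding, which illustrates that the two parametrizations describe the same family.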
A general exponential family (abbreviated GEF) is defined on an abstract measure space $(\Omega,\mathcal{A},\nu)$ (the measure $\nu$ is not necessarily bounded) by a measurable mapping $t$ from $\Omega$ to a finite-dimensional real linear space $E$. This mapping must have the following property: the image $\mu$ of $\nu$ by $t$ must be such that $\mu$ is not concentrated on an affine hyperplane of $E$, and such that $\Theta(\mu)$ is not empty. Under these circumstances, the general exponential family on $\Omega$ generated by $(t,\nu)$ is:
$$F(t,\nu)=\{P(\theta,t,\nu):\theta\in\Theta(\mu)\},$$
where
$$P(\theta,t,\nu)(d\omega)=\frac{1}{L_\mu(\theta)}\exp\langle\theta,t(\omega)\rangle\,\nu(d\omega).$$
In this case, the NEF $F(\mu)$ on $E$ is said to be associated to the GEF $F(t,\nu)$. In a sense, all results about GEFs are actually results about their associated NEF. The dimension of $E$ is called the order of the general exponential family.
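The most celebrated example of a general exponential family is the family of the normal distributions $N(m,\sigma^2)$ on $\mathbb{R}$, where the mean $m$ and the variance $\sigma^2$ are both unknown parameters (cf. also Normal distribution). Here, $\nu$ is Lebesgue measure on $\Omega=\mathbb{R}$, the space $E$ is $\mathbb{R}^2$ and $t(\omega)$ is $(\omega,\omega^2/2)$. Here, again, the canonical parametrization is not the classical one but is related to it by $\theta_1=m/\sigma^2$ and $\theta_2=-1/\sigma^2$. The associated NEF is concentrated on a parabola in $\mathbb{R}^2$.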
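The normal example can be verified directly. The sketch below assumes that $\nu$ is Lebesgue measure on $\mathbb{R}$, that $t(\omega)=(\omega,\omega^2/2)$, and that $\theta_1=m/\sigma^2$, $\theta_2=-1/\sigma^2$; under these assumptions the Laplace transform of the image measure has the closed form $L_\mu(\theta)=\sqrt{2\pi/(-\theta_2)}\,\exp(\theta_1^2/(-2\theta_2))$ for $\theta_2<0$. The function names are illustrative.

```python
from math import exp, pi, sqrt

def canonical_from_classical(m, sigma2):
    """Map (mean, variance) to the canonical parameter (theta1, theta2)."""
    return m / sigma2, -1.0 / sigma2

def laplace_transform(theta1, theta2):
    """Integral of exp(theta1*w + theta2*w**2/2) dw over the real line."""
    if theta2 >= 0:
        raise ValueError("the integral is finite only for theta2 < 0")
    a = -theta2
    return sqrt(2 * pi / a) * exp(theta1**2 / (2 * a))

def gef_density(omega, theta1, theta2):
    """Density of P(theta, t, nu) with respect to Lebesgue measure."""
    unnormalized = exp(theta1 * omega + theta2 * omega**2 / 2)
    return unnormalized / laplace_transform(theta1, theta2)

def normal_density(omega, m, sigma2):
    """Classical N(m, sigma^2) density, for comparison."""
    return exp(-(omega - m) ** 2 / (2 * sigma2)) / sqrt(2 * pi * sigma2)
```

Evaluating `gef_density(w, *canonical_from_classical(m, s2))` and `normal_density(w, m, s2)` at the same point should give the same value up to rounding.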
A common incorrect statement about such a model says that it belongs to "the" exponential family. Such a statement is induced by a confusion between a definite probability distribution and a family of them. When a NEF is concentrated on the set $\mathbb{N}$ of non-negative integers, its elements are sometimes called "power series" distributions, since the Laplace transform is more conveniently written $L_\mu(\theta)=f(e^\theta)$, where $f$ is analytic around $0$. The same confusion arises here.
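A standard instance of the power series form is the Poisson family. The sketch below assumes the generating measure $\mu=\sum_{k\ge 0}\delta_k/k!$, for which $f(z)=e^z$, so that the Laplace transform is $\exp(e^\theta)$ and $P(\theta,\mu)$ is the Poisson distribution with mean $e^\theta$. The truncation level `terms` is an illustrative choice, not part of the definition.

```python
from math import exp, factorial

def laplace_transform(theta, terms=60):
    """Truncated series for L_mu(theta) with mu = sum_k delta_k / k!."""
    return sum(exp(theta * k) / factorial(k) for k in range(terms))

def nef_pmf(theta, k):
    """P(theta, mu)({k}), using the closed form L_mu(theta) = exp(e^theta)."""
    return exp(theta * k) / factorial(k) / exp(exp(theta))

def poisson_pmf(rate, k):
    """Classical Poisson weight with mean `rate`, for comparison."""
    return exp(-rate) * rate**k / factorial(k)
```

Here the classical parameter is the mean $e^\theta$, while the canonical parameter is $\theta$ itself.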
There are several variations of the above definition of a GEF: mostly, the parameter $\theta$ is taken to belong to $D(\mu)$ and not only to $\Theta(\mu)$, thus obtaining what one may call a full-NEF. A full-GEF is similarly obtained. However, many results no longer hold for such an extension: for instance, this is the case for the NEF on $\mathbb{R}$ generated by a positive stable distribution $\mu$ with parameter $1/2$: this NEF is a family of inverse Gaussian distributions, with exponential moments, while $\mu$ has no expectation and belongs to the full-NEF.
A more genuine extension gives curved exponential families (abbreviated CEF). In this case, the set of parameters is restricted to a non-affine subset of $\Theta(\mu)$, generally a manifold. However, this extension is in a sense too general, since most of the models in statistics can be regarded as a CEF. The reason is the following: Starting from a statistical model given by densities $f_\alpha$ with respect to a fixed measure $\nu$, where the index $\alpha$ runs over some set $A$, the model is a CEF if and only if the linear space of functions generated by the set $\{\log f_\alpha:\alpha\in A\}$ is finite dimensional. This is also why exponential families constructed on infinite-dimensional spaces are uninteresting (at least without further structure). For these CEFs, there are no really general results available concerning the application of the maximum-likelihood method. General references are [a2] and [a5].
The exponential dispersion model (abbreviated EDM) is a concept which is related to natural exponential families as follows: starting from the NEF $F(\mu)$ on $E$, the Jørgensen set $\Lambda(\mu)$ is the set of positive $\lambda$ such that there exists a positive measure $\mu_\lambda$ on $E$ whose Laplace transform is $(L_\mu)^\lambda$ (see [a4]). Trivially, it contains all positive integers. The model
$$\{P(\theta,\mu_\lambda):\theta\in\Theta(\mu),\ \lambda\in\Lambda(\mu)\}$$
is the exponential dispersion model generated by $\mu$. It has the following striking property: Let $\theta$ be fixed in $\Theta(\mu)$, let $\lambda_1,\dots,\lambda_n$ be in $\Lambda(\mu)$ and let $X_1,\dots,X_n$ be independent random variables with respective distributions $P(\theta,\mu_{\lambda_j})$, with $j=1,\dots,n$. Then the distribution of $(X_1,\dots,X_n)$ conditioned by $S=X_1+\dots+X_n$ does not depend on $\theta$. The distribution of $S$ is obviously $P(\theta,\mu_\lambda)$ with $\lambda=\lambda_1+\dots+\lambda_n$. Furthermore, if the parameters $\lambda_1,\dots,\lambda_n$ are known, and if $\theta$ is unknown, then the maximum-likelihood estimate of $\theta$ based on the observations $(X_1,\dots,X_n)$ is the same as the one based on $S$ alone. For instance, if the NEF is the Bernoulli family of distributions $q\delta_0+p\delta_1$ on $0$ and $1$, and if $X_1,\dots,X_n$ are independent Bernoulli random variables with the same unknown $p$, then in order to estimate $p$ it is useless to keep track of the individual values of the $X_j$. All necessary information about $p$ is contained in $S$, which has a binomial distribution $B(n,p)$.
Thus, the problem of estimating the canonical parameter $\theta$, given $n$ independent observations $X_1,\dots,X_n$, for a NEF model $F(\mu)$ is reduced to the problem of estimating $\theta$ with only one observation $S$, whose distribution is in the NEF $F(\mu_n)$. See Natural exponential family of probability distributions for details about estimation by the maximum-likelihood method. When dealing with a GEF, the problem is reduced to the associated NEF.
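The Bernoulli example above can be checked by brute force for small $n$. The following sketch enumerates all binary sequences and computes the conditional distribution of $(X_1,\dots,X_n)$ given the sum $S$; the function name is illustrative. For independent Bernoulli variables with common parameter $p$, each sequence with sum $s$ should receive conditional probability $1/\binom{n}{s}$, whatever the value of $p$.

```python
from itertools import product
from math import comb

def conditional_given_sum(p, n):
    """Map each binary sequence x to P(X = x | S = sum(x)) for i.i.d. Bernoulli(p)."""
    # Joint probability of every sequence x in {0, 1}^n.
    joint = {
        x: p ** sum(x) * (1 - p) ** (n - sum(x))
        for x in product((0, 1), repeat=n)
    }
    # Probability of each value of the sum S.
    prob_sum = {}
    for x, weight in joint.items():
        prob_sum[sum(x)] = prob_sum.get(sum(x), 0.0) + weight
    # Conditional probability of x given its own sum.
    return {x: weight / prob_sum[sum(x)] for x, weight in joint.items()}
```

Comparing `conditional_given_sum(0.2, 4)` with `conditional_given_sum(0.7, 4)` entry by entry shows the same values, in line with the claim that $S$ carries all the information about $p$.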
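Bayesian theory (cf. also Bayesian approach) also constitutes a successful domain of application of exponential families. Given a NEF $F(\mu)$ and a positive measure $\alpha(d\theta)$ on $\Theta(\mu)$, consider the set of pairs $(v,\lambda)$, with $v\in E$ and $\lambda>0$, such that
$$\pi_{v,\lambda}(d\theta)=A(v,\lambda)\,(L_\mu(\theta))^{-\lambda}\exp\langle\theta,v\rangle\,\alpha(d\theta)$$
is a probability for some number $A(v,\lambda)$, and assume that this set is not empty. This set of a priori distributions on the parameter space is an example of a conjugate family. This means that if the random variable $(\theta,X)$ has distribution $\pi_{v,\lambda}(d\theta)P(\theta,\mu)(dx)$, then the distribution of $\theta$ conditioned by $X$ (a posteriori distribution) is $\pi_{v',\lambda'}$ for some $(v',\lambda')$ depending on $v,\lambda,X$. See [a1] for a complete study; however, [a3] is devoted to the case $\alpha(d\theta)=d\theta$, which has special properties and has, for many years, been the only serious study of the subject.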
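The conjugacy property can be illustrated on the Bernoulli NEF. The sketch below assumes a prior of the form $\pi_{v,\lambda}(d\theta)\propto(L_\mu(\theta))^{-\lambda}e^{\theta v}\,d\theta$ with $L_\mu(\theta)=1+e^\theta$ (generating measure $\delta_0+\delta_1$), and works with unnormalized log-densities. Under these assumptions, multiplying the prior kernel by the likelihood of one observation $x$ gives a kernel of the same form with $(v',\lambda')=(v+x,\lambda+1)$. The function names are illustrative.

```python
from math import exp, log

def log_prior_kernel(theta, v, lam):
    """Log of the unnormalized prior density exp(theta*v) * L(theta)**(-lam)."""
    return theta * v - lam * log(1 + exp(theta))

def log_likelihood(theta, x):
    """Log of the Bernoulli weight exp(theta*x) / L(theta), for x in {0, 1}."""
    return theta * x - log(1 + exp(theta))

def posterior_parameters(v, lam, x):
    """Updated pair (v', lam') after a single observation x (assumed form)."""
    return v + x, lam + 1
```

For any $\theta$, `log_prior_kernel(theta, v, lam) + log_likelihood(theta, x)` should equal `log_prior_kernel(theta, *posterior_parameters(v, lam, x))` up to rounding, so the posterior stays in the same family.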
References
[a1] S. Bar-Lev, P. Enis, G. Letac, "Sampling models which admit a given general exponential family as a conjugate family of priors", Ann. Statist., 22 (1994) pp. 1555–1586
[a2] O. Barndorff-Nielsen, "Information and exponential families in statistical theory", Wiley (1978)
[a3] P. Diaconis, D. Ylvisaker, "Conjugate priors for exponential families", Ann. Statist., 7 (1979) pp. 269–281
[a4] B. Jørgensen, "Exponential dispersion models", J. R. Statist. Soc. Ser. B, 49 (1987) pp. 127–162
[a5] G. Letac, "Lectures on natural exponential families and their variance functions", Monogr. Mat., 50, Inst. Mat. Pura Aplic. Rio (1992)
Exponential family of probability distributions. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Exponential_family_of_probability_distributions&oldid=50285