Invariance of a statistical procedure
Revision as of 22:13, 5 June 2020
The equivariance (see below) of a decision rule under a group $ G $ of symmetries, in a statistical problem whose statement admits this group.
The notion of invariance of a statistical procedure arises in the first instance in so-called parametric problems of mathematical statistics, when there is a priori information: the probability distribution $ P ( d \omega ) $
of the outcomes $ \omega $
of an observation belongs to a known family $ \{ {P _ \theta } : {\theta \in \Theta } \} $.
A statistical decision problem is said to be $ G $-equivariant under a group $ G $
of measurable transformations $ g $
of a measurable space $ ( \Omega , B _ \Omega ) $
of outcomes if the following conditions hold: 1) there is a homomorphism $ f $
of $ G $
onto a group $ \overline{G}\; $
of transformations of the parameter space $ \Theta $,
$$ f : g \rightarrow \overline{g}\; \in \overline{G}\; ,\ \forall g \in G , $$
with the property
$$ ( P _ \theta g ) ( \cdot ) = P _ {\overline{g}\; ( \theta ) } ( \cdot ) ,\ \ \forall g \in G ; $$
2) there exists a homomorphism $ h $ of $ G $ onto a group $ \widehat{G} $ of measurable transformations of a measurable space $ ( D , B _ {D} ) $ of decisions $ d $,
$$ h : g \rightarrow \widehat{g} \in \widehat{G} ,\ \forall g \in G , $$
with the property
$$ L ( \overline{g}\; ( \theta ) , \widehat{g} ( d ) ) = L ( \theta , d ) , $$
where $ L ( \theta , d ) $ is the loss function; and 3) all the additional a priori information on the possible values of the parameter (the a priori density $ p ( \theta ) $, the subdivision into alternatives $ \Theta = \Theta _ {1} \cup \dots \cup \Theta _ {s} $, etc.) is $ G $-invariant or $ G $-equivariant. Under these conditions, a decision rule $ \delta : \omega \rightarrow \delta ( \omega ) \in D $, whether deterministic or randomized, is called an invariant (more precisely, a $ G $-equivariant) procedure if
$$ \delta ( g ( \omega ) ) = \widehat{g} ( \delta ( \omega ) ) ,\ \ \forall \omega \in \Omega ,\ \forall g \in G . $$
The risk
$$ r _ \delta ( \theta ) = {\mathsf E} _ \theta L ( \theta , \delta ( \omega ) ) $$
of an equivariant decision procedure $ \delta $ is $ G $-invariant; in particular, it does not depend on $ \theta $ if the group $ G $ acts transitively on $ \Theta $ (through $ \overline{G}\; $).
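The invariance of the risk can be verified in one line. The sketch below assumes that condition 1) is read as follows: if $ \omega $ has distribution $ P _ \theta $, then $ g ( \omega ) $ has distribution $ P _ {\overline{g}\; ( \theta ) } $. Under that reading, for every $ g \in G $,
$$ r _ \delta ( \overline{g}\; ( \theta ) ) = {\mathsf E} _ {\overline{g}\; ( \theta ) } L ( \overline{g}\; ( \theta ) , \delta ( \omega ) ) = {\mathsf E} _ \theta L ( \overline{g}\; ( \theta ) , \delta ( g ( \omega ) ) ) = {\mathsf E} _ \theta L ( \overline{g}\; ( \theta ) , \widehat{g} ( \delta ( \omega ) ) ) = {\mathsf E} _ \theta L ( \theta , \delta ( \omega ) ) = r _ \delta ( \theta ) . $$
The second equality is the change of variable given by condition 1), the third is the equivariance of $ \delta $, and the fourth is the invariance of the loss in condition 2). If the action on $ \Theta $ is transitive, every $ \theta ^ \prime \in \Theta $ has the form $ \overline{g}\; ( \theta ) $ for some $ g $, so $ r _ \delta $ is constant.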
In parametric problems there is, in general, no optimal decision procedure that minimizes the risk for every value of the parameter $ \theta \in \Theta $. In particular, a procedure may achieve very small values of the risk for certain values of $ \theta $ at the expense of worse performance for other values of the parameter that are a priori equally possible. Equivariance guarantees to some extent that the approach is unbiased. When the group $ G $ is sufficiently rich, there is an optimal invariant procedure with a uniformly minimal risk among the invariant procedures.
Invariant procedures are widely applied in hypothesis testing (see also Invariant test) and in the estimation of the parameters of a probability distribution. Thus, in the problem of estimating an unknown vector of means for the family of $ m $-dimensional normal distributions
$$ p ( \mathbf x , \pmb\alpha ) = \frac{1}{( 2 \pi ) ^ {m/2} } \mathop{\rm exp} \left [ \frac{- \sum _ {j} ( x _ {j} - \alpha _ {j} ) ^ {2} }{2} \right ] $$
with unit covariance matrix and quadratic loss function $ \sum _ {j} ( \delta _ {j} - \alpha _ {j} ) ^ {2} $, the optimal equivariant estimator is the ordinary sample mean
$$ \mathbf x ^ {*} = \frac{\mathbf x ^ {(1)} + \dots + \mathbf x ^ {(N)} }{N} . $$
Here the group $ G $ is given by the product of the group $ S _ {N} $ of permutations of the observations and the group $ \mathop{\rm Ort} ( m ) $ of motions of the Euclidean space $ \mathbf R ^ {m} $; $ \overline{G}\; = \widehat{G} = \mathop{\rm Ort} ( m ) $. For $ m \geq 3 $, there exist for this problem non-equivariant estimators whose risk is smaller than that of $ \mathbf x ^ {*} $ for all $ \pmb\alpha $; however, the region of essential "superefficiency" turns out to be insignificant and shrinks indefinitely as the size $ N $ of the sample increases. The possibility of superefficient procedures is connected with the non-compactness of $ G $.
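The statements in this example can be checked numerically. The following Python sketch is an illustration only, resting on several assumptions that are not in the text. A "motion" is taken to be a map $ v \rightarrow Q v + b $ with $ Q $ orthogonal. The values of `m`, `N` and the number of replications are arbitrary. The non-equivariant competitor is the James-Stein estimator shrinking towards the origin, a well-known example that the article itself does not name. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
m, N = 5, 20  # dimension and sample size (illustrative values)


def sample_mean(x):
    """Sample mean over the observation axis; x has shape (..., N, m)."""
    return x.mean(axis=-2)


def james_stein(x):
    """James-Stein shrinkage of the sample mean towards 0 (assumed example)."""
    n_obs, dim = x.shape[-2], x.shape[-1]
    xbar = x.mean(axis=-2)
    norm2 = np.sum(xbar**2, axis=-1, keepdims=True)
    return (1.0 - (dim - 2) / (n_obs * norm2)) * xbar


def mc_risk(estimator, alpha, reps=20000):
    """Monte Carlo estimate of the quadratic risk E |estimator(x) - alpha|^2."""
    xs = alpha + rng.standard_normal((reps, N, m))
    return np.mean(np.sum((estimator(xs) - alpha) ** 2, axis=-1))


# A random element of G: a permutation of the observations and a motion of R^m.
perm = rng.permutation(N)
q, _ = np.linalg.qr(rng.standard_normal((m, m)))  # q is orthogonal
b = rng.standard_normal(m)

alpha = rng.standard_normal(m)
x = alpha + rng.standard_normal((N, m))

# Equivariance: delta(g(x)) should equal g_hat(delta(x)).
gx = x[perm] @ q.T + b
equivariance_gap = np.max(np.abs(sample_mean(gx) - (q @ sample_mean(x) + b)))

# The risk of the sample mean should be the same at alpha and at g_bar(alpha).
risk_at_alpha = mc_risk(sample_mean, alpha)
risk_at_moved = mc_risk(sample_mean, q @ alpha + b)

# At alpha = 0 the non-equivariant James-Stein estimator has smaller risk.
zero = np.zeros(m)
risk_mean_zero = mc_risk(sample_mean, zero)
risk_js_zero = mc_risk(james_stein, zero)

print("equivariance gap:", equivariance_gap)
print("risk of sample mean:", risk_at_alpha, risk_at_moved, "theory:", m / N)
print("risk at 0, sample mean vs James-Stein:", risk_mean_zero, risk_js_zero)
```

The sample mean has exact risk $ m / N $ for every $ \pmb\alpha $, so the two Monte Carlo estimates should agree with this value up to simulation error. The James-Stein estimator is not equivariant under translations, which is why its risk is allowed to depend on $ \pmb\alpha $.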
Equivariant statistical procedures also arise in a number of non-parametric statistical problems, when the a priori family of distributions $ P $ of outcomes is essentially infinite-dimensional, as well as in the construction of confidence sets for the parameter $ \theta $ of the distribution in the presence of nuisance parameters.
References
[1] E.L. Lehmann, "Testing statistical hypotheses", Wiley (1986)
Invariance of a statistical procedure. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Invariance_of_a_statistical_procedure&oldid=47409