Difference between revisions of "Invariance of a statistical procedure"

From Encyclopedia of Mathematics
Revision as of 22:13, 5 June 2020


The equivariance (see below) of a decision rule in a statistical problem whose statement admits a group $ G $ of symmetries, under this group $ G $. The notion of invariance of a statistical procedure arises first of all in so-called parametric problems of mathematical statistics, where there is a priori information: the probability distribution $ P ( d \omega ) $ of the outcomes $ \omega $ of an observation belongs to a known family $ \{ P _ \theta : \theta \in \Theta \} $. A statistical decision problem is said to be $ G $-equivariant under a group $ G $ of measurable transformations $ g $ of a measurable space $ ( \Omega , B _ \Omega ) $ of outcomes if the following conditions hold: 1) there is a homomorphism $ f $ of $ G $ onto a group $ \overline{G} $ of transformations of the parameter space $ \Theta $,

$$ f : g \rightarrow \overline{g} \in \overline{G} ,\ \forall g \in G , $$

with the property

$$ ( P _ \theta g ) ( \cdot ) = P _ {\overline{g} ( \theta ) } ( \cdot ) ,\ \ \forall g \in G ; $$

2) there exists a homomorphism $ h $ of $ G $ onto a group $ \widehat{G} $ of measurable transformations of a measurable space $ ( D , B _ {D} ) $ of decisions $ d $,

$$ h : g \rightarrow \widehat{g} \in \widehat{G} ,\ \forall g \in G , $$

with the property

$$ L ( \overline{g} ( \theta ) , \widehat{g} ( d ) ) = L ( \theta , d ) , $$

where $ L ( \theta , d ) $ is the loss function; and 3) all the additional a priori information on the possible values of the parameter (the a priori density $ p ( \theta ) $, the subdivision into alternatives $ \Theta = \Theta _ {1} \cup \dots \cup \Theta _ {s} $, etc.) is $ G $-invariant or $ G $-equivariant. Under these conditions, a decision rule $ \delta : \omega \rightarrow \delta ( \omega ) \in D $, whether deterministic or randomized, is called an invariant (more precisely, a $ G $-equivariant) procedure if

$$ \delta ( g ( \omega ) ) = \widehat{g} ( \delta ( \omega ) ) ,\ \ \forall \omega \in \Omega ,\ \forall g \in G . $$
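
The defining identity can be checked numerically for a single $ g $ and a single $ \omega $. The following is only a minimal sketch under stated assumptions: outcomes are taken to be samples of points in the plane, $ g $ permutes the observations and rotates each of them, and $ \widehat{g} $ is the same rotation acting on the decision. All function names are hypothetical and chosen for illustration.

```python
import math
import random


def sample_mean(sample):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(sample)
    return [sum(x[j] for x in sample) / n for j in range(len(sample[0]))]


def first_observation(sample):
    """A rule that ignores all but the first observation (not permutation-invariant)."""
    return list(sample[0])


def rotate_plane(v, angle):
    """Rotate a 2-vector by `angle` radians (an orthogonal map of the plane)."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1]]


def act_on_sample(sample, perm, angle):
    """Assumed action of g on outcomes: reorder by `perm`, then rotate each point."""
    return [rotate_plane(sample[i], angle) for i in perm]


def is_equivariant(delta, sample, perm, angle, tol=1e-9):
    """Check delta(g(omega)) == g_hat(delta(omega)) for one g and one omega."""
    lhs = delta(act_on_sample(sample, perm, angle))
    rhs = rotate_plane(delta(sample), angle)  # g_hat acts on decisions by rotation only
    return all(abs(a - b) <= tol for a, b in zip(lhs, rhs))


rng = random.Random(0)
sample = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(5)]
perm = [1, 2, 3, 4, 0]  # a cyclic shift of the five observations
angle = 0.7

mean_ok = is_equivariant(sample_mean, sample, perm, angle)
first_ok = is_equivariant(first_observation, sample, perm, angle)
print(mean_ok, first_ok)
```

A check of this kind can only refute equivariance for the particular $ g $ and $ \omega $ tried; it does not prove the identity for all of $ G $ and $ \Omega $.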

The risk

$$ r _ \delta ( \theta ) = {\mathsf E} _ \theta L ( \theta , \delta ( \omega ) ) $$

of an equivariant decision procedure $ \delta $ is $ G $-invariant; in particular, it does not depend on $ \theta $ if the group $ G $ acts transitively on $ \Theta $.
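
The invariance of the risk follows directly from the defining identities. As a sketch for a non-randomized rule, and reading $ P _ \theta g $ as the distribution of $ g ( \omega ) $ when $ \omega $ has distribution $ P _ \theta $ (an assumption about the notation), one has for every $ g \in G $:

$$ r _ \delta ( \overline{g} ( \theta ) ) = {\mathsf E} _ {\overline{g} ( \theta ) } L ( \overline{g} ( \theta ) , \delta ( \omega ) ) = {\mathsf E} _ \theta L ( \overline{g} ( \theta ) , \delta ( g ( \omega ) ) ) = {\mathsf E} _ \theta L ( \overline{g} ( \theta ) , \widehat{g} ( \delta ( \omega ) ) ) = {\mathsf E} _ \theta L ( \theta , \delta ( \omega ) ) = r _ \delta ( \theta ) . $$

The second equality uses condition 1), the third the equivariance of $ \delta $, and the fourth condition 2). If the action on $ \Theta $ is transitive, every $ \theta ^ \prime \in \Theta $ has the form $ \overline{g} ( \theta ) $ for some $ g $, so the risk takes the same value at all points of $ \Theta $.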

In parametric problems there is, in general, no optimal decision procedure that minimizes the risk for every value of the parameter $ \theta \in \Theta $. In particular, a procedure may achieve very small risk for certain values of $ \theta $ at the expense of worse performance for other, a priori equally possible, values of the parameter. Equivariance guarantees to some extent that the approach is unbiased. When the group $ G $ is sufficiently rich, there is an optimal invariant procedure with uniformly minimal risk among the invariant procedures.

Invariant procedures are widely applied in hypothesis testing (see also Invariant test) and in the estimation of the parameters of a probability distribution. Thus, in the problem of estimating an unknown vector of means for the family of $ m $-dimensional normal distributions

$$ p ( \mathbf x , \pmb\alpha ) = \frac{1}{( 2 \pi ) ^ {m/2} } \mathop{\rm exp} \left [ \frac{- \sum _ {j} ( x _ {j} - \alpha _ {j} ) ^ {2} }{2} \right ] $$

with unit covariance matrix and quadratic loss function $ \sum _ {j} ( \delta _ {j} - \alpha _ {j} ) ^ {2} $, the optimal equivariant estimator is the ordinary sample mean

$$ \mathbf x ^ {*} = \frac{\mathbf x ^ {(1)} + \dots + \mathbf x ^ {(N)} }{N} . $$

Here the group $ G $ is given by the product of the group $ S _ {N} $ of permutations of the observations and the group $ \mathop{\rm Ort} ( m ) $ of motions of the Euclidean space $ \mathbf R ^ {m} $; $ \overline{G} = \widehat{G} = \mathop{\rm Ort} ( m ) $. For $ m \geq 3 $, there exist for this problem non-equivariant estimators whose risk is smaller than that of $ \mathbf x ^ {*} $ for all $ \pmb\alpha $; however, the region of essential "superefficiency" turns out to be insignificant and diminishes without bound as the size $ N $ of the sample increases. The possibility of superefficient procedures is connected with the non-compactness of $ G $.
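
The effect for $ m \geq 3 $ can be illustrated by simulation. The text does not name a particular non-equivariant estimator; the sketch below assumes the James-Stein estimator, which shrinks the sample mean toward the origin, and estimates both risks by Monte Carlo. The dimension, sample size, number of trials and the two parameter points are arbitrary illustrative choices.

```python
import random


def sample_mean(sample):
    """Coordinate-wise mean of N observations in R^m."""
    n = len(sample)
    return [sum(x[j] for x in sample) / n for j in range(len(sample[0]))]


def james_stein(sample):
    """James-Stein shrinkage of the sample mean toward the origin (assumed estimator)."""
    n = len(sample)
    mean = sample_mean(sample)
    m = len(mean)
    norm_sq = sum(c * c for c in mean)
    # The sample mean has covariance I/n, hence the factor 1/n in the shrinkage term.
    factor = 1.0 - (m - 2) / (n * norm_sq)
    return [factor * c for c in mean]


def estimate_risk(estimator, alpha, n, trials, rng):
    """Monte Carlo estimate of the quadratic risk at the parameter point alpha."""
    total = 0.0
    for _ in range(trials):
        sample = [[rng.gauss(a, 1.0) for a in alpha] for _ in range(n)]
        estimate = estimator(sample)
        total += sum((e - a) ** 2 for e, a in zip(estimate, alpha))
    return total / trials


rng = random.Random(12345)
m, n, trials = 5, 4, 4000
near = [0.0] * m   # parameter at the shrinkage target
far = [10.0] * m   # parameter far from the shrinkage target

risk_mean_near = estimate_risk(sample_mean, near, n, trials, rng)
risk_js_near = estimate_risk(james_stein, near, n, trials, rng)
risk_mean_far = estimate_risk(sample_mean, far, n, trials, rng)
risk_js_far = estimate_risk(james_stein, far, n, trials, rng)

print("near origin:", risk_mean_near, risk_js_near)
print("far away:   ", risk_mean_far, risk_js_far)
```

The risk of the sample mean is $ m / N $ at every $ \pmb\alpha $, in line with the invariance of the risk of an equivariant procedure. The gain from shrinkage is concentrated near the point toward which the estimator shrinks; far from that point the two estimated risks are expected to be nearly equal, up to simulation noise.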

Equivariant statistical procedures also arise in a number of non-parametric statistical problems, when the a priori family of distributions $ P $ of outcomes is essentially infinite-dimensional, as well as in the construction of confidence sets for the parameter $ \theta $ of the distribution in the presence of nuisance parameters.

References

[1] E.L. Lehmann, "Testing statistical hypotheses", Wiley (1986)
How to Cite This Entry:
Invariance of a statistical procedure. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Invariance_of_a_statistical_procedure&oldid=47409
This article was adapted from an original article by N.N. Chentsov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.