Least-favourable distribution


An a priori distribution maximizing the risk function in a statistical problem of decision making.

Suppose that, based on a realization of a random variable $X$ with values in a sample space $(\mathfrak X, \mathfrak B_{\mathfrak X}, P_\theta)$, $\theta \in \Theta$, one has to choose a decision $d$ from a decision space $(\mathfrak D, \mathfrak B_{\mathfrak D})$; it is assumed here that the unknown parameter $\theta$ is a random variable taking values in a probability space $(\Theta, \mathfrak B_\Theta, \pi_t)$, $t \in T$. Let $L(\theta, d)$ be a function representing the loss incurred by adopting the decision $d$ if the true value of the parameter is $\theta$. An a priori distribution $\pi_{t^*}$ from the family $\{\pi_t : t \in T\}$ is said to be least favourable for a decision $d$ in the statistical problem of decision making using the Bayesian approach if

$$ \sup_{t \in T} \rho(\pi_t, d) = \rho(\pi_{t^*}, d), $$

where

$$ \rho(\pi_t, d) = \int_\Theta \int_{\mathfrak X} L(\theta, d(x)) \, dP_\theta(x) \, d\pi_t(\theta) $$

is the risk function, representing the mean loss incurred by adopting the decision $d$. A least-favourable distribution $\pi_{t^*}$ makes it possible to calculate the "greatest" (on the average) loss $\rho(\pi_{t^*}, d)$ incurred by adopting $d$. In practical work one is, as a rule, not guided by the least-favourable distribution; on the contrary, one strives to adopt a decision that safeguards against the maximum loss as $\theta$ varies. This leads to the search for a minimax decision $d^*$ minimizing the maximum risk, i.e.

$$ \inf_{d \in \mathfrak D} \sup_{t \in T} \rho(\pi_t, d) = \sup_{t \in T} \rho(\pi_t, d^*). $$
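As a purely illustrative sketch (not part of the original article), the following Python snippet works these definitions out in a toy setting that is entirely assumed: a two-point sample space, a two-point parameter set, 0-1 loss, and the one-parameter family of priors $\pi_t = (1-t, t)$. For each decision rule $d$ it locates the least-favourable prior, and it then selects the minimax rule $d^*$.

```python
import itertools
import numpy as np

# Assumed toy model: sample space X = {0, 1}, parameter set Theta = {0, 1}.
# P[theta, x] = P_theta(x), the sampling distribution for each parameter value.
P = np.array([[0.8, 0.2],    # P_{theta = 0}
              [0.3, 0.7]])   # P_{theta = 1}

L = 1.0 - np.eye(2)          # 0-1 loss: L(theta, a) = 1 iff a != theta

ts = np.linspace(0.0, 1.0, 101)   # grid of priors pi_t = (1 - t, t) on Theta

def risk(t, rule):
    """rho(pi_t, d) = sum_theta pi_t(theta) sum_x L(theta, d(x)) P_theta(x)."""
    pi = np.array([1.0 - t, t])
    return sum(pi[th] * P[th, x] * L[th, rule[x]]
               for th in range(2) for x in range(2))

rules = list(itertools.product([0, 1], repeat=2))   # all rules x -> decision

# Least-favourable prior for each rule: the t maximizing the risk.
# Minimax rule: the rule minimizing that maximal risk.
worst = {rule: max(risk(t, rule) for t in ts) for rule in rules}
d_star = min(worst, key=worst.get)
t_star = max(ts, key=lambda t: risk(t, d_star))
print(f"minimax rule d*: x=0 -> {d_star[0]}, x=1 -> {d_star[1]}")
print(f"least-favourable t* for d*: {t_star:.2f}, max risk {worst[d_star]:.3f}")
```

In this toy problem the supremum over $t$ happens to be attained at an endpoint of the grid, i.e. at a degenerate prior; in general a least-favourable distribution need not be degenerate.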

When testing a composite statistical hypothesis against a simple alternative, within the Bayesian approach, one defines a least-favourable distribution with the aid of Wald reduction, which may be described as follows. Suppose that, based on a realization of a random variable $X$, one has to test a composite hypothesis $H_0$, according to which the distribution law of $X$ belongs to a family $H_0 = \{P_\theta : \theta \in \Theta\}$, against a simple alternative $H_1$, according to which $X$ obeys a law $Q$; let

$$ p_\theta(x) = \frac{dP_\theta(x)}{d\mu(x)} \quad \textrm{and} \quad q(x) = \frac{dQ(x)}{d\mu(x)}, $$

where $\mu(\cdot)$ is a $\sigma$-finite measure on $(\mathfrak X, \mathfrak B_{\mathfrak X})$ and $\{\pi_t : t \in T\}$ is a family of a priori distributions on $(\Theta, \mathfrak B_\Theta)$. Then, for any $t \in T$, the composite hypothesis $H_0$ can be associated with a simple hypothesis $H_t$, according to which $X$ obeys the probability law with density

$$ f_t(x) = \int_\Theta p_\theta(x) \, d\pi_t(\theta). $$
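To make the reduction concrete, here is a minimal sketch under assumed ingredients (none of them from the article): a three-point parameter set, Gaussian densities $p_\theta$, and a one-parameter family of discrete priors $\pi_t$; SciPy is assumed to be available. It simply forms the mixture density $f_t$, the density of the reduced simple hypothesis $H_t$.

```python
import numpy as np
from scipy.stats import norm

thetas = np.array([-1.0, 0.0, 1.0])   # assumed finite Theta

def prior(t):
    """Assumed family pi_t: weight t on theta = 0, the rest split evenly."""
    return np.array([(1 - t) / 2, t, (1 - t) / 2])

def f_t(x, t):
    """f_t(x) = sum_theta p_theta(x) pi_t(theta), the discrete analogue of
    the integral above, with p_theta taken to be the N(theta, 1) density."""
    return float(np.sum(prior(t) * norm.pdf(x, loc=thetas, scale=1.0)))

print(f_t(0.5, t=0.3))   # density of the reduced hypothesis H_t at x = 0.5
```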

By the Neyman–Pearson lemma for testing a simple hypothesis $H_t$ against a simple alternative $H_1$, there exists a most-powerful test, based on the likelihood ratio. Let $\beta_t$ be the power of this test (cf. Power of a statistical test). Then the least-favourable distribution is the a priori distribution $\pi_{t^*}$ from the family $\{\pi_t : t \in T\}$ such that $\beta_{t^*} \leq \beta_t$ for all $t \in T$. The least-favourable distribution has the property that the density $f_{t^*}(x)$ of $X$ under the hypothesis $H_{t^*}$ is the "least distant" from the alternative density $q(x)$, i.e. the hypothesis $H_{t^*}$ is the member of the family $\{H_t : t \in T\}$ "nearest" to the rival hypothesis $H_1$. See Bayesian approach.
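Continuing the same assumed Gaussian-mixture setting (still purely illustrative), the sketch below discretizes the sample space onto a grid, builds the most-powerful level-$\alpha$ Neyman–Pearson test of each simple hypothesis $H_t$ against an assumed simple alternative $H_1 : X \sim N(2, 1)$, and selects the $t^*$ minimizing the power $\beta_t$; randomization at the threshold is ignored in the grid approximation.

```python
import numpy as np
from scipy.stats import norm

xs = np.linspace(-8.0, 8.0, 4001)     # grid approximation of the sample space
dx = xs[1] - xs[0]
alpha = 0.05
thetas = np.array([-1.0, 0.0, 1.0])
q = norm.pdf(xs, loc=2.0)             # assumed alternative density: N(2, 1)

def f_t(t):
    """Mixture density of H_t on the grid (same assumed prior family)."""
    pi = np.array([(1 - t) / 2, t, (1 - t) / 2])
    return sum(w * norm.pdf(xs, loc=th) for w, th in zip(pi, thetas))

def np_power(t):
    """Power beta_t of the most-powerful level-alpha test of H_t vs H_1."""
    ft = f_t(t)
    lr = q / ft                        # likelihood ratio q / f_t
    order = np.argsort(-lr)            # reject where the ratio is largest
    size = np.cumsum(ft[order]) * dx   # size under H_t as the region grows
    k = np.searchsorted(size, alpha)   # largest region with size <= alpha
    return float(np.sum(q[order[:k]]) * dx)

ts = np.linspace(0.0, 1.0, 21)
t_star = min(ts, key=np_power)         # least-favourable prior: minimal power
print(f"t* = {t_star:.2f}, beta_t* = {np_power(t_star):.3f}")
```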

References

[1] E.L. Lehmann, "Testing statistical hypotheses", Wiley (1986)
[2] S. Zacks, "Theory of statistical inference", Wiley (1971)
How to Cite This Entry:
Least-favourable distribution. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Least-favourable_distribution&oldid=47598
This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.