Least-favourable distribution
An a priori distribution maximizing the risk function in a statistical problem of decision making.
Suppose that, based on a realization of a random variable $ X $ with values in a sample space $ ( \mathfrak X , \mathfrak B _ {\mathfrak X} , P _ \theta ) $, $ \theta \in \Theta $, one has to choose a decision $ d $ from a decision space $ ( \mathfrak D , \mathfrak B _ {\mathfrak D} ) $; it is assumed here that the unknown parameter $ \theta $ is a random variable taking values in a sample space $ ( \Theta , \mathfrak B _ \Theta , \pi _ {t} ) $, $ t \in T $. Let $ L( \theta , d) $ be a function representing the loss incurred by adopting the decision $ d $ if the true value of the parameter is $ \theta $. An a priori distribution $ \pi _ {t ^ {*} } $ from the family $ \{ {\pi _ {t} } : {t \in T } \} $ is said to be least favourable for a decision $ d $ in the statistical problem of decision making using the Bayesian approach if
$$ \sup_{t \in T} \rho(\pi_t, d) = \rho(\pi_{t^*}, d), $$
where
$$ \rho(\pi_t, d) = \int_\Theta \int_{\mathfrak X} L(\theta, d(x)) \, dP_\theta(x) \, d\pi_t(\theta) $$
is the risk function, representing the mean loss incurred by adopting the decision $ d $. A least-favourable distribution $ \pi _ {t ^ {*} } $ makes it possible to calculate the "greatest" (on the average) loss $ \rho ( \pi _ {t ^ {*} } , d) $ incurred by adopting $ d $. In practical work one is, as a rule, not guided by the least-favourable distribution; on the contrary, one strives to adopt a decision that safeguards against the maximum loss as $ \theta $ varies. This leads to the search for a minimax decision $ d ^ {*} $ minimizing the maximum risk, i.e.
$$ \inf_{d \in \mathfrak D} \sup_{t \in T} \rho(\pi_t, d) = \sup_{t \in T} \rho(\pi_t, d^*). $$
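To make these definitions concrete, here is a minimal numerical sketch in Python (all distributions, losses, priors and rule sets below are hypothetical toy choices, not taken from the article). With finite $ \Theta $, $ T $, sample space and decision space, the risk $ \rho ( \pi _ {t} , d) $ becomes a double sum; the least-favourable distribution for a fixed rule $ d $ is found by maximizing over $ t $, and a minimax rule by minimizing the maximum risk:

```python
import numpy as np

# Toy setup: Theta = {0, 1}, sample space {0, 1} (all numbers hypothetical)
P = np.array([[0.8, 0.2],                # P_theta(x) for theta = 0
              [0.3, 0.7]])               # P_theta(x) for theta = 1

# Family {pi_t : t in T} of a priori distributions on Theta
priors = {t: np.array([1.0 - t, t]) for t in (0.25, 0.5, 0.75)}

# Loss L(theta, d) for decisions d in {0, 1} (0-1 loss: "guess theta")
L = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# Non-randomized decision rules d: x -> {0, 1}, enumerated exhaustively
rules = [(a, b) for a in (0, 1) for b in (0, 1)]

def risk(pi, rule):
    """rho(pi_t, d): the double integral reduced to a double sum."""
    return sum(pi[i] * P[i, x] * L[i, rule[x]]
               for i in range(2) for x in (0, 1))

d = (0, 1)                               # a fixed rule, d(x) = x
t_star = max(priors, key=lambda t: risk(priors[t], d))   # least favourable
print("sup_t rho(pi_t, d) =", risk(priors[t_star], d), "at t* =", t_star)

# Minimax rule d*: attains inf over d of sup over t of rho(pi_t, d)
d_star = min(rules, key=lambda r: max(risk(pi, r) for pi in priors.values()))
print("minimax rule d* =", d_star)
```

In this toy problem the rule $ d( x) = x $ turns out to be minimax within the enumerated family, with maximum risk $ 0.275 $ attained at the least-favourable prior $ \pi _ {0.75 } $.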
When testing a composite statistical hypothesis against a simple alternative, within the Bayesian approach, one defines a least-favourable distribution with the aid of Wald reduction, which may be described as follows. Suppose that, based on a realization of a random variable $ X $, one has to test a composite hypothesis $ H _ {0} $, according to which the distribution law of $ X $ belongs to a family $ H _ {0} = \{ {P _ \theta } : {\theta \in \Theta } \} $, against a simple alternative $ H _ {1} $, according to which $ X $ obeys a law $ Q $; let
$$ p_\theta(x) = \frac{dP_\theta(x)}{d\mu(x)} \quad \textrm{and} \quad q(x) = \frac{dQ(x)}{d\mu(x)}, $$
where $ \mu ( \cdot ) $ is a $ \sigma $-finite measure on $ ( \mathfrak X , \mathfrak B _ {\mathfrak X} ) $ and $ \{ {\pi _ {t} } : {t \in T } \} $ is a family of a priori distributions on $ ( \Theta , \mathfrak B _ \Theta ) $. Then, for any $ t \in T $, the composite hypothesis $ H _ {0} $ can be associated with a simple hypothesis $ H _ {t} $, according to which $ X $ obeys the probability law with density
$$ f_t(x) = \int_\Theta p_\theta(x) \, d\pi_t(\theta). $$
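For a finite sample space this reduction is a one-line computation: $ f _ {t} $ is the prior-weighted average of the densities $ p _ \theta $. A small sketch (the densities are hypothetical toy numbers):

```python
import numpy as np

# p_theta(x) on a two-point sample space (hypothetical toy densities)
P = np.array([[0.8, 0.2],     # p_0(x)
              [0.3, 0.7]])    # p_1(x)

def mixture_density(pi, P):
    """f_t(x) = sum over theta of p_theta(x) * pi_t(theta)."""
    return pi @ P

print(mixture_density(np.array([0.5, 0.5]), P))   # -> [0.55 0.45]
```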
By the Neyman–Pearson lemma for testing a simple hypothesis $ H _ {t} $ against a simple alternative $ H _ {1} $, there exists a most-powerful test, based on the likelihood ratio. Let $ \beta _ {t} $ be the power of this test (cf. Power of a statistical test). Then the least-favourable distribution is the a priori distribution $ \pi _ {t ^ {*} } $ from the family $ \{ {\pi _ {t} } : {t \in T } \} $ such that $ \beta _ {t ^ {*} } \leq \beta _ {t} $ for all $ t \in T $. The least-favourable distribution has the property that the density $ f _ {t ^ {*} } ( x) $ of $ X $ under the hypothesis $ H _ {t ^ {*} } $ is "least distant" from the alternative density $ q ( x) $, i.e. the hypothesis $ H _ {t ^ {*} } $ is the member of the family $ \{ {H _ {t} } : {t \in T } \} $ "nearest" to the rival hypothesis $ H _ {1} $. See Bayesian approach.
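The selection of $ \pi _ {t ^ {*} } $ can be sketched numerically as well (again with hypothetical toy densities and an assumed level $ \alpha $): for each $ t $ one forms the mixture $ f _ {t} $, computes the power $ \beta _ {t} $ of the most-powerful level-$ \alpha $ test of $ H _ {t} $ against $ H _ {1} $ via the Neyman–Pearson construction (randomizing on the boundary point, since the sample space is discrete), and keeps the $ t $ with the smallest power:

```python
import numpy as np

P = np.array([[0.8, 0.2],                      # p_theta(x), toy densities
              [0.3, 0.7]])
priors = {t: np.array([1.0 - t, t]) for t in (0.25, 0.5, 0.75)}
q = np.array([0.5, 0.5])                       # density of X under H_1
alpha = 0.2                                    # significance level (assumed)

def np_power(f_t, q, alpha):
    """Power of the most-powerful level-alpha test of f_t against q."""
    order = np.argsort(-(q / f_t))             # decreasing likelihood ratio
    size = power = 0.0
    for x in order:
        if size + f_t[x] <= alpha:             # whole point fits the region
            size += f_t[x]; power += q[x]
        else:                                  # randomize on boundary point
            power += (alpha - size) / f_t[x] * q[x]
            break
    return power

betas = {t: np_power(pi @ P, q, alpha) for t, pi in priors.items()}
t_star = min(betas, key=betas.get)             # beta_{t*} <= beta_t for all t
print("least-favourable t* =", t_star, betas)
```

In this example the minimizing prior is the one whose mixture $ f _ {t} $ lies closest to $ q $, illustrating the "nearest hypothesis" interpretation above.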
References
[1] E.L. Lehmann, "Testing statistical hypotheses", Wiley (1986)
[2] S. Zacks, "Theory of statistical inference", Wiley (1971)
Least-favourable distribution. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Least-favourable_distribution&oldid=47598