to statistical problems
An approach based on the assumption that to any parameter in a statistical problem there can be assigned a definite probability distribution. Any general statistical decision problem is determined by the following elements: a space $ (X,\ {\mathcal B} _ {X} ) $ of (potential) samples $ x $; a space $ ( \Theta ,\ {\mathcal B} _ \Theta ) $ of values of the unknown parameter $ \theta $; a family of probability distributions $ \{ {\mathsf P} _ \theta : \theta \in \Theta \} $ on $ (X,\ {\mathcal B} _ {X} ) $; a space of decisions $ (D,\ {\mathcal B} _ {D} ) $; and a function $ L( \theta ,\ d) $, which characterizes the loss caused by accepting the decision $ d $ when the true value of the parameter is $ \theta $. The objective of decision making is to find a rule (decision function) $ \delta = \delta (x) $, optimal in a certain sense, assigning to each result of an observation $ x \in X $ a decision $ \delta (x) \in D $. In the Bayesian approach it is assumed that the unknown parameter $ \theta $ is a random variable with a given (a priori) distribution $ \pi = \pi (d \theta ) $ on $ ( \Theta ,\ {\mathcal B} _ \Theta ) $; the best decision function (Bayesian decision function) $ \delta ^ {*} = \delta ^ {*} (x) $ is then defined as the function at which the minimum expected loss $ \inf _ \delta \ \rho ( \pi ,\ \delta ) $ is attained, where
$$ \rho ( \pi ,\ \delta ) \ = \ \int\limits _ \Theta \rho ( \theta ,\ \delta ) \ \pi (d \theta ) , $$
and
$$ \rho ( \theta ,\ \delta ) \ = \ \int\limits _ { X } L ( \theta ,\ \delta (x)) \ {\mathsf P} _ \theta (dx) . $$

Thus,
$$ \rho ( \pi ,\ \delta ^ {*} ) \ = \ \inf _ \delta \ \int\limits _ \Theta \int\limits _ { X } L ( \theta ,\ \delta (x)) \ {\mathsf P} _ \theta (dx) \ \pi ( d \theta ) . $$
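For instance (an illustrative special case, in which the integrals become sums), if $ \Theta $ and $ X $ are finite, the Bayes risk takes the form

$$ \rho ( \pi ,\ \delta ) \ = \ \sum _ {\theta \in \Theta } \pi ( \theta ) \sum _ {x \in X } L ( \theta ,\ \delta (x)) \ {\mathsf P} _ \theta ( \{ x \} ) , $$

and $ \delta ^ {*} $ could in principle be found by enumerating all $ | D | ^ {| X |} $ decision functions; the computation below replaces this enumeration by a separate minimization for each $ x $.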
In searching for the Bayesian decision function $ \delta ^ {*} = \delta ^ {*} (x) $, the following remark is useful. Let $ {\mathsf P} _ \theta (dx) = p (x \mid \theta ) \ d \mu (x) $ and $ \pi (d \theta ) = \pi ( \theta ) \ d \nu ( \theta ) $, where $ \mu $ and $ \nu $ are certain $\sigma$-finite measures. One then finds, assuming that the order of integration may be changed (Fubini's theorem),
$$ \int\limits _ \Theta \int\limits _ { X } L ( \theta ,\ \delta (x)) \ {\mathsf P} _ \theta (dx ) \ \pi ( d \theta )\ = $$
$$ = \ \int\limits _ \Theta \int\limits _ { X } L ( \theta ,\ \delta (x)) p ( x \mid \theta ) \pi ( \theta ) \ d \mu (x) \ d \nu ( \theta )\ = $$
$$ = \ \int\limits _ { X } \ d \mu (x) \left [ \int\limits _ \Theta L ( \theta ,\ \delta (x)) p (x \mid \theta ) \pi ( \theta ) \ d \nu ( \theta ) \right ] . $$
It is seen from the above that the double integral is minimized by minimizing the expression in square brackets for each $ x $ separately (the measure $ \mu $ being non-negative): for a given $ x \in X $, $ \delta ^ {*} (x) $ is the value $ d ^ {*} $ of $ d $ at which
$$ \inf _ { d } \ \int\limits _ \Theta L ( \theta ,\ d) p (x \mid \theta ) \pi ( \theta ) \ d \nu ( \theta ) $$
is attained, or, what is equivalent, for which
$$ \inf _ { d } \ \int\limits _ \Theta L ( \theta ,\ d) \frac{p (x \mid \theta ) \pi ( \theta ) }{p (x) } \ d \nu ( \theta ) $$
is attained, where
$$ p (x) \ = \ \int\limits _ \Theta p (x \mid \theta ) \pi ( \theta ) \ d \nu ( \theta ) . $$
But, according to the Bayes formula,
$$ \int\limits _ \Theta L( \theta ,\ d) \frac{p (x \mid \theta ) \pi ( \theta ) }{p (x) } \ d \nu ( \theta ) \ = \ {\mathsf E} [L ( \theta ,\ d) \mid x]. $$
Thus, for a given $ x $, $ \delta ^ {*} (x) $ is the value $ d ^ {*} $ of $ d $ at which the conditional expected loss $ {\mathsf E} [L ( \theta ,\ d) \mid x] $ attains its minimum.
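When $ \Theta $ and $ D $ are finite, this pointwise minimization is straightforward to carry out numerically. The following is a minimal sketch (the prior, likelihood and loss values are hypothetical toy numbers chosen only for illustration, not data from this article):

<pre>
import numpy as np

# Minimal sketch: Bayes decision rule for finite Theta = {theta_1, theta_2},
# D = {d_1, d_2} and a two-point sample space X = {0, 1}.
# All numbers below are hypothetical toy values.

prior = np.array([0.7, 0.3])            # pi(theta_1), pi(theta_2)
likelihood = np.array([[0.8, 0.2],      # p(x | theta_1) for x = 0, 1
                       [0.4, 0.6]])     # p(x | theta_2) for x = 0, 1
loss = np.array([[0.0, 1.0],            # L(theta_i, d_j): rows theta, columns d
                 [3.0, 0.0]])

def bayes_decision(x):
    # Posterior pi(theta | x) by the Bayes formula.
    joint = prior * likelihood[:, x]      # pi(theta) * p(x | theta)
    posterior = joint / joint.sum()       # division by the marginal p(x)
    # Conditional expected loss E[L(theta, d) | x] for each decision d.
    expected_loss = posterior @ loss
    return int(np.argmin(expected_loss))  # index of delta*(x)

for x in (0, 1):
    print("x =", x, "->  d_%d" % (bayes_decision(x) + 1))
</pre>

Since $ p (x) $ enters only as a common positive factor, the division by the marginal could even be omitted without changing the minimizing decision, in accordance with the equivalence of the two infima above.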
Example. (The Bayesian approach applied to the problem of distinguishing between two simple hypotheses.) Let $ \Theta = \{ \theta _ {1} ,\ \theta _ {2} \} $, $ D = \{ d _ {1} ,\ d _ {2} \} $, $ L _ {ij} = L ( \theta _ {i} ,\ d _ {j} ) $, $ i,\ j = 1,\ 2 $; $ \pi ( \theta _ {1} ) = \pi _ {1} $, $ \pi ( \theta _ {2} ) = \pi _ {2} $, $ \pi _ {1} + \pi _ {2} = 1 $. If the decision $ d _ {i} $ is identified with the acceptance of the hypothesis $ H _ {i} $: $ \theta = \theta _ {i} $, it is natural to assume that $ L _ {11} < L _ {12} $, $ L _ {22} < L _ {21} $. Then
$$ \rho ( \pi ,\ \delta ) \ = \ \int\limits _ { X } [ \pi _ {1} p (x \mid \theta _ {1} ) L ( \theta _ {1} ,\ \delta (x)) + \pi _ {2} p (x \mid \theta _ {2} ) L ( \theta _ {2} ,\ \delta (x))] \ d \mu (x) $$
implies that $ \inf _ \delta \ \rho ( \pi ,\ \delta ) $ is attained by the function
$$ \delta ^ {*} (x) \ = \ \left \{ \begin{array}{l} d _ {1} ,\ \ \textrm{ if } \ \frac{p (x \mid \theta _ {2} ) }{p (x \mid \theta _ {1} ) } \ \leq \ \frac{\pi _ {1} }{\pi _ {2} } \ \frac{L _ {12} - L _ {11} }{L _ {21} - L _ {22} } , \\ d _ {2} ,\ \ \textrm{ if } \ \frac{p (x \mid \theta _ {2} ) }{p (x \mid \theta _ {1} ) } \ \geq \ \frac{\pi _ {1} }{\pi _ {2} } \ \frac{L _ {12} - L _ {11} }{L _ {21} - L _ {22} } . \\ \end{array} \right . $$
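In particular, for the simple loss function $ L _ {11} = L _ {22} = 0 $, $ L _ {12} = L _ {21} = 1 $ and equal prior probabilities $ \pi _ {1} = \pi _ {2} = 1/2 $ (a standard special case, given here for illustration), the threshold on the right-hand side equals one, and $ \delta ^ {*} $ reduces to the maximum-likelihood rule: $ d _ {1} $ is accepted if $ p (x \mid \theta _ {1} ) \geq p (x \mid \theta _ {2} ) $, and $ d _ {2} $ otherwise.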
The advantage of the Bayesian approach consists in the fact that, unlike the losses $ \rho ( \theta ,\ \delta ) $, the expected losses $ \rho ( \pi ,\ \delta ) $ are numbers which do not depend on the unknown parameter $ \theta $; consequently, solutions $ \delta _ \epsilon ^ {*} $ for which
$$ \rho ( \pi ,\ \delta _ \epsilon ^ {*} ) \ \leq \ \inf _ \delta \ \rho ( \pi ,\ \delta ) + \epsilon , $$
and which are, if not optimal, at least $\epsilon$-optimal $ ( \epsilon > 0) $, are certain to exist. The disadvantage of the Bayesian approach is the necessity of postulating both the existence of an a priori distribution of the unknown parameter and its precise form (the latter disadvantage may be overcome to a certain extent by adopting an empirical Bayesian approach, cf. Bayesian approach, empirical).