Difference between revisions of "Statistical test"

Revision as of 14:55, 7 June 2020

A decision rule by which, on the basis of the results of observations, a decision is taken in a problem of statistical hypothesis testing (cf. Statistical hypotheses, verification of).

Assume that the hypothesis $ H _ {0} $: $ \theta \in \Theta _ {0} \subset \Theta $ has to be tested against the alternative $ H _ {1} $: $ \theta \in \Theta _ {1} = \Theta \setminus \Theta _ {0} $ by means of the realization $ x = ( x _ {1} , \dots, x _ {n} ) $ of a random vector $ X = ( X _ {1} , \dots, X _ {n} ) $ that takes values in a sample space $ ( \mathfrak X _ {n} , {\mathcal B} _ {n} , {\mathsf P} _ \theta ^ {n} ) $, $ \theta \in \Theta $. Furthermore, let $ \phi _ {n} ( \cdot ) $ be an arbitrary $ {\mathcal B} _ {n} $-measurable function mapping the sample space $ \mathfrak X _ {n} $ into the interval $ [ 0, 1] $. The rule according to which $ H _ {0} $ is rejected with probability $ \phi _ {n} ( X) $, while the alternative $ H _ {1} $ is rejected with probability $ 1 - \phi _ {n} ( X) $, is called a statistical test for testing $ H _ {0} $ against $ H _ {1} $; $ \phi _ {n} ( \cdot ) $ is the critical function of the test. The function $ \beta _ {n} ( \theta ) = {\mathsf E} _ \theta \phi _ {n} ( X) $, $ \theta \in \Theta $, is called the power function of the test.
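As a minimal illustration of a critical function and its power function, the following Python sketch assumes (this is not part of the article) that $ X _ {1} , \dots, X _ {n} $ are independent Bernoulli variables with success probability $ \theta $ and that the critical function depends on $ x $ only through the number of successes $ s = x _ {1} + \dots + x _ {n} $. The names `binom_pmf`, `critical_function` and `power`, and the parameters `c` and `gamma`, are hypothetical.

```python
from math import comb


def binom_pmf(k, n, theta):
    """P_theta(S = k) for S = X_1 + ... + X_n with independent Bernoulli(theta) terms."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def critical_function(s, c, gamma):
    """Assumed randomized critical function, expressed through the success count s.

    H0 is rejected with probability 1 if s > c, with probability gamma
    if s == c, and with probability 0 if s < c.
    """
    if s > c:
        return 1.0
    if s == c:
        return gamma
    return 0.0


def power(theta, n, c, gamma):
    """Power function: the expectation of the critical function under P_theta."""
    return sum(
        critical_function(k, c, gamma) * binom_pmf(k, n, theta)
        for k in range(n + 1)
    )
```

For instance, `power(0.5, 10, 7, 0.5)` evaluates the power function at $ \theta = 0.5 $ for $ n = 10 $, and gives approximately 0.1133.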

Using a statistical test leads either to a correct decision or to one of two errors: rejecting $ H _ {0} $, and thus accepting $ H _ {1} $, when in fact $ H _ {0} $ is true (an error of the first kind), or accepting $ H _ {0} $ when in fact $ H _ {1} $ is true (an error of the second kind). One of the basic problems in the classical theory of statistical hypothesis testing is to construct a test that, for a prescribed upper bound $ \alpha = \sup _ {\theta \in \Theta _ {0} } \beta _ {n} ( \theta ) $, $ 0 < \alpha < 1 $, on the probability of an error of the first kind, minimizes the probability of an error of the second kind. The number $ \alpha $ is called the significance level of the statistical test.
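The role of the significance level can be sketched in Python under assumptions that are not in the article: independent Bernoulli observations, the one-sided hypothesis $ \theta \le \theta _ {0} $ against $ \theta > \theta _ {0} $, and a test that rejects for a large number of successes. For a test of this form the power function is non-decreasing in $ \theta $, so the supremum over $ \Theta _ {0} $ is attained at $ \theta _ {0} $. The function `size_alpha_test` and its return values `c` and `gamma` are hypothetical names.

```python
from math import comb


def binom_pmf(k, n, theta):
    """P_theta(S = k) for the number of successes S in n Bernoulli(theta) trials."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def power(theta, n, c, gamma):
    """Power of the test rejecting with prob. 1 if S > c and with prob. gamma if S == c."""
    tail = sum(binom_pmf(k, n, theta) for k in range(c + 1, n + 1))
    return tail + gamma * binom_pmf(c, n, theta)


def size_alpha_test(n, theta0, alpha):
    """Find (c, gamma) such that the probability of a first-kind error at theta0 equals alpha."""
    tail = 0.0  # running value of P_theta0(S > c)
    for c in range(n, -1, -1):
        p_c = binom_pmf(c, n, theta0)
        if tail + p_c > alpha:
            # Rejecting on all of {S >= c} would exceed alpha, so randomize on {S == c}.
            return c, (alpha - tail) / p_c
        tail += p_c
    raise ValueError("alpha must satisfy 0 < alpha < 1")


c, gamma = size_alpha_test(n=10, theta0=0.5, alpha=0.05)
first_kind = power(0.5, 10, c, gamma)         # error of the first kind at theta0
second_kind = 1.0 - power(0.7, 10, c, gamma)  # error of the second kind at theta = 0.7
```

With these inputs the search stops at `c = 8`, and `first_kind` equals the prescribed level 0.05 up to rounding; `second_kind` is the probability of accepting $ H _ {0} $ at the particular alternative $ \theta = 0.7 $.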

In practice, the most important tests are the non-randomized ones, i.e. those whose critical function $ \phi _ {n} ( \cdot ) $ is the indicator function of a certain $ {\mathcal B} _ {n} $-measurable set $ K $ in $ \mathfrak X _ {n} $:

$$ \phi _ {n} ( x) = \left \{ \begin{array}{ll} 1 & \textrm{ if } x \in K, \\ 0 & \textrm{ if } x \in \overline{K}\; = \mathfrak X _ {n} \setminus K. \\ \end{array} \right .$$

Thus, a non-randomized statistical test rejects $ H _ {0} $ if the event $ \{ X \in K \} $ takes place; on the other hand, if the event $ \{ X \in \overline{K}\; \} $ takes place, then $ H _ {0} $ is accepted. The set $ K $ is called the critical region of the statistical test.
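A non-randomized test can be sketched by writing the critical region as a membership predicate and the critical function as its indicator. The Python example below assumes, purely for illustration, a sample space of binary vectors of length $ n $ with independent Bernoulli coordinates and a critical region of the form $ K = \{ x : x _ {1} + \dots + x _ {n} > c \} $; the names `in_K`, `phi` and `power` are hypothetical.

```python
from itertools import product


def prob(x, theta):
    """P_theta(X = x) for a binary vector x with independent Bernoulli(theta) coordinates."""
    s = sum(x)
    return theta**s * (1 - theta) ** (len(x) - s)


def in_K(x, c):
    """Membership in the assumed critical region K = {x : x_1 + ... + x_n > c}."""
    return sum(x) > c


def phi(x, c):
    """Non-randomized critical function: the indicator function of K."""
    return 1.0 if in_K(x, c) else 0.0


def power(theta, n, c):
    """E_theta phi(X) = P_theta(X in K), summed over the whole sample space."""
    return sum(phi(x, c) * prob(x, theta) for x in product((0, 1), repeat=n))
```

For $ n = 10 $ and $ \theta = 0.5 $, `power(0.5, 10, 8)` is 11/1024 (about 0.0107) and `power(0.5, 10, 7)` is 56/1024 (about 0.0547), so in this discrete setting no critical region of the assumed form has rejection probability exactly 0.05 at $ \theta = 0.5 $.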

As a rule, a non-randomized statistical test is based on a certain statistic $ T _ {n} = T _ {n} ( X) $, called the test statistic, and the critical region $ K $ of the test is usually defined by a relation of one of the forms $ K = \{ {x } : {T _ {n} ( x) < t _ {1} } \} $, $ K = \{ {x } : {T _ {n} ( x) > t _ {2} } \} $, $ K = \{ {x } : {T _ {n} ( x) < t _ {1} } \} \cup \{ {x } : {T _ {n} ( x) > t _ {2} } \} $. The constants $ t _ {1} $, $ t _ {2} $, called the critical values of the test statistic $ T _ {n} $, are determined by the condition $ \alpha = \sup _ {\theta \in \Theta _ {0} } \beta _ {n} ( \theta ) $; the first two cases are called one-sided statistical tests, and the third a two-sided statistical test. The structure of $ T _ {n} $ reflects the particular nature of the competing hypotheses $ H _ {0} $ and $ H _ {1} $. If the family $ \{ { {\mathsf P} _ \theta ^ {n} } : {\theta \in \Theta } \} $ possesses a sufficient statistic $ \Psi = \Psi ( X) $, it is natural to look for the test statistic in the class of sufficient statistics, since

$$ \beta _ {n} ( \theta ) = {\mathsf E} _ \theta \phi _ {n} ( X) = {\mathsf E} _ \theta T _ {n} ( X) $$

for all $ \theta \in \Theta = \Theta _ {0} \cup \Theta _ {1} $, where $ T _ {n} ( X) = {\mathsf E} \{ \phi _ {n} ( X) \mid \Psi \} $.
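The identity above can be checked numerically in a small case. The Python sketch below assumes, for illustration only, independent Bernoulli observations, for which the number of successes $ \Psi ( X) = X _ {1} + \dots + X _ {n} $ is a sufficient statistic, and a deliberately crude critical function that rejects $ H _ {0} $ exactly when the first observation equals 1. The names `phi`, `conditional_phi`, `power_phi` and `power_T` are hypothetical.

```python
from itertools import product


def prob(x, theta):
    """P_theta(X = x) for a binary vector x with independent Bernoulli(theta) coordinates."""
    s = sum(x)
    return theta**s * (1 - theta) ** (len(x) - s)


def phi(x):
    """Assumed crude critical function: reject H0 iff the first observation is 1."""
    return 1.0 if x[0] == 1 else 0.0


def conditional_phi(s, n):
    """E{phi(X) | Psi = s}.

    Given Psi = s, every binary vector with s ones is equally likely
    whatever theta is, so the conditional expectation is a plain average.
    """
    points = [x for x in product((0, 1), repeat=n) if sum(x) == s]
    return sum(phi(x) for x in points) / len(points)


def power_phi(theta, n):
    """E_theta phi(X), computed over the whole sample space."""
    return sum(phi(x) * prob(x, theta) for x in product((0, 1), repeat=n))


def power_T(theta, n):
    """E_theta T(X), where T(X) = E{phi(X) | Psi} depends on X only through Psi."""
    return sum(
        conditional_phi(sum(x), n) * prob(x, theta)
        for x in product((0, 1), repeat=n)
    )
```

In this example `conditional_phi(s, n)` equals `s / n`, and both `power_phi(theta, n)` and `power_T(theta, n)` equal `theta` up to rounding, so the test based on the sufficient statistic alone has the same power function as the original one.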

References

[1]	E.L. Lehmann, "Testing statistical hypotheses" , Wiley (1986)
[2]	J. Hájek, Z. Šidák, "Theory of rank tests" , Acad. Press (1967)
[3]	H. Cramér, "Mathematical methods of statistics" , Princeton Univ. Press (1946)
[4]	B.L. van der Waerden, "Mathematische Statistik" , Springer (1957)
[5]	L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics" , Libr. math. tables , 46 , Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[6]	M.S. Nikulin, "A result of Bol'shev's from the theory of the statistical testing of hypotheses" J. Soviet Math. , 44 : 3 (1989) pp. 522–529 Zap. Nauchn. Sem. Mat. Inst. Steklov. , 153 (1986) pp. 129–137

How to Cite This Entry:
Statistical test. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Statistical_test&oldid=49603

This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.

