Statistical test

A decision rule according to which a decision is taken in the problem of statistical hypotheses testing (cf. Statistical hypotheses, verification of) on the basis of the results of observations.

Assume that the hypothesis $H_0$: $\theta \in \Theta_0 \subset \Theta$ has to be tested against the alternative $H_1$: $\theta \in \Theta_1 = \Theta \setminus \Theta_0$ by means of the realization $x = (x_1, \ldots, x_n)$ of a random vector $X = (X_1, \ldots, X_n)$ that takes values in a sample space $(\mathfrak{X}_n, \mathcal{B}_n, \mathsf{P}_\theta^n)$, $\theta \in \Theta$. Furthermore, let $\phi_n(\cdot)$ be an arbitrary $\mathcal{B}_n$-measurable function mapping the sample space $\mathfrak{X}_n$ into the interval $[0, 1]$. In this case, the rule according to which $H_0$ is rejected with probability $\phi_n(X)$, while the alternative $H_1$ is rejected with probability $1 - \phi_n(X)$, is called a statistical test for testing $H_0$ against $H_1$; $\phi_n(\cdot)$ is called the critical function of the test. The function $\beta_n(\theta) = \mathsf{E}_\theta \phi_n(X)$, $\theta \in \Theta$, is called the power function of the test.
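For example, let $X_1, \ldots, X_n$ be independent and normally distributed with unknown mean $\theta$ and unit variance, and let $H_0$: $\theta \le 0$ be tested against $H_1$: $\theta > 0$. The non-randomized critical function $\phi_n(x) = 1$ if $\sqrt{n}\,\bar{x} > c$ and $\phi_n(x) = 0$ otherwise, where $\bar{x} = (x_1 + \cdots + x_n)/n$, has power function

$$ \beta_n(\theta) = \mathsf{P}_\theta \{ \sqrt{n}\,\overline{X} > c \} = 1 - \Phi(c - \sqrt{n}\,\theta), $$

where $\Phi$ is the standard normal distribution function; this power function is increasing in $\theta$.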

The use of a statistical test leads either to a correct decision being taken, or to one of the following two errors being made: rejection of $H_0$, and thus acceptance of $H_1$, when in fact $H_0$ is correct (an error of the first kind), or acceptance of $H_0$ when in fact $H_1$ is correct (an error of the second kind). One of the basic problems in the classical theory of statistical hypotheses testing is the construction of a test that, given a fixed upper bound $\alpha = \sup_{\theta \in \Theta_0} \beta_n(\theta)$, $0 < \alpha < 1$, for the probability of an error of the first kind, minimizes the probability of an error of the second kind. The number $\alpha$ is called the significance level of the statistical test.
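In the example above, $\beta_n(\theta)$ is increasing in $\theta$, so that

$$ \sup_{\theta \le 0} \beta_n(\theta) = \beta_n(0) = 1 - \Phi(c), $$

and the choice $c = \Phi^{-1}(1 - \alpha)$ yields a test with significance level $\alpha$. The probability of an error of the second kind at a given $\theta > 0$ is then $1 - \beta_n(\theta) = \Phi(\Phi^{-1}(1 - \alpha) - \sqrt{n}\,\theta)$, which tends to zero as $n \rightarrow \infty$.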

In practice, the most important statistical tests are the non-randomized ones, i.e. those whose critical function $\phi_n(\cdot)$ is the indicator function of a certain $\mathcal{B}_n$-measurable set $K$ in $\mathfrak{X}_n$:

$$ \phi_n(x) = \begin{cases} 1 & \text{ if } x \in K, \\ 0 & \text{ if } x \in \overline{K} = \mathfrak{X}_n \setminus K. \end{cases} $$

Thus, a non-randomized statistical test rejects $H_0$ if the event $\{X \in K\}$ takes place; on the other hand, if the event $\{X \in \overline{K}\}$ takes place, then $H_0$ is accepted. The set $K$ is called the critical region of the statistical test.
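Randomized tests, for which $\phi_n$ takes values strictly between $0$ and $1$, arise mainly for discrete distributions, where a prescribed level $\alpha$ cannot, in general, be attained exactly by any non-randomized test (cf. [1]). For example, if $X$ has the binomial distribution with parameters $(n, \theta)$ and $H_0$: $\theta = \theta_0$ is tested against $H_1$: $\theta > \theta_0$, one takes $\phi_n(x) = 1$ for $x > k$, $\phi_n(x) = \gamma$ for $x = k$, and $\phi_n(x) = 0$ for $x < k$, where the integer $k$ and the constant $\gamma \in [0, 1)$ are determined by

$$ \mathsf{P}_{\theta_0} \{ X > k \} + \gamma \mathsf{P}_{\theta_0} \{ X = k \} = \alpha, $$

so that the test has significance level exactly $\alpha$.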

As a rule, a non-randomized statistical test is based on a certain statistic $T_n = T_n(X)$, called the test statistic, and the critical region $K$ of the test is usually defined by relations of the form $K = \{x : T_n(x) < t_1\}$, $K = \{x : T_n(x) > t_2\}$, or $K = \{x : T_n(x) < t_1\} \cup \{x : T_n(x) > t_2\}$. The constants $t_1$, $t_2$, called the critical values of the test statistic $T_n$, are determined from the condition $\alpha = \sup_{\theta \in \Theta_0} \beta_n(\theta)$; in the first two cases one speaks of one-sided statistical tests, and in the third case of a two-sided statistical test. The structure of $T_n$ reflects the particular nature of the competing hypotheses $H_0$ and $H_1$. In the case where the family $\{\mathsf{P}_\theta^n : \theta \in \Theta\}$ possesses a sufficient statistic $\Psi = \Psi(X)$, it is natural to look for the test statistic in the class of sufficient statistics, since

$$ \beta_n(\theta) = \mathsf{E}_\theta \phi_n(X) = \mathsf{E}_\theta T_n(X) $$

for all $\theta \in \Theta = \Theta_0 \cup \Theta_1$, where $T_n(X) = \mathsf{E} \{ \phi_n(X) \mid \Psi \}$.
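For instance, in the normal example above the statistic $\Psi(X) = \overline{X}$ is sufficient, and the test statistic $T_n(X) = \sqrt{n}\,\overline{X}$ is a function of it. Testing $H_0$: $\theta = 0$ against the two-sided alternative $H_1$: $\theta \neq 0$ by means of this statistic leads to the two-sided test with critical region

$$ K = \{x : T_n(x) < -c\} \cup \{x : T_n(x) > c\}, \quad c = \Phi^{-1}(1 - \alpha/2), $$

whose significance level is $\beta_n(0) = \mathsf{P}_0 \{ |T_n(X)| > c \} = \alpha$. More generally, the displayed identity follows from the formula of iterated expectation, $\mathsf{E}_\theta \phi_n(X) = \mathsf{E}_\theta \mathsf{E} \{ \phi_n(X) \mid \Psi \}$, together with the fact that, by sufficiency, the conditional expectation $\mathsf{E} \{ \phi_n(X) \mid \Psi \}$ does not depend on $\theta$; hence any test can be replaced by a test based on $\Psi$ with the same power function.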

References

[1] E.L. Lehmann, "Testing statistical hypotheses", Wiley (1986)
[2] J. Hájek, Z. Sidák, "Theory of rank tests", Acad. Press (1967)
[3] H. Cramér, "Mathematical methods of statistics", Princeton Univ. Press (1946)
[4] B.L. van der Waerden, "Mathematische Statistik", Springer (1957)
[5] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics", Libr. math. tables, 46, Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[6] M.S. Nikulin, "A result of Bol'shev's from the theory of the statistical testing of hypotheses", J. Soviet Math., 44: 3 (1989) pp. 522–529; Zap. Nauchn. Sem. Mat. Inst. Steklov., 153 (1986) pp. 129–137