Difference between revisions of "Statistical test"
Ulf Rehmann (talk | contribs) m (Undo revision 48821 by Ulf Rehmann (talk)) Tag: Undo |
Ulf Rehmann (talk | contribs) m (tex encoded by computer) |
Revision as of 14:55, 7 June 2020
A decision rule by which, on the basis of the results of observations, a decision is taken in a problem of statistical hypothesis testing (cf. Statistical hypotheses, verification of).
Assume that the hypothesis $ H _ {0} $: $ \theta \in \Theta _ {0} \subset \Theta $ has to be tested against the alternative $ H _ {1} $: $ \theta \in \Theta _ {1} = \Theta \setminus \Theta _ {0} $ by means of the realization $ x = ( x _ {1} , \dots, x _ {n} ) $ of a random vector $ X = ( X _ {1} , \dots, X _ {n} ) $ that takes values in a sample space $ ( \mathfrak X _ {n} , {\mathcal B} _ {n} , {\mathsf P} _ \theta ^ {n} ) $, $ \theta \in \Theta $. Furthermore, let $ \phi _ {n} ( \cdot ) $ be an arbitrary $ {\mathcal B} _ {n} $-measurable function mapping the sample space $ \mathfrak X _ {n} $ into the interval $ [ 0, 1] $. The rule according to which $ H _ {0} $ is rejected with probability $ \phi _ {n} ( X) $, while the alternative $ H _ {1} $ is rejected with probability $ 1 - \phi _ {n} ( X) $, is called a statistical test for testing $ H _ {0} $ against $ H _ {1} $; $ \phi _ {n} ( \cdot ) $ is called the critical function of the test. The function $ \beta _ {n} ( \theta ) = {\mathsf E} _ \theta \phi _ {n} ( X) $, $ \theta \in \Theta $, is called the power function of the test.
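As an illustration of a critical function and its power function, the following minimal sketch treats a randomized test in a binomial model. The model, the threshold `c`, the randomization probability `gamma` and all function names are assumptions made for this example and do not come from the article: $ X $ is taken to be the number of successes in $ n $ Bernoulli trials with success probability $ \theta $, and $ H _ {0} $ is rejected for certain when $ X > c $ and with probability `gamma` when $ X = c $.

```python
from math import comb


def binom_pmf(k, n, theta):
    """P_theta(X = k) for X ~ Binomial(n, theta) (assumed model)."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def critical_function(k, c, gamma):
    """phi(k): probability of rejecting H0 when X = k is observed."""
    if k > c:
        return 1.0
    if k == c:
        return gamma  # randomize on the boundary point
    return 0.0


def power(theta, n, c, gamma):
    """Power function: E_theta[phi(X)], summed over the sample space."""
    return sum(
        critical_function(k, c, gamma) * binom_pmf(k, n, theta)
        for k in range(n + 1)
    )


# Hypothetical numbers: n = 10 trials, boundary c = 8, gamma = 0.5.
example_power = power(0.5, 10, 8, 0.5)
```

Since the critical function is non-decreasing in the number of successes, the power computed this way increases with `theta`, which is the behaviour one wants when large values of $ X $ speak against $ H _ {0} $.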
The use of a statistical test leads either to a correct decision, or to one of the following two errors: rejection of $ H _ {0} $, and thus acceptance of $ H _ {1} $, when in fact $ H _ {0} $ is correct (an error of the first kind), or acceptance of $ H _ {0} $ when in fact $ H _ {1} $ is correct (an error of the second kind). One of the basic problems in the classical theory of statistical hypothesis testing is the construction of a test that, for a prescribed upper bound $ \alpha = \sup _ {\theta \in \Theta _ {0} } \beta _ {n} ( \theta ) $, $ 0 < \alpha < 1 $, on the probability of an error of the first kind, minimizes the probability of an error of the second kind. The number $ \alpha $ is called the significance level of the statistical test.
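The two error probabilities can be made concrete in a simple case. The sketch below assumes, purely for illustration, independent observations $ X _ {i} \sim N ( \theta , 1) $, the hypothesis $ H _ {0} $: $ \theta \leq 0 $, and a test that rejects when $ \sqrt n \, \overline{X} > t $; the grid of null values and the function names are hypothetical. The supremum over $ \Theta _ {0} $ is approximated by a maximum over the grid, which here is attained at the boundary point $ \theta = 0 $.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def power(theta, n, t):
    """P_theta(sqrt(n) * mean(X) > t) when X_i ~ N(theta, 1) (assumed model)."""
    return 1.0 - STD_NORMAL.cdf(t - sqrt(n) * theta)


def size(n, t, null_grid):
    """Largest first-kind error probability over a grid of null values."""
    return max(power(theta, n, t) for theta in null_grid)


def second_kind_error(theta, n, t):
    """Probability of accepting H0 when the true value theta lies in Theta_1."""
    return 1.0 - power(theta, n, t)


n = 25
t = STD_NORMAL.inv_cdf(0.95)         # threshold chosen so that the size is 0.05
null_grid = [-1.0, -0.5, -0.1, 0.0]  # hypothetical grid inside Theta_0
alpha = size(n, t, null_grid)
miss = second_kind_error(0.5, n, t)  # second-kind error at theta = 0.5
```

Raising `t` lowers `alpha` but raises `miss`, which is why the classical problem fixes the bound on the first error and then minimizes the second.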
In practice, the most important tests are the non-randomized ones, i.e. those whose critical function $ \phi _ {n} ( \cdot ) $ is the indicator function of a certain $ {\mathcal B} _ {n} $-measurable set $ K $ in $ \mathfrak X _ {n} $:
$$ \phi _ {n} ( x) = \left \{ \begin{array}{ll} 1 & \textrm{ if } x \in K, \\ 0 & \textrm{ if } x \in \overline{K}\; = \mathfrak X _ {n} \setminus K. \\ \end{array} \right .$$
Thus, a non-randomized statistical test rejects $ H _ {0} $ if the event $ \{ X \in K \} $ takes place; on the other hand, if the event $ \{ X \in \overline{K}\; \} $ takes place, then $ H _ {0} $ is accepted. The set $ K $ is called the critical region of the statistical test.
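A non-randomized test can be sketched directly as an indicator function. The example below assumes a toy sample space $ \{ 0, 1 \} ^ {n} $ of independent Bernoulli trials and a hypothetical critical region $ K = \{ x : x _ {1} + \dots + x _ {n} \geq c \} $; neither is taken from the article. Because the sample space is finite, the probability of the event $ \{ X \in K \} $ can be obtained by summing over all sample points.

```python
from itertools import product


def in_critical_region(x, c):
    """Membership in the hypothetical region K = {x : sum(x) >= c}."""
    return sum(x) >= c


def critical_function(x, c):
    """Indicator function of K: 1 means reject H0, 0 means accept H0."""
    return 1 if in_critical_region(x, c) else 0


def rejection_probability(theta, n, c):
    """P_theta(X in K), computed by enumerating the sample space {0, 1}^n."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        prob = 1.0
        for xi in x:
            prob *= theta if xi == 1 else 1.0 - theta
        total += critical_function(x, c) * prob
    return total


decision = critical_function((1, 1, 0), 2)  # this sample point lies in K
```

For an indicator critical function the power function is simply the probability of the critical region, which is what `rejection_probability` computes.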
As a rule, a non-randomized statistical test is based on a certain statistic $ T _ {n} = T _ {n} ( X) $, called the test statistic, and the critical region $ K $ of the test is usually given in one of the forms $ K = \{ {x } : {T _ {n} ( x) < t _ {1} } \} $, $ K = \{ {x } : {T _ {n} ( x) > t _ {2} } \} $, $ K = \{ {x } : {T _ {n} ( x) < t _ {1} } \} \cup \{ {x } : {T _ {n} ( x) > t _ {2} } \} $. The constants $ t _ {1} $, $ t _ {2} $, called the critical values of the test statistic $ T _ {n} $, are determined by the condition $ \alpha = \sup _ {\theta \in \Theta _ {0} } \beta _ {n} ( \theta ) $; in the first two cases one speaks of one-sided statistical tests, and in the third case of a two-sided statistical test. The structure of $ T _ {n} $ reflects the particular nature of the competing hypotheses $ H _ {0} $ and $ H _ {1} $. If the family $ \{ { {\mathsf P} _ \theta ^ {n} } : {\theta \in \Theta } \} $ possesses a sufficient statistic $ \Psi = \Psi ( X) $, it is natural to look for the test statistic in the class of sufficient statistics, since
$$ \beta _ {n} ( \theta ) = {\mathsf E} _ \theta \phi _ {n} ( X) = {\mathsf E} _ \theta T _ {n} ( X) $$
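for all $ \theta \in \Theta = \Theta _ {0} \cup \Theta _ {1} $, where $ T _ {n} ( X) = {\mathsf E} \{ \phi _ {n} ( X) \mid \Psi \} $. By the sufficiency of $ \Psi $ this conditional expectation does not depend on $ \theta $, so $ T _ {n} $ is itself a critical function that depends on $ X $ only through $ \Psi $ and has the same power function as $ \phi _ {n} $.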
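The one-sided and two-sided critical regions described above can be sketched numerically. The example assumes, for illustration only, that the test statistic $ T _ {n} $ has a standard normal distribution at the boundary of $ H _ {0} $ where the supremum of the power function is attained (for instance $ T _ {n} = \sqrt n \, \overline{X} $ for $ N ( \theta , 1) $ observations). Splitting $ \alpha $ equally between the two tails in the two-sided case is a common convention, not something the article prescribes, and the function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed null distribution of the test statistic


def critical_values(alpha, kind):
    """Return (t1, t2); None marks a tail that is not used.

    kind = "lower":     K = {T < t1}
    kind = "upper":     K = {T > t2}
    kind = "two-sided": K = {T < t1} united with {T > t2}, alpha/2 per tail
    """
    if kind == "lower":
        return STD_NORMAL.inv_cdf(alpha), None
    if kind == "upper":
        return None, STD_NORMAL.inv_cdf(1.0 - alpha)
    if kind == "two-sided":
        half = alpha / 2.0
        return STD_NORMAL.inv_cdf(half), STD_NORMAL.inv_cdf(1.0 - half)
    raise ValueError(f"unknown kind: {kind!r}")


def rejects(t_obs, t1, t2):
    """True if the observed value of the statistic falls in the region K."""
    below = t1 is not None and t_obs < t1
    above = t2 is not None and t_obs > t2
    return below or above


t1, t2 = critical_values(0.05, "two-sided")
reject_large = rejects(2.5, t1, t2)   # far in the upper tail
reject_small = rejects(0.3, t1, t2)   # near the centre
```

For a null distribution that is not symmetric, or not continuous, the same condition on $ \alpha $ would have to be solved with the appropriate quantiles, and exact attainment of $ \alpha $ may require randomization.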
Statistical test. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Statistical_test&oldid=49603