Statistical test
A decision rule by which a decision is taken in the problem of statistical hypothesis testing (cf. Statistical hypotheses, verification of) on the basis of the results of observations.
Assume that the hypothesis $ H _ {0} $: $ \theta \in \Theta _ {0} \subset \Theta $ has to be tested against the alternative $ H _ {1} $: $ \theta \in \Theta _ {1} = \Theta \setminus \Theta _ {0} $ by means of a realization $ x = ( x _ {1} , \dots, x _ {n} ) $ of a random vector $ X = ( X _ {1} , \dots, X _ {n} ) $ that takes values in a sample space $ ( \mathfrak X _ {n} , {\mathcal B} _ {n} , {\mathsf P} _ \theta ^ {n} ) $, $ \theta \in \Theta $. Furthermore, let $ \phi _ {n} ( \cdot ) $ be an arbitrary $ {\mathcal B} _ {n} $-measurable function mapping the sample space $ \mathfrak X _ {n} $ into the interval $ [ 0, 1] $. The rule according to which $ H _ {0} $ is rejected with probability $ \phi _ {n} ( X) $, while the alternative $ H _ {1} $ is rejected with probability $ 1 - \phi _ {n} ( X) $, is called a statistical test for testing $ H _ {0} $ against $ H _ {1} $; $ \phi _ {n} ( \cdot ) $ is called the critical function of the test. The function $ \beta _ {n} ( \theta ) = {\mathsf E} _ \theta \phi _ {n} ( X) $, $ \theta \in \Theta $, is called the power function of the test.
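To make these definitions concrete, the following sketch (a Python illustration added here, not part of the original article) implements a randomized test in an assumed $ N( \theta , 1) $ model: the critical function $ \phi _ {n} $ takes values in $ [ 0, 1] $, $ H _ {0} $ is rejected with probability $ \phi _ {n} ( X) $, and the power function $ \beta _ {n} ( \theta ) = {\mathsf E} _ \theta \phi _ {n} ( X) $ is estimated by Monte Carlo. The particular smooth form of $ \phi _ {n} $ is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi_n(x):
    """Critical function: probability of rejecting H_0 given the sample x."""
    t = np.sqrt(len(x)) * np.mean(x)                 # test statistic sqrt(n) * sample mean
    return 1.0 / (1.0 + np.exp(-4.0 * (t - 1.645)))  # smooth value in [0, 1] (illustrative)

def decide(x):
    """Randomized decision: reject H_0 with probability phi_n(x)."""
    return "reject H_0" if rng.random() < phi_n(x) else "accept H_0"

def power(theta, n=20, n_mc=20_000):
    """Monte Carlo estimate of the power function beta_n(theta) = E_theta[phi_n(X)]."""
    samples = rng.normal(loc=theta, scale=1.0, size=(n_mc, n))
    return float(np.mean([phi_n(x) for x in samples]))

print(decide(rng.normal(0.0, 1.0, size=20)))
print(power(0.0), power(0.5))   # power under H_0 and under one alternative
```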
The use of a statistical test leads either to a correct decision being taken, or to one of the following two errors being made: rejection of $ H _ {0} $, and thus acceptance of $ H _ {1} $, when in fact $ H _ {0} $ is correct (an error of the first kind), or acceptance of $ H _ {0} $ when in fact $ H _ {1} $ is correct (an error of the second kind). One of the basic problems in the classical theory of statistical hypothesis testing is the construction of a test that, for a given upper bound $ \alpha = \sup _ {\theta \in \Theta _ {0} } \beta _ {n} ( \theta ) $, $ 0 < \alpha < 1 $, on the probability of an error of the first kind, minimizes the probability of an error of the second kind. The number $ \alpha $ is called the significance level of the statistical test.
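As a small worked illustration of the two kinds of error (the Gaussian model, sample size and critical value below are assumptions, not from the article): for the simple hypotheses $ H _ {0} $: $ \theta = 0 $ against $ H _ {1} $: $ \theta = 1 $ in an $ N( \theta , 1) $ model, the non-randomized test that rejects $ H _ {0} $ when the sample mean exceeds a constant $ c $ has probability of an error of the first kind $ {\mathsf P} _ {0} ( \overline{X} > c ) $ and probability of an error of the second kind $ {\mathsf P} _ {1} ( \overline{X} \leq c ) $, both of which follow from the normal distribution of the sample mean.

```python
from scipy.stats import norm

n, c = 25, 0.33                # assumed sample size and critical value
sd_mean = 1.0 / n ** 0.5       # standard deviation of the sample mean

alpha = norm.sf(c, loc=0.0, scale=sd_mean)    # error of the first kind (significance level)
beta2 = norm.cdf(c, loc=1.0, scale=sd_mean)   # error of the second kind

print(f"alpha ~= {alpha:.3f}, type II error ~= {beta2:.4f}")
```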
In practice the most important statistical tests are the non-randomized ones, i.e. those whose critical function $ \phi _ {n} ( \cdot ) $ is the indicator function of a certain $ {\mathcal B} _ {n} $-measurable set $ K $ in $ \mathfrak X _ {n} $:
$$ \phi _ {n} ( x) = \begin{cases} 1 & \text{if } x \in K, \\ 0 & \text{if } x \in \overline{K} = \mathfrak X _ {n} \setminus K. \end{cases} $$
Thus, a non-randomized statistical test rejects $ H _ {0} $ if the event $ \{ X \in K \} $ takes place; on the other hand, if the event $ \{ X \in \overline{K} \} $ takes place, then $ H _ {0} $ is accepted. The set $ K $ is called the critical region of the statistical test.
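A minimal sketch of such a non-randomized test (again under an assumed Gaussian model, with an illustrative test statistic and critical value): the critical function is the indicator of a critical region $ K $, so the decision is deterministic once the sample has been observed.

```python
import numpy as np

def T_n(x):
    """Test statistic: sqrt(n) times the sample mean (an illustrative choice)."""
    return np.sqrt(len(x)) * np.mean(x)

def in_K(x, t2=1.645):
    """Membership in the critical region K = { x : T_n(x) > t2 }."""
    return T_n(x) > t2

def phi_n(x):
    """Indicator critical function: 1 on K, 0 on its complement."""
    return 1.0 if in_K(x) else 0.0

x = np.array([0.8, 1.1, -0.2, 0.5, 0.9])
print("reject H_0" if phi_n(x) == 1.0 else "accept H_0")
```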
As a rule, a non-randomized statistical test is based on a certain statistic $ T _ {n} = T _ {n} ( X) $, called the test statistic, and its critical region $ K $ is usually defined by relations of the form $ K = \{ x : T _ {n} ( x) < t _ {1} \} $, $ K = \{ x : T _ {n} ( x) > t _ {2} \} $, or $ K = \{ x : T _ {n} ( x) < t _ {1} \} \cup \{ x : T _ {n} ( x) > t _ {2} \} $. The constants $ t _ {1} $, $ t _ {2} $, called the critical values of the test statistic $ T _ {n} $, are determined from the condition $ \alpha = \sup _ {\theta \in \Theta _ {0} } \beta _ {n} ( \theta ) $; in the first two cases one speaks of one-sided statistical tests, and in the third case of a two-sided statistical test. The structure of $ T _ {n} $ reflects the particular nature of the competing hypotheses $ H _ {0} $ and $ H _ {1} $. In the case where the family $ \{ {\mathsf P} _ \theta ^ {n} : \theta \in \Theta \} $ possesses a sufficient statistic $ \Psi = \Psi ( X) $, it is natural to look for the test statistic among functions of the sufficient statistic, since
$$ \beta _ {n} ( \theta ) = {\mathsf E} _ \theta \phi _ {n} ( X) = {\mathsf E} _ \theta T _ {n} ( X) $$
for all $ \theta \in \Theta = \Theta _ {0} \cup \Theta _ {1} $, where $ T _ {n} ( X) = {\mathsf E} \{ \phi _ {n} ( X) \mid \Psi \} $.
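This identity can be verified numerically. The sketch below (an illustrative Bernoulli model with an arbitrary critical function, neither taken from the article) computes $ \beta _ {n} ( \theta ) = {\mathsf E} _ \theta \phi _ {n} ( X) $ both directly and through $ T _ {n} ( X) = {\mathsf E} \{ \phi _ {n} ( X) \mid \Psi \} $, where $ \Psi ( X) = X _ {1} + \dots + X _ {n} $ is sufficient for $ \theta $; the two computations agree for every $ \theta $.

```python
from itertools import product
from math import comb

n = 4

def phi_n(x):
    """An arbitrary critical function of the full sample, with values in [0, 1]."""
    return 0.9 if x[0] == 1 and sum(x) >= 2 else 0.1 * sum(x)

def beta(theta):
    """E_theta[phi_n(X)] computed by summing over all 2^n sample points."""
    return sum(phi_n(x) * theta ** sum(x) * (1 - theta) ** (n - sum(x))
               for x in product((0, 1), repeat=n))

# Given Psi = s, every arrangement of the s ones is equally likely, so the
# conditional expectation T_n(s) is the average of phi_n over those arrangements.
T = {s: sum(phi_n(x) for x in product((0, 1), repeat=n) if sum(x) == s) / comb(n, s)
     for s in range(n + 1)}

def beta_via_T(theta):
    """E_theta[T_n(X)] via the binomial distribution of the sufficient statistic."""
    return sum(T[s] * comb(n, s) * theta ** s * (1 - theta) ** (n - s)
               for s in range(n + 1))

for theta in (0.2, 0.5, 0.8):
    print(theta, round(beta(theta), 6), round(beta_via_T(theta), 6))  # identical values
```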
References
[1] E.L. Lehmann, "Testing statistical hypotheses", Wiley (1986)
[2] J. Hájek, Z. Sidák, "Theory of rank tests", Acad. Press (1967)
[3] H. Cramér, "Mathematical methods of statistics", Princeton Univ. Press (1946)
[4] B.L. van der Waerden, "Mathematische Statistik", Springer (1957)
[5] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics", Libr. math. tables, 46, Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[6] M.S. Nikulin, "A result of Bol'shev's from the theory of the statistical testing of hypotheses", J. Soviet Math., 44:3 (1989), pp. 522–529; Zap. Nauchn. Sem. Mat. Inst. Steklov., 153 (1986), pp. 129–137