Statistical test

A rule that determines, on the basis of the results of observations, which decision is taken in a problem of statistical hypothesis testing (cf. Statistical hypotheses, verification of).

Assume that the hypothesis $ H _ {0} $: $ \theta \in \Theta _ {0} \subset \Theta $ is to be tested against the alternative $ H _ {1} $: $ \theta \in \Theta _ {1} = \Theta \setminus \Theta _ {0} $ on the basis of a realization $ x = ( x _ {1} , \dots, x _ {n} ) $ of a random vector $ X = ( X _ {1} , \dots, X _ {n} ) $ that takes values in a sample space $ ( \mathfrak X _ {n} , {\mathcal B} _ {n} , {\mathsf P} _ \theta ^ {n} ) $, $ \theta \in \Theta $. Furthermore, let $ \phi _ {n} ( \cdot ) $ be an arbitrary $ {\mathcal B} _ {n} $-measurable function mapping the sample space $ \mathfrak X _ {n} $ into the interval $ [ 0, 1] $. The rule according to which $ H _ {0} $ is rejected with probability $ \phi _ {n} ( X) $, while the alternative $ H _ {1} $ is rejected with probability $ 1 - \phi _ {n} ( X) $, is called a statistical test for testing $ H _ {0} $ against $ H _ {1} $; $ \phi _ {n} ( \cdot ) $ is called the critical function of the test. The function $ \beta _ {n} ( \theta ) = {\mathsf E} _ \theta \phi _ {n} ( X) $, $ \theta \in \Theta $, is called the power function of the test.
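The definitions of the critical function and the power function can be illustrated by a minimal sketch. The setting is an assumption chosen for illustration and is not taken from the article: $ X _ {1} , \dots, X _ {n} $ are independent Bernoulli variables with success probability $ \theta $, and the critical function depends on the sample only through the number of successes $ s $. The names `critical_function` and `power_function` and the parameters `c` and `gamma` are hypothetical.

```python
from math import comb


def critical_function(s, c, gamma):
    """Probability of rejecting H0 when s successes are observed.

    Assumed form (illustrative only): reject outright if s > c,
    reject with probability gamma if s == c, never reject if s < c.
    """
    if s > c:
        return 1.0
    if s == c:
        return gamma
    return 0.0


def power_function(theta, n, c, gamma):
    """Expected value of the critical function under theta.

    The number of successes is binomial(n, theta), so the expectation
    is a finite sum over s = 0, ..., n.
    """
    return sum(
        critical_function(s, c, gamma)
        * comb(n, s) * theta**s * (1 - theta) ** (n - s)
        for s in range(n + 1)
    )


if __name__ == "__main__":
    # Rejection probability at a few parameter values for n = 10, c = 8.
    for theta in (0.3, 0.5, 0.7, 0.9):
        print(theta, power_function(theta, n=10, c=8, gamma=0.5))
```

Evaluating `power_function` on $ \Theta _ {0} $ gives probabilities of wrongly rejecting $ H _ {0} $, while on $ \Theta _ {1} $ it gives probabilities of correctly rejecting it.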

The use of a statistical test leads either to a correct decision being taken, or to one of the following two errors being made: rejection of $ H _ {0} $, and thus acceptance of $ H _ {1} $, when in fact $ H _ {0} $ is correct (an error of the first kind), or acceptance of $ H _ {0} $ when in fact $ H _ {1} $ is correct (an error of the second kind). One of the basic problems in the classical theory of statistical hypotheses testing is the construction of a test that, given a definite upper bound $ \alpha = \sup _ {\theta \in \Theta _ {0} } \beta _ {n} ( \theta ) $, $ 0 < \alpha < 1 $, for the probability of an error of the first kind, would minimize the probability of an error of the second kind. The number $ \alpha $ is called the significance level of the statistical test.
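As a hedged illustration of how a prescribed significance level can be attained exactly, the sketch below again assumes independent Bernoulli observations, with $ H _ {0} $: $ \theta \leq \theta _ {0} $ tested against $ H _ {1} $: $ \theta > \theta _ {0} $ by rejecting for large numbers of successes. For a test of this form the rejection probability is non-decreasing in $ \theta $, so the supremum over $ \Theta _ {0} $ is attained at $ \theta _ {0} $. The function names and the numerical values in the demo are assumptions, not part of the article.

```python
from math import comb


def binom_pmf(s, n, theta):
    """P(S = s) for S binomial(n, theta)."""
    return comb(n, s) * theta**s * (1 - theta) ** (n - s)


def level_alpha_test(n, theta0, alpha):
    """Find (c, gamma) with P(S > c) + gamma * P(S = c) = alpha at theta0.

    The test rejects H0 if S > c and rejects with probability gamma
    if S == c (a randomized boundary point).
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    tail = 0.0  # P(S > c) under theta0, built up from the top
    for c in range(n, -1, -1):
        p_c = binom_pmf(c, n, theta0)
        if tail + p_c > alpha:
            return c, (alpha - tail) / p_c
        tail += p_c
    raise ValueError("no critical value found")


def rejection_probability(theta, n, c, gamma):
    """Probability that the test rejects H0 when the parameter is theta."""
    upper = sum(binom_pmf(s, n, theta) for s in range(c + 1, n + 1))
    return upper + gamma * binom_pmf(c, n, theta)


if __name__ == "__main__":
    n, theta0, alpha = 10, 0.5, 0.05
    c, gamma = level_alpha_test(n, theta0, alpha)
    first_kind = rejection_probability(theta0, n, c, gamma)
    second_kind = 1 - rejection_probability(0.8, n, c, gamma)
    print(c, gamma, first_kind, second_kind)
```

Here `first_kind` is the probability of an error of the first kind at the boundary point $ \theta _ {0} $, and `second_kind` is the probability of an error of the second kind at one particular alternative, chosen arbitrarily for the demo.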

In practice, the most important tests are the non-randomized ones, i.e. those whose critical function $ \phi _ {n} ( \cdot ) $ is the indicator function of a certain $ {\mathcal B} _ {n} $-measurable set $ K $ in $ \mathfrak X _ {n} $:

$$ \phi _ {n} ( x) = \left \{ \begin{array}{ll} 1 & \textrm{ if } x \in K, \\ 0 & \textrm{ if } x \in \overline{K}\; = \mathfrak X _ {n} \setminus K. \\ \end{array} \right .$$

Thus, a non-randomized statistical test rejects $ H _ {0} $ if the event $ \{ X \in K \} $ takes place; on the other hand, if the event $ \{ X \in \overline{K}\; \} $ takes place, then $ H _ {0} $ is accepted. The set $ K $ is called the critical region of the statistical test.
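For a finite sample space the critical region can be written out explicitly. The sketch below assumes, purely for illustration, that $ \mathfrak X _ {n} = \{ 0, 1 \} ^ {n} $ with independent Bernoulli coordinates and that $ K $ consists of the points with at least `c` ones; the function names are hypothetical.

```python
from itertools import product


def critical_region(n, c):
    """K = set of points x in {0,1}^n with x_1 + ... + x_n >= c."""
    return {x for x in product((0, 1), repeat=n) if sum(x) >= c}


def phi(x, K):
    """Critical function of the non-randomized test: indicator of K."""
    return 1 if tuple(x) in K else 0


def prob_of_region(K, theta):
    """P_theta(X in K) for independent Bernoulli(theta) coordinates."""
    total = 0.0
    for x in K:
        s = sum(x)
        total += theta**s * (1 - theta) ** (len(x) - s)
    return total


if __name__ == "__main__":
    K = critical_region(n=5, c=4)
    print(len(K))                   # number of sample points in K
    print(phi((1, 1, 1, 1, 0), K))  # this point lies in K, H0 is rejected
    print(prob_of_region(K, 0.5))   # rejection probability at theta = 0.5
```

Because the critical function takes only the values 0 and 1, its expectation under $ \theta $ is the probability of the event $ \{ X \in K \} $, which is what `prob_of_region` computes.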

As a rule, a non-randomized statistical test is based on a certain statistic $ T _ {n} = T _ {n} ( X) $, which is called the test statistic, and the critical region $ K $ of this same test is usually defined using relations of the form $ K = \{ {x } : {T _ {n} ( x) < t _ {1} } \} $, $ K = \{ {x } : {T _ {n} ( x) > t _ {2} } \} $, $ K = \{ {x } : {T _ {n} ( x) < t _ {1} } \} \cup \{ {x } : {T _ {n} ( x) > t _ {2} } \} $. The constants $ t _ {1} $, $ t _ {2} $, called the critical values of the test statistic $ T _ {n} $, are defined from the condition $ \alpha = \sup _ {\theta \in \Theta _ {0} } \beta _ {n} ( \theta ) $; in these circumstances one speaks in the first two cases of one-sided statistical tests, and in the third case, of a two-sided statistical test. The structure of $ T _ {n} $ reflects the particular nature of the competing hypotheses $ H _ {0} $ and $ H _ {1} $. In the case where the family $ \{ { {\mathsf P} _ \theta ^ {n} } : {\theta \in \Theta } \} $ possesses a sufficient statistic $ \Psi = \Psi ( X) $, it is natural to look for the test statistic in the class of sufficient statistics, since

$$ \beta _ {n} ( \theta ) = {\mathsf E} _ \theta \phi _ {n} ( X) = {\mathsf E} _ \theta T _ {n} ( X) $$

for all $ \theta \in \Theta = \Theta _ {0} \cup \Theta _ {1} $, where $ T _ {n} ( X) = {\mathsf E} \{ \phi _ {n} ( X) \mid \Psi \} $ does not depend on $ \theta $, because $ \Psi $ is sufficient.
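The identity is an instance of the rule of iterated expectations, and it can be checked numerically in a small case. The following sketch is an illustration under assumed conditions, not the article's own example: the observations are independent Bernoulli variables, the sufficient statistic is the number of successes $ S $, and the initial critical function is the deliberately wasteful $ \phi _ {n} ( x) = x _ {1} $. By exchangeability, $ {\mathsf E} \{ X _ {1} \mid S = s \} = s / n $.

```python
from itertools import product
from math import comb


def power_of_phi(theta, n):
    """E_theta phi(X) for phi(x) = x_1, by enumerating {0,1}^n."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        s = sum(x)
        total += x[0] * theta**s * (1 - theta) ** (n - s)
    return total


def conditional_phi(s, n):
    """E{phi(X) | S = s} = s / n; the value does not involve theta."""
    return s / n


def power_of_conditional(theta, n):
    """E_theta of the conditional expectation, summed over S ~ binomial."""
    return sum(
        conditional_phi(s, n) * comb(n, s) * theta**s * (1 - theta) ** (n - s)
        for s in range(n + 1)
    )


if __name__ == "__main__":
    for theta in (0.1, 0.3, 0.8):
        print(theta, power_of_phi(theta, 6), power_of_conditional(theta, 6))
```

Both functions return the same power at every $ \theta $, although the second critical function depends on the sample only through $ S $. In this example the conditional expectation takes values strictly between 0 and 1, so it defines a randomized test based on the sufficient statistic.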


How to Cite This Entry:
Statistical test. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Statistical_test&oldid=51750

This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.
