# Statistical test

A decision rule by which, on the basis of the results of observations, a decision is taken in a problem of statistical hypothesis testing (cf. Statistical hypotheses, verification of).

Assume that the hypothesis $H _ {0}$: $\theta \in \Theta _ {0} \subset \Theta$ has to be tested against the alternative $H _ {1}$: $\theta \in \Theta _ {1} = \Theta \setminus \Theta _ {0}$ by means of the realization $x = ( x _ {1} , \dots , x _ {n} )$ of a random vector $X = ( X _ {1} , \dots , X _ {n} )$ that takes values in a sample space $( \mathfrak X _ {n} , {\mathcal B} _ {n} , {\mathsf P} _ \theta ^ {n} )$, $\theta \in \Theta$. Furthermore, let $\phi _ {n} ( \cdot )$ be an arbitrary ${\mathcal B} _ {n}$-measurable function mapping the sample space $\mathfrak X _ {n}$ into the interval $[ 0, 1]$. The principle according to which $H _ {0}$ is rejected with probability $\phi _ {n} ( X)$, while the alternative $H _ {1}$ is rejected with probability $1 - \phi _ {n} ( X)$, is then called a statistical test for testing $H _ {0}$ against $H _ {1}$; $\phi _ {n} ( \cdot )$ is the critical function of the test. The function $\beta _ {n} ( \theta ) = {\mathsf E} _ \theta \phi _ {n} ( X)$, $\theta \in \Theta$, is called the power function of the test.
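As a minimal sketch of these definitions, assume (purely for illustration) that the observation is reduced to a single binomial count, i.e. the number of successes in $n$ Bernoulli trials with success probability $\theta$. The cut-off `c`, the randomization constant `gamma` and all function names below are hypothetical choices, not part of the definition. The critical function rejects $H _ {0}$ outright above the cut-off, randomizes at the cut-off, and never rejects below it; the power function is its expectation under $\theta$.

```python
from math import comb


def binom_pmf(k: int, n: int, theta: float) -> float:
    """P_theta(X = k) for X ~ Binomial(n, theta)."""
    return comb(n, k) * theta**k * (1.0 - theta) ** (n - k)


def critical_function(k: int, c: int, gamma: float) -> float:
    """Probability of rejecting H0 after observing the count k.

    Assumed form (illustrative): reject for k > c, reject with
    probability gamma when k == c, never reject for k < c.
    """
    if k > c:
        return 1.0
    if k == c:
        return gamma
    return 0.0


def power(theta: float, n: int, c: int, gamma: float) -> float:
    """Power function: the expectation of the critical function under theta."""
    return sum(
        critical_function(k, c, gamma) * binom_pmf(k, n, theta)
        for k in range(n + 1)
    )


if __name__ == "__main__":
    # Evaluate the power function on a small grid of parameter values.
    for theta in (0.3, 0.5, 0.7, 0.9):
        print(theta, power(theta, n=10, c=8, gamma=0.5))
```

Because the critical function takes values in $[ 0, 1]$, the power function does as well; with `gamma` equal to 0 or 1 the sketch reduces to a test without randomization.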

The use of a statistical test leads either to a correct decision or to one of the following two errors: rejection of $H _ {0}$, and thus acceptance of $H _ {1}$, when in fact $H _ {0}$ is correct (an error of the first kind), or acceptance of $H _ {0}$ when in fact $H _ {1}$ is correct (an error of the second kind). One of the basic problems in the classical theory of statistical hypothesis testing is the construction of a test that, given a prescribed upper bound $\alpha = \sup _ {\theta \in \Theta _ {0} } \beta _ {n} ( \theta )$, $0 < \alpha < 1$, for the probability of an error of the first kind, minimizes the probability of an error of the second kind. The number $\alpha$ is called the significance level of the statistical test.
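The two error probabilities can be made concrete under a set of assumptions that are not part of the article: independent normal observations with known standard deviation `sigma`, the hypothesis $H _ {0}$: $\mu \leq \mu _ {0}$ against $H _ {1}$: $\mu > \mu _ {0}$, and the test that rejects $H _ {0}$ when the standardized sample mean exceeds the $( 1 - \alpha )$-quantile of the standard normal law. In this setting the power function is increasing in $\mu$, so its supremum over $H _ {0}$ is attained at $\mu _ {0}$. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal law N(0, 1)


def power(mu: float, mu0: float, sigma: float, n: int, alpha: float) -> float:
    """P_mu( sqrt(n) * (sample mean - mu0) / sigma > z_{1 - alpha} )."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    shift = sqrt(n) * (mu - mu0) / sigma
    return 1.0 - STD_NORMAL.cdf(z_crit - shift)


def error_probabilities(mu0: float, mu1: float, sigma: float, n: int, alpha: float):
    """Return (error of the first kind at mu0, error of the second kind at mu1).

    Assumes mu1 > mu0, i.e. mu1 lies in the alternative.
    """
    first_kind = power(mu0, mu0, sigma, n, alpha)         # worst case over H0
    second_kind = 1.0 - power(mu1, mu0, sigma, n, alpha)  # accept H0 although mu = mu1
    return first_kind, second_kind


if __name__ == "__main__":
    # The error of the second kind shrinks as the sample size grows.
    for n in (10, 25, 100):
        print(n, error_probabilities(mu0=0.0, mu1=0.5, sigma=1.0, n=n, alpha=0.05))
```

With the significance level held fixed, the only way to reduce the error of the second kind at a given alternative in this sketch is to increase `n`; this is the trade-off that the optimization problem above formalizes.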

In practice, the most important statistical tests are the non-randomized ones, i.e. those whose critical function $\phi _ {n} ( \cdot )$ is the indicator function of a certain ${\mathcal B} _ {n}$-measurable set $K$ in $\mathfrak X _ {n}$:

$$\phi _ {n} ( x) = \left \{ \begin{array}{ll} 1 & \textrm{ if } x \in K, \\ 0 & \textrm{ if } x \in \overline{K}\; = \mathfrak X _ {n} \setminus K. \\ \end{array} \right .$$

Thus, a non-randomized statistical test rejects $H _ {0}$ if the event $\{ X \in K \}$ takes place; on the other hand, if the event $\{ X \in \overline{K}\; \}$ takes place, then $H _ {0}$ is accepted. The set $K$ is called the critical region of the statistical test.

As a rule, a non-randomized statistical test is based on a certain statistic $T _ {n} = T _ {n} ( X)$, which is called the test statistic, and the critical region $K$ of this same test is usually defined using relations of the form $K = \{ {x } : {T _ {n} ( x) < t _ {1} } \}$, $K = \{ {x } : {T _ {n} ( x) > t _ {2} } \}$, $K = \{ {x } : {T _ {n} ( x) < t _ {1} } \} \cup \{ {x } : {T _ {n} ( x) > t _ {2} } \}$. The constants $t _ {1}$, $t _ {2}$, called the critical values of the test statistic $T _ {n}$, are defined from the condition $\alpha = \sup _ {\theta \in \Theta _ {0} } \beta _ {n} ( \theta )$; in these circumstances one speaks in the first two cases of one-sided statistical tests, and in the third case, of a two-sided statistical test. The structure of $T _ {n}$ reflects the particular nature of the competing hypotheses $H _ {0}$ and $H _ {1}$. In the case where the family $\{ { {\mathsf P} _ \theta ^ {n} } : {\theta \in \Theta } \}$ possesses a sufficient statistic $\Psi = \Psi ( X)$, it is natural to look for the test statistic in the class of sufficient statistics, since
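The three forms of critical region can be sketched for one specific test statistic. The setting is an assumption made for illustration only: a normal sample with known standard deviation `sigma`, the statistic $T _ {n} = \sqrt n ( \overline{X} - \mu _ {0} ) / \sigma$, and critical values taken as standard normal quantiles so that the probability of rejection at $\mu = \mu _ {0}$ equals $\alpha$. The names `critical_values` and `in_critical_region` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # law of T_n when mu = mu0 (assumed setting)


def z_statistic(sample, mu0: float, sigma: float) -> float:
    """Test statistic T_n = sqrt(n) * (sample mean - mu0) / sigma."""
    return sqrt(len(sample)) * (fmean(sample) - mu0) / sigma


def critical_values(alpha: float, kind: str):
    """Return (t1, t2); None marks a bound that the region does not use."""
    if kind == "lower":      # K = {T_n < t1}
        return STD_NORMAL.inv_cdf(alpha), None
    if kind == "upper":      # K = {T_n > t2}
        return None, STD_NORMAL.inv_cdf(1.0 - alpha)
    if kind == "two-sided":  # K = {T_n < t1} or {T_n > t2}
        return STD_NORMAL.inv_cdf(alpha / 2.0), STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    raise ValueError(f"unknown kind: {kind!r}")


def in_critical_region(t: float, t1, t2) -> bool:
    """Indicator of the critical region: True means H0 is rejected."""
    return (t1 is not None and t < t1) or (t2 is not None and t > t2)


if __name__ == "__main__":
    sample = [1.2, 0.8, 1.5, 1.1]  # made-up data for the illustration
    t = z_statistic(sample, mu0=0.0, sigma=1.0)
    for kind in ("lower", "upper", "two-sided"):
        t1, t2 = critical_values(0.05, kind)
        print(kind, t1, t2, in_critical_region(t, t1, t2))
```

Splitting $\alpha$ equally between the two tails is one common convention for the two-sided region; other splits satisfy the same size condition and are chosen on different grounds (for example unbiasedness).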

$$\beta _ {n} ( \theta ) = {\mathsf E} _ \theta \phi _ {n} ( X) = {\mathsf E} _ \theta T _ {n} ( X)$$

for all $\theta \in \Theta = \Theta _ {0} \cup \Theta _ {1}$, where $T _ {n} ( X) = {\mathsf E} \{ \phi _ {n} ( X) \mid \Psi \}$.
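The identity above is an instance of the tower property of conditional expectation. Since $\Psi$ is sufficient, the conditional distribution of $X$ given $\Psi$ does not depend on $\theta$, so ${\mathsf E} \{ \phi _ {n} ( X) \mid \Psi \}$ is a genuine statistic, and for every $\theta \in \Theta$

$$
{\mathsf E} _ \theta T _ {n} ( X) = {\mathsf E} _ \theta \left [ {\mathsf E} \{ \phi _ {n} ( X) \mid \Psi \} \right ] = {\mathsf E} _ \theta \phi _ {n} ( X) = \beta _ {n} ( \theta ) .
$$

Moreover, $0 \leq \phi _ {n} \leq 1$ implies $0 \leq T _ {n} \leq 1$, so $T _ {n}$ is itself a critical function; it depends on $X$ only through $\Psi$ and has the same power function as $\phi _ {n}$.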

How to Cite This Entry:
Statistical test. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Statistical_test&oldid=49603
This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.