Smirnov test

From Encyclopedia of Mathematics
Revision as of 08:14, 6 June 2020 by Ulf Rehmann (talk | contribs) (tex encoded by computer)


Smirnov $ 2 $-samples test

A non-parametric (or distribution-free) statistical test for testing hypotheses about the homogeneity of two samples.

Let $ X _ {1} \dots X _ {n} $ and $ Y _ {1} \dots Y _ {m} $ be mutually independent random variables, where the elements of each sample are identically distributed with a continuous distribution function, and suppose one wishes to test the hypothesis $ H _ {0} $ that both samples are taken from the same population. If

$$ X _ {(1)} \leq \dots \leq X _ {(n)} \ \ \textrm{ and } \ Y _ {(1)} \leq \dots \leq Y _ {(m)} $$

are the order statistics corresponding to the given samples, and $ F _ {n} ( x) $ and $ G _ {m} ( x) $ are the empirical distribution functions corresponding to them, then $ H _ {0} $ can be written in the form of the identity:

$$ H _ {0} :\ {\mathsf E} F _ {n} ( x) \equiv {\mathsf E} G _ {m} ( x) . $$
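As an illustration of the empirical distribution functions $ F _ {n} $ and $ G _ {m} $ used here, the following minimal Python sketch builds one from a sample. The function name `ecdf` and the sample values are hypothetical choices made only for this example.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical distribution function of `sample` as a callable.

    Hypothetical helper for illustration: F(x) is the fraction of
    observations that are less than or equal to x.
    """
    ordered = sorted(sample)  # order statistics X_(1) <= ... <= X_(n)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many ordered observations are <= x
        return bisect_right(ordered, x) / n

    return F


F_n = ecdf([0.3, 1.2, 0.7, 2.5])
print(F_n(0.0), F_n(0.7), F_n(3.0))  # prints 0.0 0.5 1.0
```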

Further, consider the following hypotheses as possible alternatives to $ H _ {0} $:

$$ H _ {1} ^ {+} :\ \sup _ {| x | < \infty } {\mathsf E} [ G _ {m} ( x) - F _ {n} ( x) ] > 0 , $$

$$ H _ {1} ^ {-} : \ \inf _ {| x | < \infty } {\mathsf E} [ G _ {m} ( x) - F _ {n} ( x) ] < 0 , $$

$$ H _ {1} : \ \sup _ {| x | < \infty } | {\mathsf E} [ G _ {m} ( x) - F _ {n} ( x) ] | > 0 . $$

To test $ H _ {0} $ against the one-sided alternatives $ H _ {1} ^ {+} $ and $ H _ {1} ^ {-} $, and also against the two-sided $ H _ {1} $, N.V. Smirnov proposed a test based on the statistics

$$ D _ {m,n} ^ {+} = \sup _ {| x | < \infty } [ G _ {m} ( x) - F _ {n} ( x) ] = $$

$$ = \ \max _ {1 \leq k \leq m } \left ( \frac{k}{m} - F _ {n} ( Y _ {(k)} ) \right ) = \max _ {1 \leq s \leq n } \left ( G _ {m} ( X _ {(s)} ) - \frac{s-1}{n} \right ) , $$

$$ D _ {m,n} ^ {-} = - \inf _ {| x| < \infty } [ G _ {m} ( x) - F _ {n} ( x) ] = $$

$$ = \ \max _ {1 \leq k \leq m } \left ( F _ {n} ( Y _ {(k)} ) - \frac{k-1}{m} \right ) = \max _ {1 \leq s \leq n } \left ( \frac{s}{n} - G _ {m} ( X _ {(s)} ) \right ) , $$

$$ D _ {m,n} = \sup _ {| x | < \infty } | G _ {m} ( x) - F _ {n} ( x) | = \max ( D _ {m,n} ^ {+} , D _ {m,n} ^ {-} ), $$
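The order-statistic formulas above translate directly into code. The sketch below is one possible way to compute the three statistics, assuming (as in the continuous case) that no observation occurs in both samples; the function name `smirnov_statistics` and the data are illustrative only.

```python
from bisect import bisect_right


def smirnov_statistics(x, y):
    """Return (D_plus, D_minus, D) for samples x (size n) and y (size m).

    Assumes no ties between the two samples, which holds with
    probability 1 for continuous distributions.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)

    # D+ = max_k ( k/m - F_n(Y_(k)) ), with F_n(t) = #{x_i <= t} / n
    d_plus = max(k / m - bisect_right(xs, ys[k - 1]) / n
                 for k in range(1, m + 1))

    # D- = max_s ( s/n - G_m(X_(s)) ), with G_m(t) = #{y_j <= t} / m
    d_minus = max(s / n - bisect_right(ys, xs[s - 1]) / m
                  for s in range(1, n + 1))

    return d_plus, d_minus, max(d_plus, d_minus)


# All x-values lie below all y-values, so F_n dominates G_m everywhere.
print(smirnov_statistics([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # prints (0.0, 1.0, 1.0)
```

Sorting dominates the cost, so the computation takes $ O ( ( m + n ) \log ( m + n ) ) $ time.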

respectively, where it follows from the definitions of $ D _ {m,n} ^ {+} $ and $ D _ {m,n} ^ {-} $ that under the hypothesis $ H _ {0} $, $ D _ {m,n} ^ {+} $ and $ D _ {m,n} ^ {-} $ have the same distribution. Asymptotic tests can be based on the following theorem: If $ \min ( m , n ) \rightarrow \infty $, then the validity of $ H _ {0} $ implies that

$$ \lim\limits _ {m \rightarrow \infty } {\mathsf P} \left \{ \sqrt { \frac{mn}{m+n} } D _ {m,n} ^ {+} < y \right \} = 1 - e ^ {- 2 y ^ {2} } ,\ y > 0 , $$

$$ \lim\limits _ {m \rightarrow \infty } {\mathsf P} \left \{ \sqrt { \frac{mn}{m+n} } D _ {m,n} < y \right \} = K ( y) ,\ y > 0 , $$

where $ K ( y) $ is the Kolmogorov distribution function (cf. Statistical estimator). Asymptotic expansions for the distribution functions of the statistics $ D _ {m,n} ^ {+} $ and $ D _ {m,n} ^ {-} $ have been found (see [4]–[6]).
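The two limiting distribution functions can be evaluated numerically. The sketch below is a rough illustration, not a reference implementation: it assumes the standard series $ K ( y) = \sum _ {k = - \infty } ^ \infty ( - 1 ) ^ {k} e ^ {- 2 k ^ {2} y ^ {2} } $, which is not written out in the text above, and all function names are hypothetical.

```python
import math


def scaled_statistic(d, m, n):
    """Normalize a statistic as sqrt(mn / (m + n)) * d."""
    return math.sqrt(m * n / (m + n)) * d


def one_sided_limit_cdf(y):
    """Limiting distribution function of the scaled D+ (or D-): 1 - exp(-2 y^2)."""
    return 1.0 - math.exp(-2.0 * y * y) if y > 0 else 0.0


def kolmogorov_cdf(y, terms=100):
    """Approximate K(y) by truncating its alternating series.

    Assumption: K(y) = 1 + 2 * sum_{k>=1} (-1)^k exp(-2 k^2 y^2).
    The truncation is accurate for moderate y but converges slowly
    as y approaches 0, so very small y may need more terms.
    """
    if y <= 0:
        return 0.0
    total = 1.0
    for k in range(1, terms + 1):
        total += 2.0 * (-1) ** k * math.exp(-2.0 * k * k * y * y)
    return min(1.0, max(0.0, total))  # clamp rounding noise into [0, 1]


# Approximate two-sided p-value for D = 0.3 with m = n = 40.
y = scaled_statistic(0.3, 40, 40)
print(1.0 - kolmogorov_cdf(y))
```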

Using the Smirnov test with significance level $ \alpha $, $ H _ {0} $ may be rejected in favour of one of the above alternatives $ H _ {1} ^ {+} $, $ H _ {1} ^ {-} $ when the corresponding statistic exceeds the $ \alpha $-critical value of the test; this value can be calculated using the approximations obtained by L.N. Bol'shev [2] by means of Pearson asymptotic transformations.
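For a rough idea of how such a decision rule looks in practice, the sketch below derives a one-sided critical value from the plain limiting law $ 1 - e ^ {- 2 y ^ {2} } $ alone. This is a cruder substitute for Bol'shev's refined approximations [2], which are not reproduced here, and the function names are hypothetical.

```python
import math


def asymptotic_one_sided_critical_value(alpha, m, n):
    """Approximate alpha-critical value for D+ (or D-).

    Assumption: only the limiting law is used, so exp(-2 y^2) = alpha
    is solved for y and the result is rescaled by sqrt((m + n) / (mn)).
    This is not Bol'shev's refined approximation.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    y_alpha = math.sqrt(-math.log(alpha) / 2.0)
    return y_alpha * math.sqrt((m + n) / (m * n))


def reject_one_sided(d_stat, alpha, m, n):
    """Return True if the observed one-sided statistic exceeds the critical value."""
    return d_stat > asymptotic_one_sided_critical_value(alpha, m, n)


critical = asymptotic_one_sided_critical_value(0.05, 50, 50)
print(round(critical, 4), reject_one_sided(0.30, 0.05, 50, 50))
```

For small samples the limiting law can be noticeably off, which is why the refined approximations or tabulated exact values are preferred there.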

See also Kolmogorov test; Kolmogorov–Smirnov test.

References

[1] N.V. Smirnov, "Estimates of the divergence between empirical distribution curves in two independent samples" Byull. Moskov. Gosudarstv. Univ. (A) , 2 : 2 (1939) pp. 3–14
[2] L.N. Bol'shev, "Asymptotically Pearson transformations" Theor. Probab. Appl. , 8 (1963) pp. 121–146 Teor. Veroyatnost. i Primenen. , 8 : 2 (1963) pp. 129–155
[3] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics" , Libr. math. tables , 46 , Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[4] V.S. Korolyuk, "Asymptotic analysis of the distribution of the maximum deviation in the Bernoulli scheme" Theor. Probab. Appl. , 4 (1959) pp. 339–366 Teor. Veroyatnost. i Primenen. , 4 (1959) pp. 369–397
[5] Li-Chien Chang, "On the exact distribution of A.N. Kolmogorov's statistic and its asymptotic expansion (I and II)" Matematika , 4 : 2 (1960) pp. 135–139 (In Russian)
[6] A.A. Borovkov, "On the two-sample problem" Izv. Akad. Nauk SSSR Ser. Mat. , 26 : 4 (1962) pp. 605–624 (In Russian)

Comments

References

[a1] D.B. Owen, "A handbook of statistical tables" , Addison-Wesley (1962)
[a2] E.S. Pearson, H.O. Hartley, "Biometrika tables for statisticians" , 2 , Cambridge Univ. Press (1972)
How to Cite This Entry:
Smirnov test. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Smirnov_test&oldid=18202
This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.