
Difference between revisions of "Kolmogorov-Smirnov test"

From Encyclopedia of Mathematics

Latest revision as of 22:14, 5 June 2020


2010 Mathematics Subject Classification: Primary: 62G10

A non-parametric test used for testing a hypothesis $ H_0 $, according to which independent random variables $ X_1, \dots, X_n $ have a given continuous distribution function $ F $, against the one-sided alternative $ H_1^{+} $: $ \sup_{|x| < \infty} ( {\mathsf E} F_n(x) - F(x) ) > 0 $, where $ {\mathsf E} F_n $ is the mathematical expectation of the empirical distribution function $ F_n $. The Kolmogorov–Smirnov test is constructed from the statistic

$$ D_n^{+} = \sup_{|x| < \infty} ( F_n(x) - F(x) ) = \max_{1 \leq m \leq n} \left( \frac{m}{n} - F(X_{(m)}) \right), $$

where $ X_{(1)} \leq \dots \leq X_{(n)} $ is the variational series (or set of order statistics) obtained from the sample $ X_1, \dots, X_n $. Thus, the Kolmogorov–Smirnov test is a variant of the Kolmogorov test for testing the hypothesis $ H_0 $ against a one-sided alternative $ H_1^{+} $. By studying the distribution of the statistic $ D_n^{+} $, N.V. Smirnov [1] showed that

$$ \tag{1} {\mathsf P} \{ D_n^{+} \geq \lambda \} = \lambda \sum_{k=0}^{[n(1-\lambda)]} \binom{n}{k} \left( \lambda + \frac{k}{n} \right)^{k-1} \left( 1 - \lambda - \frac{k}{n} \right)^{n-k}, $$

where $ 0 < \lambda < 1 $ and $ [a] $ is the integer part of the number $ a $. In addition to the exact distribution (1) of $ D_n^{+} $, Smirnov obtained its limit distribution, namely: if $ n \rightarrow \infty $ and $ 0 < \lambda_0 < \lambda = O(n^{1/6}) $, then

$$ {\mathsf P} \{ \sqrt{n} D_n^{+} \geq \lambda \} = e^{-2\lambda^2} \left[ 1 + O\left( \frac{1}{\sqrt n} \right) \right], $$

where $ \lambda_0 $ is any positive number. By means of the technique of asymptotically Pearson transformations it has been proved [2] that if $ n \rightarrow \infty $ and $ 0 < \lambda_0 < \lambda = O(n^{1/3}) $, then

$$ \tag{2} {\mathsf P} \left\{ \frac{1}{18n} ( 6 n D_n^{+} + 1 )^2 \geq \lambda \right\} = e^{-\lambda} \left[ 1 + O\left( \frac{1}{n} \right) \right]. $$
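
The exact formula (1) and the two approximations can be compared numerically. The following is a minimal sketch, not a reference implementation: the function names are illustrative, the limit law is read as a statement about $ \sqrt{n} D_n^{+} $ (so a threshold $ d $ on $ D_n^{+} $ corresponds to $ \lambda = \sqrt{n} d $), and in (2) the threshold $ d $ corresponds to $ \lambda = (6nd+1)^2 / (18n) $. The plain floating-point sum is only intended for moderate $ n $.

```python
import math


def smirnov_exact_tail(n: int, lam: float) -> float:
    """P{D_n^+ >= lam} computed directly from formula (1), for 0 < lam < 1."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    upper = math.floor(n * (1.0 - lam))  # integer part [n(1 - lam)]
    total = 0.0
    for k in range(upper + 1):
        # Clamp at zero to absorb rounding error when k/n equals 1 - lam.
        right = max(1.0 - lam - k / n, 0.0)
        total += math.comb(n, k) * (lam + k / n) ** (k - 1) * right ** (n - k)
    return lam * total


def tail_limit(n: int, d: float) -> float:
    """Leading term exp(-2 lambda^2) with lambda = sqrt(n) * d (assumed scaling)."""
    return math.exp(-2.0 * n * d * d)


def tail_pearson(n: int, d: float) -> float:
    """Leading term of (2): exp(-lambda) with lambda = (6 n d + 1)^2 / (18 n)."""
    return math.exp(-((6.0 * n * d + 1.0) ** 2) / (18.0 * n))


# Illustrative comparison for one sample size and a few thresholds.
n = 50
for d in (0.10, 0.15, 0.20):
    print(d, smirnov_exact_tail(n, d), tail_limit(n, d), tail_pearson(n, d))
```

For small $ n $ the exact values can be checked by hand: for $ n = 1 $ the statistic equals $ 1 - F(X_1) $, so the tail probability is $ 1 - \lambda $.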

According to the Kolmogorov–Smirnov test, the hypothesis $ H_0 $ must be rejected with significance level $ \alpha $ whenever

$$ \exp \left[ - \frac{( 6 n D_n^{+} + 1 )^2}{18n} \right] \leq \alpha, $$

where, by virtue of (2),

$$ {\mathsf P} \left\{ \exp \left[ - \frac{( 6 n D_n^{+} + 1 )^2}{18n} \right] \leq \alpha \right\} = \alpha \left( 1 + O\left( \frac{1}{n} \right) \right). $$
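
The decision rule can be sketched in a few lines. This is an illustration under stated assumptions rather than a definitive implementation: the names `d_plus` and `reject_h0` are hypothetical, the hypothesised distribution function is passed in as a Python callable, and the exponent is taken with a negative sign, which is the form obtained by putting $ \lambda = -\ln \alpha $ in (2).

```python
import math
from typing import Callable, Sequence, Tuple


def d_plus(sample: Sequence[float], cdf: Callable[[float], float]) -> float:
    """D_n^+ = max over m of (m/n - F(X_(m))), using the order statistics."""
    xs = sorted(sample)
    n = len(xs)
    return max(m / n - cdf(x) for m, x in enumerate(xs, start=1))


def reject_h0(
    sample: Sequence[float], cdf: Callable[[float], float], alpha: float
) -> Tuple[bool, float, float]:
    """Return (reject?, D_n^+, approximate tail probability based on (2))."""
    n = len(sample)
    d = d_plus(sample, cdf)
    approx_p = math.exp(-((6.0 * n * d + 1.0) ** 2) / (18.0 * n))
    return approx_p <= alpha, d, approx_p


def uniform_cdf(x: float) -> float:
    """Distribution function of the uniform law on [0, 1] (example choice of F)."""
    return min(max(x, 0.0), 1.0)


# A made-up sample concentrated in [0, 0.5]: F_n lies well above F.
low_sample = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.31, 0.40, 0.45, 0.50]
print(reject_h0(low_sample, uniform_cdf, alpha=0.05))

# A made-up evenly spread sample: little evidence against H_0.
even_sample = [(m - 0.5) / 10 for m in range(1, 11)]
print(reject_h0(even_sample, uniform_cdf, alpha=0.05))
```

Since $ D_n^{+} \geq 1 - F(X_{(n)}) \geq 0 $, the approximate tail probability never exceeds $ e^{-1/(18n)} $, so the rule never rejects at levels $ \alpha $ above that value.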

The testing of $ H_0 $ against the alternative $ H_1^{-} $: $ \inf_{|x| < \infty} ( {\mathsf E} F_n(x) - F(x) ) < 0 $ is dealt with similarly. In this case the statistic of the Kolmogorov–Smirnov test is the random variable

$$ D_n^{-} = - \inf_{|x| < \infty} ( F_n(x) - F(x) ) = \max_{1 \leq m \leq n} \left( F(X_{(m)}) - \frac{m-1}{n} \right), $$

whose distribution is the same as that of the statistic $ D_n^{+} $ when $ H_0 $ is true.
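
One way to see the equality of distributions: under $ H_0 $ the values $ U_i = F(X_i) $ are uniformly distributed on $ [0, 1] $, and so are the reflected values $ 1 - U_i $; replacing each $ U_i $ by $ 1 - U_i $ turns $ D_n^{-} $ into $ D_n^{+} $. The sketch below checks this identity numerically on simulated uniform values. The function names and the seed are illustrative choices, not part of the original article.

```python
import random
from typing import List, Sequence


def d_plus_uniform(us: Sequence[float]) -> float:
    """D_n^+ for values u_i = F(x_i): max over m of (m/n - u_(m))."""
    s = sorted(us)
    n = len(s)
    return max(m / n - u for m, u in enumerate(s, start=1))


def d_minus_uniform(us: Sequence[float]) -> float:
    """D_n^- for values u_i = F(x_i): max over m of (u_(m) - (m-1)/n)."""
    s = sorted(us)
    n = len(s)
    return max(u - (m - 1) / n for m, u in enumerate(s, start=1))


rng = random.Random(12345)  # fixed seed so the run is reproducible
us: List[float] = [rng.random() for _ in range(25)]
reflected = [1.0 - u for u in us]

# D_n^- of the original values agrees with D_n^+ of the reflected values
# up to floating-point rounding.
print(d_minus_uniform(us), d_plus_uniform(reflected))
```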

References

[1] N.V. Smirnov, "Approximate distribution laws for random variables, constructed from empirical data", Uspekhi Mat. Nauk, 10 (1944) pp. 179–206 (In Russian)
[2] L.N. Bol'shev, "Asymptotically Pearson transformations", Theor. Probab. Appl., 8 (1963) pp. 121–146; Teor. Veroyatnost. i Primenen., 8 : 2 (1963) pp. 129–155
[3] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics", Libr. math. tables, 46, Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[4] B.L. van der Waerden, "Mathematische Statistik", Springer (1957)

Comments

There is also a two-sample Kolmogorov–Smirnov test, cf. the editorial comments to Kolmogorov test and, for details, [a1], [a2].

References

[a1] G.E. Noether, "A brief survey of nonparametric statistics", in R.V. Hogg (ed.), Studies in statistics, Math. Assoc. Amer. (1978) pp. 39–65
[a2] M. Hollander, D.A. Wolfe, "Nonparametric statistical methods", Wiley (1973)
How to Cite This Entry:
Kolmogorov-Smirnov test. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Kolmogorov-Smirnov_test&oldid=17054
This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.