Chi-squared test
A test of the hypothesis $ H _ {0} $
that a random vector of frequencies $ \nu = ( \nu _ {1} , \dots, \nu _ {k} ) $
has a given multinomial distribution, characterized by a vector of positive probabilities $ p = ( p _ {1} , \dots, p _ {k} ) $,
$ p _ {1} + \dots + p _ {k} = 1 $.
The "chi-squared" test is based on the Pearson statistic
$$ X ^ {2} = \sum _ {i = 1 } ^ { k } \frac{( \nu _ {i} - np _ {i} ) ^ {2} }{np _ {i} } = \frac{1}{n} \sum _ {i = 1 } ^ { k } \frac{\nu _ {i} ^ {2} }{p _ {i} } - n, \qquad n = \nu _ {1} + \dots + \nu _ {k} , $$
which has in the limit, as $ n \rightarrow \infty $, a "chi-squared" distribution with $ k - 1 $ degrees of freedom, that is,
$$ \lim\limits _ {n \rightarrow \infty } \ {\mathsf P} \{ X ^ {2} \leq x \mid H _ {0} \} = \ {\mathsf P} \{ \chi _ {k - 1 } ^ {2} \leq x \} . $$
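The second expression for the Pearson statistic follows by expanding the square and using $ \nu _ {1} + \dots + \nu _ {k} = n $ and $ p _ {1} + \dots + p _ {k} = 1 $. A short derivation, given here only as a check:

$$ \sum _ {i = 1 } ^ { k } \frac{( \nu _ {i} - np _ {i} ) ^ {2} }{np _ {i} } = \frac{1}{n} \sum _ {i = 1 } ^ { k } \frac{\nu _ {i} ^ {2} }{p _ {i} } - 2 \sum _ {i = 1 } ^ { k } \nu _ {i} + n \sum _ {i = 1 } ^ { k } p _ {i} = \frac{1}{n} \sum _ {i = 1 } ^ { k } \frac{\nu _ {i} ^ {2} }{p _ {i} } - 2n + n = \frac{1}{n} \sum _ {i = 1 } ^ { k } \frac{\nu _ {i} ^ {2} }{p _ {i} } - n . $$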
According to the "chi-squared" test at approximate significance level $ \alpha $, the hypothesis $ H _ {0} $ must be rejected if $ X ^ {2} \geq \chi _ {k - 1 } ^ {2} ( \alpha ) $, where $ \chi _ {k - 1 } ^ {2} ( \alpha ) $ is the upper $ \alpha $-quantile of the "chi-squared" distribution with $ k - 1 $ degrees of freedom, that is,
$$ {\mathsf P} \{ \chi _ {k - 1 } ^ {2} \geq \chi _ {k - 1 } ^ {2} ( \alpha ) \} = \alpha . $$
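The rejection rule above can be sketched in Python using only the standard library. The function names (`pearson_statistic`, `chi2_sf`, `reject_h0`) and the die-roll frequencies are illustrative assumptions, not part of the original article. Instead of looking up the quantile $ \chi _ {k - 1 } ^ {2} ( \alpha ) $, the sketch uses the equivalent condition $ {\mathsf P} \{ \chi _ {k - 1 } ^ {2} \geq X ^ {2} \} \leq \alpha $, with the tail probability computed from the power series of the incomplete gamma function.

```python
import math


def pearson_statistic(freqs, probs):
    """Pearson's X^2 = sum over i of (nu_i - n*p_i)^2 / (n*p_i)."""
    n = sum(freqs)
    return sum((v - n * p) ** 2 / (n * p) for v, p in zip(freqs, probs))


def chi2_sf(x, df):
    """Upper tail P{chi^2_df >= x}, via the series for the regularized
    lower incomplete gamma function P(a, z) with a = df/2, z = x/2."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-16 * abs(total) and n < 10000:
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def reject_h0(freqs, probs, alpha=0.05):
    """Reject H0 when the asymptotic p-value does not exceed alpha;
    this is equivalent to X^2 >= upper alpha-quantile of chi^2_{k-1}."""
    x2 = pearson_statistic(freqs, probs)
    df = len(freqs) - 1
    return chi2_sf(x2, df) <= alpha


# Hypothetical data: 120 rolls of a die, H0: all faces have probability 1/6.
freqs = [25, 17, 15, 23, 24, 16]
probs = [1.0 / 6.0] * 6
x2 = pearson_statistic(freqs, probs)        # equals 5.0 up to rounding
p_value = chi2_sf(x2, len(freqs) - 1)       # tail probability with 5 d.f.
print(x2, p_value, reject_h0(freqs, probs))
```

Because the limit distribution is only approximate for finite $ n $, the decision returned by `reject_h0` has significance level approximately, not exactly, $ \alpha $.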
The statistic $ X ^ {2} $ is also used to test the hypothesis $ H _ {0} $ that the distribution function of independent identically-distributed random variables $ X _ {1} , \dots, X _ {n} $ belongs to a family of continuous distribution functions $ F ( x, \theta ) $, $ x \in \mathbf R ^ {1} $, $ \theta = ( \theta _ {1} , \dots, \theta _ {m} ) \in \Theta \subset \mathbf R ^ {m} $, $ \Theta $ an open set. After dividing the real line by points $ x _ {0} < \dots < x _ {k} $, $ x _ {0} = - \infty $, $ x _ {k} = + \infty $, into $ k $ intervals $ ( x _ {0} , x _ {1} ] , \dots, ( x _ {k - 1 } , x _ {k} ) $, $ k > m $, such that for all $ \theta \in \Theta $,
$$ p _ {i} ( \theta ) = \ {\mathsf P} \{ X _ {1} \in ( x _ {i - 1 } , x _ {i} ] \} > 0, $$
$ i = 1, \dots, k $; $ p _ {1} ( \theta ) + \dots + p _ {k} ( \theta ) = 1 $, one forms the frequency vector $ \nu = ( \nu _ {1} , \dots, \nu _ {k} ) $ by grouping the values of the random variables $ X _ {1} , \dots, X _ {n} $ into these intervals. Let
$$ X ^ {2} ( \theta ) = \ \sum _ {i = 1 } ^ { k } \frac{[ \nu _ {i} - np _ {i} ( \theta )] ^ {2} }{np _ {i} ( \theta ) } $$
be a random variable depending on the unknown parameter $ \theta $. To test the hypothesis $ H _ {0} $ one uses the statistic $ X ^ {2} ( \widetilde \theta _ {n} ) $, where $ \widetilde \theta _ {n} $ is an estimator of the parameter $ \theta $ computed by the minimum "chi-squared" method, that is,
$$ X ^ {2} ( \widetilde \theta _ {n} ) = \ \min _ {\theta \in \Theta } \ X ^ {2} ( \theta ). $$
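As an illustration of the minimum "chi-squared" estimator, the following sketch minimizes $ X ^ {2} ( \theta ) $ numerically for a one-parameter family. Everything specific in it is an assumption made for the example: the exponential law $ F ( x, \theta ) = 1 - e ^ {- \theta x } $, the cut points, the frequencies, the function names, and the use of golden-section search, which presumes that $ X ^ {2} ( \theta ) $ has a single minimum on the chosen bracket.

```python
import math


def cell_probs(theta, cuts):
    """Cell probabilities p_i(theta) for the assumed exponential law
    F(x, theta) = 1 - exp(-theta * x), x >= 0, with finite cut points `cuts`."""
    cdf = [0.0] + [1.0 - math.exp(-theta * c) for c in cuts] + [1.0]
    return [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]


def x2_of_theta(theta, freqs, cuts):
    """X^2(theta) = sum over i of (nu_i - n*p_i(theta))^2 / (n*p_i(theta))."""
    n = sum(freqs)
    probs = cell_probs(theta, cuts)
    return sum((v - n * p) ** 2 / (n * p) for v, p in zip(freqs, probs))


def min_chi2(freqs, cuts, lo=1e-3, hi=10.0, tol=1e-8):
    """Golden-section search for the minimizer of X^2(theta) on [lo, hi].
    Assumes X^2 is unimodal on the bracket (not guaranteed in general)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = x2_of_theta(c, freqs, cuts), x2_of_theta(d, freqs, cuts)
    while b - a > tol:
        if fc < fd:
            # The minimum lies in [a, d]; reuse c as the new right probe.
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = x2_of_theta(c, freqs, cuts)
        else:
            # The minimum lies in [c, b]; reuse d as the new left probe.
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = x2_of_theta(d, freqs, cuts)
    theta = (a + b) / 2.0
    return theta, x2_of_theta(theta, freqs, cuts)


# Hypothetical grouped sample: n = 200 observations in k = 4 cells.
freqs = [82, 45, 44, 29]
cuts = [0.5, 1.0, 2.0]
theta_tilde, x2_min = min_chi2(freqs, cuts)
print(theta_tilde, x2_min)
```

For families with several parameters a general-purpose multivariate optimizer would take the place of the one-dimensional search used here.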
If the grouping intervals are chosen so that all $ p _ {i} ( \theta ) > 0 $, if the functions $ \partial ^ {2} p _ {i} ( \theta )/ \partial \theta _ {j} \partial \theta _ {r} $ are continuous for all $ \theta \in \Theta $, $ i = 1, \dots, k $; $ j, r = 1, \dots, m $, and if the matrix $ \| \partial p _ {i} ( \theta )/ \partial \theta _ {j} \| $ has rank $ m $, then, when the hypothesis $ H _ {0} $ is true, the statistic $ X ^ {2} ( \widetilde \theta _ {n} ) $ has in the limit, as $ n \rightarrow \infty $, a "chi-squared" distribution with $ k - m - 1 $ degrees of freedom, which can be used to test $ H _ {0} $ by the "chi-squared" test.
If instead one substitutes in $ X ^ {2} ( \theta ) $ a maximum-likelihood estimator $ \widehat \theta _ {n} $ computed from the non-grouped data $ X _ {1} , \dots, X _ {n} $, then, when $ H _ {0} $ is true, the statistic $ X ^ {2} ( \widehat \theta _ {n} ) $ is distributed in the limit, as $ n \rightarrow \infty $, like
$$ \xi _ {1} ^ {2} + \dots + \xi _ {k - m - 1 } ^ {2} + \mu _ {1} \xi _ {k - m } ^ {2} + \dots + \mu _ {m} \xi _ {k - 1 } ^ {2} , $$
where $ \xi _ {1} , \dots, \xi _ {k - 1 } $ are independent standard normal random variables, and the numbers $ \mu _ {1} , \dots, \mu _ {m} $ lie between 0 and 1 and, in general, depend on the unknown parameter $ \theta $. Consequently, using maximum-likelihood estimators in the "chi-squared" test of the hypothesis $ H _ {0} $ leads to difficulties connected with computing a non-standard limit distribution.
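A small Monte Carlo experiment can make the non-standard limit visible. The sketch below assumes a normal family ($ m = 2 $: mean and standard deviation), fixed cut points $ -1, 0, 1 $ ($ k = 4 $), and hypothetical function names; none of these choices come from the article. Since each $ \xi _ {j} ^ {2} $ has mean 1 and each $ \mu _ {j} $ lies between 0 and 1, the simulated mean of $ X ^ {2} ( \widehat \theta _ {n} ) $ should fall between $ k - m - 1 = 1 $ and $ k - 1 = 3 $, rather than at the value $ k - m - 1 $ that the "chi-squared" limit would give.

```python
import math
import random


def normal_cdf(x, mu, sigma):
    """Distribution function of the normal law N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def x2_with_mle(sample, cuts):
    """X^2(theta_hat), with theta_hat = (mean, std) the maximum-likelihood
    estimator computed from the NON-grouped sample."""
    n = len(sample)
    mu = sum(sample) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / n)
    cdf = [0.0] + [normal_cdf(c, mu, sigma) for c in cuts] + [1.0]
    probs = [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]
    freqs = [0] * len(probs)
    for x in sample:
        # Index of the right-closed interval (x_{i-1}, x_i] containing x.
        freqs[sum(x > c for c in cuts)] += 1
    return sum((v - n * p) ** 2 / (n * p) for v, p in zip(freqs, probs))


def simulate_mean(reps=1000, n=200, cuts=(-1.0, 0.0, 1.0), seed=0):
    """Average of X^2(theta_hat) over `reps` samples drawn under H0 (N(0, 1))."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += x2_with_mle(sample, cuts)
    return total / reps


print(simulate_mean())
```

The simulation only estimates a mean at one parameter value; it does not recover the individual numbers $ \mu _ {1} , \dots, \mu _ {m} $.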
Recommendations concerning the $ \chi ^ {2} $-test in this case are given in [3]–[8]: in particular, for the normal case [3], the general continuous case [4], [8], the discrete case [6], [8], and the problem of several samples [7].
References
[1] M.G. Kendall, A. Stuart, "The advanced theory of statistics", 2. Inference and relationship, Griffin (1983)
[2] D.M. Chibisov, "Certain chi-square type tests for continuous distributions" Theory Probab. Appl., 16 : 1 (1971) pp. 1–22; Teor. Veroyatnost. i Primenen., 16 : 1 (1971) pp. 3–20
[3] M.S. Nikulin, "Chi-square test for continuous distributions with shift and scale parameters" Theory Probab. Appl., 18 : 3 (1973) pp. 559–568; Teor. Veroyatnost. i Primenen., 18 : 3 (1973) pp. 583–592
[4] K.O. Dzhaparidze, M.S. Nikulin, "On a modification of the standard statistics of Pearson" Theory Probab. Appl., 19 : 4 (1974) pp. 851–853; Teor. Veroyatnost. i Primenen., 19 : 4 (1974) pp. 886–888
[5] M.S. Nikulin, "On a quantile test" Theory Probab. Appl., 19 : 2 (1974) pp. 410–413; Teor. Veroyatnost. i Primenen., 19 : 2 (1974) pp. 410–414
[6] L.N. Bol'shev, M. Mirvaliev, "Chi-square goodness-of-fit test for the Poisson, binomial and negative binomial distributions" Theory Probab. Appl., 23 : 3 (1978) pp. 461–474; Teor. Veroyatnost. i Primenen., 23 : 3 (1978) pp. 481–494
[7] L.N. Bol'shev, M.S. Nikulin, "A certain solution of the homogeneity problem" Serdica, 1 (1975) pp. 104–109 (In Russian)
[8] P.E. Greenwood, M.S. Nikulin, "Investigations in the theory of probabilities distributions. X" Zap. Nauchn. Sem. Leningr. Otdel. Mat. Inst. Steklov., 156 (1987) pp. 42–65 (In Russian)
Comments
The "chi-squared" test is also called the "chi-square" test or $ \chi ^ {2} $-test.
Chi-squared test. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Chi-squared_test&oldid=28552