Multinomial distribution

polynomial distribution
2020 Mathematics Subject Classification: Primary: 60E99
The joint distribution of random variables $X_1, \dots, X_k$ that is defined for any set of non-negative integers $n_1, \dots, n_k$ satisfying the condition $n_1 + \dots + n_k = n$, $n_j = 0, \dots, n$, $j = 1, \dots, k$, by the formula
$$ \tag{*} {\mathsf P} \{ X_1 = n_1, \dots, X_k = n_k \} = \frac{n!}{n_1! \cdots n_k!} \, p_1^{n_1} \cdots p_k^{n_k}, $$
where $n, p_1, \dots, p_k$ ($p_j \geq 0$, $\sum p_j = 1$) are the parameters of the distribution. A multinomial distribution is a multivariate discrete distribution, namely the distribution of the random vector $(X_1, \dots, X_k)$ with $X_1 + \dots + X_k = n$ (this distribution is in essence $(k-1)$-dimensional, since it is degenerate in the $k$-dimensional Euclidean space). A multinomial distribution is a natural generalization of a binomial distribution and coincides with the latter for $k = 2$. The distribution owes its name to the fact that the probability (*) is the general term in the expansion of the multinomial $(p_1 + \dots + p_k)^n$.

The multinomial distribution appears in the following probability scheme. Each of the random variables $X_j$ is the number of occurrences of the corresponding one of the mutually exclusive events $A_j$, $j = 1, \dots, k$, in repeated independent trials. If in each trial the probability of the event $A_j$ is $p_j$, $j = 1, \dots, k$, then the probability (*) is equal to the probability that in $n$ trials the events $A_1, \dots, A_k$ occur $n_1, \dots, n_k$ times, respectively. Each of the random variables $X_j$ has a binomial distribution with mathematical expectation $np_j$ and variance $np_j(1 - p_j)$.
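Formula (*) and the binomial marginals can be checked numerically. The sketch below is illustrative only (Python 3.8+, standard library); the helper name multinomial_pmf and the values n = 5, p = (0.2, 0.3, 0.5) are assumptions made for the example and are not part of the article.

```python
# Minimal sketch of formula (*); names and parameter values are illustrative.
from math import factorial, prod
from itertools import product

def multinomial_pmf(counts, probs):
    """P{X_1 = n_1, ..., X_k = n_k} computed directly from formula (*)."""
    n = sum(counts)
    coeff = factorial(n) // prod(factorial(m) for m in counts)
    return coeff * prod(p ** m for p, m in zip(probs, counts))

n, probs = 5, (0.2, 0.3, 0.5)

# The probabilities over all (n_1, ..., n_k) with n_1 + ... + n_k = n are the
# terms of the expansion of (p_1 + ... + p_k)^n, so they sum to 1.
total = sum(multinomial_pmf(c, probs)
            for c in product(range(n + 1), repeat=3) if sum(c) == n)
print(round(total, 12))  # 1.0

# Marginal check: X_1 is binomial with mean n*p_1 and variance n*p_1*(1 - p_1).
marginal = [sum(multinomial_pmf((m, j, n - m - j), probs) for j in range(n - m + 1))
            for m in range(n + 1)]
mean = sum(m * q for m, q in enumerate(marginal))
var = sum((m - mean) ** 2 * q for m, q in enumerate(marginal))
print(round(mean, 12), round(var, 12))  # n*p_1 = 1.0, n*p_1*(1 - p_1) = 0.8
```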
The random vector $(X_1, \dots, X_k)$ has mathematical expectation $(np_1, \dots, np_k)$ and covariance matrix $B = \| b_{ij} \|$, where
$$ b_{ij} = \left\{ \begin{array}{ll} np_i(1 - p_i), & i = j, \\ -np_i p_j, & i \neq j, \end{array} \right. \qquad i, j = 1, \dots, k, $$
(the rank of the matrix $B$ is $k - 1$ because $X_1 + \dots + X_k = n$). The characteristic function of a multinomial distribution is
$$ f(t_1, \dots, t_k) = \left( p_1 e^{it_1} + \dots + p_k e^{it_k} \right)^n. $$
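The covariance matrix $B$ and the characteristic function can likewise be verified numerically. The sketch below assumes NumPy is available; the parameter values n = 4, p = (0.2, 0.3, 0.5) and t = (0.7, -1.1, 0.4) are arbitrary illustrations.

```python
# Sketch of the covariance matrix B and the characteristic function (NumPy assumed).
from math import factorial, prod
from itertools import product

import numpy as np

n, p = 4, np.array([0.2, 0.3, 0.5])
k = len(p)

# b_ii = n p_i (1 - p_i) on the diagonal, b_ij = -n p_i p_j off the diagonal.
B = np.diag(n * p * (1 - p)) - n * np.outer(p, p) * (1 - np.eye(k))
print(np.linalg.matrix_rank(B))  # k - 1 = 2, since X_1 + ... + X_k = n is fixed

# f(t_1, ..., t_k) = (p_1 e^{i t_1} + ... + p_k e^{i t_k})^n, compared with
# E[exp(i t . X)] obtained by direct enumeration of the probabilities (*).
t = np.array([0.7, -1.1, 0.4])
f_closed = (p @ np.exp(1j * t)) ** n
f_direct = sum(factorial(n) / prod(map(factorial, c))
               * prod(pi ** ci for pi, ci in zip(p, c))
               * np.exp(1j * (t @ np.array(c)))
               for c in product(range(n + 1), repeat=k) if sum(c) == n)
print(np.isclose(f_closed, f_direct))  # True
```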
For $n \rightarrow \infty$, the distribution of the vector $(Y_1, \dots, Y_k)$ with normalized components
$$ Y_i = \frac{X_i - np_i}{\sqrt{np_i(1 - p_i)}} $$
tends to a certain multivariate normal distribution, while the distribution of the sum
$$ \sum_{i=1}^{k} (1 - p_i) Y_i^2 $$
(which is used in mathematical statistics to construct the "chi-squared" test) tends to the "chi-squared" distribution with $k - 1$ degrees of freedom.
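Note that $\sum_{i=1}^{k} (1 - p_i) Y_i^2 = \sum_{i=1}^{k} (X_i - np_i)^2 / (np_i)$, i.e. the sum is exactly Pearson's statistic. A Monte Carlo sketch of the limit, assuming NumPy and SciPy are available (the values of $n$, $p$ and the sample size are arbitrary illustrations):

```python
# Monte Carlo sketch of the chi-squared limit (NumPy and SciPy assumed).
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
n, p = 500, np.array([0.2, 0.3, 0.5])
k = len(p)

X = rng.multinomial(n, p, size=20_000)       # rows are samples (X_1, ..., X_k)
Y = (X - n * p) / np.sqrt(n * p * (1 - p))   # normalized components Y_i
S = ((1 - p) * Y ** 2).sum(axis=1)           # sum_i (1 - p_i) Y_i^2

# For large n the statistic is approximately "chi-squared" with k - 1 = 2
# degrees of freedom: compare empirical and theoretical 95% quantiles.
print(np.quantile(S, 0.95), chi2.ppf(0.95, df=k - 1))
```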
References
[C] H. Cramér, "Mathematical methods of statistics", Princeton Univ. Press (1946). MR0016588 Zbl 0063.01014
Comments
References
[JK] N.L. Johnson, S. Kotz, "Discrete distributions", Wiley (1969). MR0268996 Zbl 0292.62009
Multinomial distribution. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Multinomial_distribution&oldid=26633