Multinomial distribution
polynomial distribution
2020 Mathematics Subject Classification: Primary: 60E99
The joint distribution of random variables X_1, \dots, X_k that is defined for any set of non-negative integers n_1, \dots, n_k satisfying the condition n_1 + \dots + n_k = n, n_j = 0, \dots, n, j = 1, \dots, k, by the formula
\mathsf{P} \{ X_1 = n_1, \dots, X_k = n_k \} = \frac{n!}{n_1! \cdots n_k!} p_1^{n_1} \cdots p_k^{n_k}, \tag{*}
where n, p_1, \dots, p_k (p_j \geq 0, \sum p_j = 1) are the parameters of the distribution. A multinomial distribution is a multivariate discrete distribution, namely the distribution of the random vector (X_1, \dots, X_k) with X_1 + \dots + X_k = n (this distribution is in essence (k-1)-dimensional, since it is degenerate in the Euclidean space of k dimensions). A multinomial distribution is a natural generalization of a binomial distribution and coincides with the latter for k = 2. The distribution takes its name from the fact that the probability (*) is the general term in the expansion of the multinomial (p_1 + \dots + p_k)^n.
The multinomial distribution arises in the following probability scheme. Let A_1, \dots, A_k be mutually exclusive events, and let each random variable X_j be the number of occurrences of the event A_j, j = 1, \dots, k, in n repeated independent trials. If in each trial the probability of the event A_j is p_j, j = 1, \dots, k, then the probability (*) is equal to the probability that in n trials the events A_1, \dots, A_k will appear n_1, \dots, n_k times, respectively. Each of the random variables X_j has a binomial distribution with mathematical expectation np_j and variance np_j(1-p_j).
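As an illustration of formula (*), the following minimal sketch evaluates the probability directly and checks two statements made above: that the probabilities sum to 1, and that each component has a binomial marginal distribution. The helper names `multinomial_pmf` and `compositions`, and the parameter values n = 4, p = (0.5, 0.3, 0.2), are assumptions chosen for the example and do not come from the article.

```python
from math import comb, factorial, isclose


def multinomial_pmf(counts, probs):
    """Probability (*) of the counts (n_1, ..., n_k) in n = sum(counts) trials."""
    n = sum(counts)
    coeff = factorial(n)
    for c in counts:
        coeff //= factorial(c)  # every partial quotient is an integer
    prob = float(coeff)
    for p, c in zip(probs, counts):
        prob *= p ** c
    return prob


def compositions(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


# Hypothetical parameters for the illustration.
n, probs = 4, (0.5, 0.3, 0.2)
k = len(probs)

# The probabilities (*) over all admissible (n_1, ..., n_k) add up to 1.
total = sum(multinomial_pmf(c, probs) for c in compositions(n, k))

# Marginal distribution of X_1, obtained by summing (*) over the other components.
marginal = [
    sum(multinomial_pmf(c, probs) for c in compositions(n, k) if c[0] == m)
    for m in range(n + 1)
]

# Binomial distribution with parameters n and p_1, for comparison.
binomial = [
    comb(n, m) * probs[0] ** m * (1 - probs[0]) ** (n - m) for m in range(n + 1)
]

print(total)
print(marginal)
print(binomial)
```

The enumeration grows quickly with n and k, so this approach is only meant for checking small cases by hand.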
The random vector (X_1, \dots, X_k) has mathematical expectation (np_1, \dots, np_k) and covariance matrix B = \| b_{ij} \|, where
b_{ij} = \left\{ \begin{array}{ll} np_i(1-p_i), & i = j, \\ -np_i p_j, & i \neq j, \end{array} \right. \quad i, j = 1, \dots, k
(when all p_i > 0, the rank of the matrix B is k-1, because X_1 + \dots + X_k = n). The characteristic function of a multinomial distribution is
f(t_1, \dots, t_k) = \left( p_1 e^{it_1} + \dots + p_k e^{it_k} \right)^n .
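The formulas for the expectation, the covariance matrix B and the characteristic function can be checked against a direct enumeration of the distribution. The sketch below does this in exact rational arithmetic for the moments; the parameter values n = 3, p = (1/2, 1/3, 1/6), the test point t, and all function names are assumptions made for the example.

```python
import cmath
from fractions import Fraction
from math import factorial


def compositions(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def pmf(counts, probs):
    """Probability (*) computed exactly with Fraction parameters."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)
    value = Fraction(coeff)
    for p, c in zip(probs, counts):
        value *= p ** c
    return value


# Hypothetical parameters for the illustration.
n = 3
probs = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))
k = len(probs)
outcomes = [(c, pmf(c, probs)) for c in compositions(n, k)]

# Expectation and covariance matrix computed from the definition.
mean = [sum(w * c[i] for c, w in outcomes) for i in range(k)]
cov = [
    [sum(w * (c[i] - mean[i]) * (c[j] - mean[j]) for c, w in outcomes)
     for j in range(k)]
    for i in range(k)
]

# The closed-form matrix B = ||b_ij|| from the text.
formula = [
    [n * probs[i] * (1 - probs[i]) if i == j else -n * probs[i] * probs[j]
     for j in range(k)]
    for i in range(k)
]

# Each row of B sums to zero, which reflects X_1 + ... + X_k = n.
row_sums = [sum(row) for row in formula]


def char_fn_closed(t):
    """Closed form (p_1 e^{i t_1} + ... + p_k e^{i t_k})^n."""
    return sum(float(p) * cmath.exp(1j * tj) for p, tj in zip(probs, t)) ** n


def char_fn_direct(t):
    """E exp(i (t_1 X_1 + ... + t_k X_k)) summed over all outcomes."""
    return sum(
        float(w) * cmath.exp(1j * sum(tj * cj for tj, cj in zip(t, c)))
        for c, w in outcomes
    )


t = (0.3, -1.1, 2.0)
print(mean)
print(cov == formula, row_sums)
print(abs(char_fn_closed(t) - char_fn_direct(t)))
```

The zero row sums show only that B is singular; the claim that the rank is exactly k-1 when all p_i > 0 is not verified by this sketch.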
For n \rightarrow \infty, the distribution of the vector (Y_1, \dots, Y_k) with normalized components
Y_i = \frac{X_i - np_i}{\sqrt{np_i(1-p_i)}}
tends to a certain multivariate normal distribution, while the distribution of the sum
\sum_{i=1}^{k} (1-p_i) Y_i^2
(which is used in mathematical statistics to construct the "chi-squared" test) tends to the "chi-squared" distribution with k-1 degrees of freedom.
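Substituting the definition of Y_i shows that (1-p_i) Y_i^2 = (X_i - np_i)^2 / (np_i), so the sum above coincides with the statistic \sum (X_i - np_i)^2 / (np_i) commonly used in the "chi-squared" test. The following minimal sketch computes both forms and confirms numerically that they agree; the observed counts and the probabilities are hypothetical values chosen for the example, and the function names are not taken from the article.

```python
from math import isclose, sqrt


def normalized_components(counts, probs):
    """Y_i = (X_i - n p_i) / sqrt(n p_i (1 - p_i)); requires 0 < p_i < 1."""
    n = sum(counts)
    return [(x - n * p) / sqrt(n * p * (1 - p)) for x, p in zip(counts, probs)]


def weighted_sum_of_squares(counts, probs):
    """The sum of (1 - p_i) Y_i^2 as written in the text."""
    ys = normalized_components(counts, probs)
    return sum((1 - p) * y * y for p, y in zip(probs, ys))


def pearson_statistic(counts, probs):
    """The equivalent form: sum of (X_i - n p_i)^2 / (n p_i)."""
    n = sum(counts)
    return sum((x - n * p) ** 2 / (n * p) for x, p in zip(counts, probs))


# Hypothetical observed counts for n = 100 trials and k = 3 events.
counts = (18, 55, 27)
probs = (0.25, 0.5, 0.25)

stat_weighted = weighted_sum_of_squares(counts, probs)
stat_pearson = pearson_statistic(counts, probs)
print(stat_weighted, stat_pearson)
```

For large n the resulting value would be compared with quantiles of the "chi-squared" distribution with k-1 = 2 degrees of freedom; that comparison is left out here because the standard library has no quantile function for it.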
References
[C] H. Cramér, "Mathematical methods of statistics", Princeton Univ. Press (1946) MR0016588 Zbl 0063.01014
[JK] N.L. Johnson, S. Kotz, "Discrete distributions", Wiley (1969) MR0268996 Zbl 0292.62009
Multinomial distribution. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Multinomial_distribution&oldid=54889