Difference between revisions of "Binomial distribution"
Latest revision as of 10:59, 29 May 2020
Bernoulli distribution
2020 Mathematics Subject Classification: Primary: 60E99
The probability distribution of a random variable $ X $ which assumes integer values $ x = 0, 1, \dots, n $ with the probabilities
$$ {\mathsf P} \{ X=x \} = b _ {x} (n, p) = \ \left ( \begin{array}{c} n \\ x \end{array} \right ) p ^ {x} (1-p) ^ {n-x} , $$
where $ \binom{n}{x} $ is the binomial coefficient, and $ p $ is a parameter of the binomial distribution, called the probability of a positive outcome, which can take values in the interval $ 0 \leq p \leq 1 $.
The binomial distribution is one of the fundamental probability distributions connected with a sequence of independent trials. Let $ Y _ {1} , Y _ {2} , \dots $ be a sequence of independent random variables, each of which assumes only the values 1 and 0, with respective probabilities $ p $ and $ 1 - p $ (i.e. every $ Y _ {i} $ is binomially distributed with $ n = 1 $). The values of $ Y _ {i} $ may be treated as the results of independent trials, with $ Y _ {i} = 1 $ if the result of the $ i $-th trial is "positive" and $ Y _ {i} = 0 $ if it is "negative". If the total number of independent trials $ n $ is fixed, such a scheme is known as Bernoulli trials, and the total number of positive results,
$$ X=Y _ {1} + \dots + Y _ {n} ,\ \ n \geq 1 , $$
is then binomially distributed with parameters $ n $ and $ p $.
The mathematical expectation $ {\mathsf E} z ^ {X} $ (the generating function of the binomial distribution) is, for any value of $ z $, the polynomial $ [pz + (1 - p)] ^ {n} $, whose expansion by Newton's binomial formula has the form
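The link between the formula for $ b _ {x} (n, p) $ and the sum of independent 0/1 trials can be checked numerically. The following is a minimal Python sketch, not part of the original article; the function names `binom_pmf` and `sum_of_bernoulli_pmf` and the values `n = 8`, `p = 0.3` are illustrative assumptions.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """b_x(n, p) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def sum_of_bernoulli_pmf(n: int, p: float) -> list[float]:
    """Distribution of Y_1 + ... + Y_n, built by adding one trial at a time."""
    dist = [1.0]  # with zero trials the sum equals 0 with probability 1
    for _ in range(n):
        nxt = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            nxt[k] += prob * (1 - p)  # "negative" trial: the sum stays at k
            nxt[k + 1] += prob * p    # "positive" trial: the sum moves to k + 1
        dist = nxt
    return dist


# Illustrative values (assumptions, not taken from the article).
n, p = 8, 0.3
by_formula = [binom_pmf(x, n, p) for x in range(n + 1)]
by_trials = sum_of_bernoulli_pmf(n, p)

for x in range(n + 1):
    print(x, round(by_formula[x], 6), round(by_trials[x], 6))
```

The two columns should agree up to floating-point rounding, since adding one Bernoulli trial at a time reproduces the binomial probabilities.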
$$ b _ {0} + b _ {1} z + \dots + b _ {n} z ^ {n} . $$
(Hence the name "binomial distribution".) The moments (cf. Moment) of a binomial distribution are given by the formulas
$$ {\mathsf E} X = np, $$
$$ {\mathsf D} X = {\mathsf E} (X-np) ^ {2} = np (1-p), $$
$$ {\mathsf E}(X-np) ^ {3} = np (1-p) (1 - 2p). $$
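The generating function and the three moment formulas can be verified exactly with rational arithmetic. This is a small illustrative Python sketch; the helper `binom_pmf_exact` and the values `n = 12`, `p = 1/3`, `z = 5/7` are assumptions chosen only for the check.

```python
from fractions import Fraction
from math import comb


def binom_pmf_exact(n, p):
    """List of b_x(n, p) for x = 0..n, exact when p is a Fraction."""
    q = 1 - p
    return [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]


# Illustrative values (assumptions, not taken from the article).
n, p = 12, Fraction(1, 3)
b = binom_pmf_exact(n, p)

# Moments computed directly from the probabilities.
mean = sum(x * bx for x, bx in enumerate(b))
variance = sum((x - mean) ** 2 * bx for x, bx in enumerate(b))
third_central = sum((x - mean) ** 3 * bx for x, bx in enumerate(b))

# Generating function: the sum b_0 + b_1 z + ... + b_n z^n
# against the closed form [pz + (1 - p)]^n.
z = Fraction(5, 7)
gen_by_sum = sum(bx * z**x for x, bx in enumerate(b))
gen_closed = (p * z + (1 - p)) ** n

print(mean == n * p)
print(variance == n * p * (1 - p))
print(third_central == n * p * (1 - p) * (1 - 2 * p))
print(gen_by_sum == gen_closed)
```

Because `Fraction` arithmetic is exact, the comparisons test the identities themselves rather than floating-point approximations of them.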
The binomial distribution function is defined, for any real $ y $, $ 0 < y < n $, by the formula
$$ F (y) = \ {\mathsf P} \{ X \leq y \} = \ \sum _ {x = 0 } ^ { [y] } \left ( \begin{array}{c} n \\ x \end{array} \right ) p ^ {x} (1 - p) ^ {n - x } , $$
where $ [y] $ is the integer part of $ y $, and
$$ F (y) \equiv \ \frac{1}{B([y] + 1, n - [y]) } \int\limits _ { p } ^ { 1 } t ^ {[y]} (1 - t) ^ {n - [y] - 1 } dt, $$
where $ B(a, b) $ is Euler's beta-function; the integral on the right-hand side is known as the incomplete beta-function.
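The agreement between the finite sum and the beta-integral representation of $ F (y) $ can be illustrated numerically. The sketch below is a hedged example, not the article's method: it evaluates the integral with a composite Simpson rule, and the names `binom_cdf`, `beta_fn`, `binom_cdf_beta` and the step count are assumptions.

```python
from math import comb, floor, gamma


def binom_cdf(y, n, p):
    """F(y) as the finite sum of b_x(n, p) for x = 0..[y]."""
    k = floor(y)
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))


def beta_fn(a, b):
    """Euler's beta-function B(a, b) via the gamma function."""
    return gamma(a) * gamma(b) / gamma(a + b)


def binom_cdf_beta(y, n, p, steps=2000):
    """F(y) via the integral of t^[y] (1 - t)^(n - [y] - 1) over [p, 1].

    The integral is approximated with the composite Simpson rule;
    `steps` must be even.
    """
    k = floor(y)

    def integrand(t):
        return t**k * (1 - t) ** (n - k - 1)

    h = (1 - p) / steps
    total = integrand(p) + integrand(1.0)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * integrand(p + i * h)
    return (h / 3) * total / beta_fn(k + 1, n - k)


# Illustrative values (assumptions): 0 < y < n as required by the definition.
n, p, y = 10, 0.3, 3.7
print(binom_cdf(y, n, p), binom_cdf_beta(y, n, p))
```

The two printed values should coincide up to the quadrature error, which is small here because the integrand is a polynomial of low degree.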
As $ n \rightarrow \infty $, the binomial distribution function is expressed in terms of the standard normal distribution function $ \Phi $ by the asymptotic formula (the de Moivre–Laplace theorem):
$$ F (y) = \Phi \left [ \frac{y - np + 0.5 }{\sqrt {np (1 - p) } } \right ] + R _ {n} (y, p), $$
where
$$ R _ {n} (y, p) = O (n ^ {-1/2 } ) $$
uniformly for all real $ y $. There also exist other, higher-order normal approximations of the binomial distribution.
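The size of the remainder $ R _ {n} (y, p) $ can be observed numerically. The following Python sketch is an illustration under stated assumptions: it evaluates the approximation only at integer $ y $, expresses $ \Phi $ through `math.erf`, and uses the hypothetical helper names `std_normal_cdf` and `max_normal_error` with $ p = 0.3 $.

```python
from math import comb, erf, sqrt


def std_normal_cdf(u):
    """Standard normal distribution function Phi(u)."""
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))


def max_normal_error(n, p):
    """Largest |F(k) - Phi((k - np + 0.5) / sqrt(np(1 - p)))| over k = 0..n."""
    sigma = sqrt(n * p * (1 - p))
    running = 0.0  # running sum of b_x(n, p), i.e. F(k)
    worst = 0.0
    for k in range(n + 1):
        running += comb(n, k) * p**k * (1 - p) ** (n - k)
        approx = std_normal_cdf((k - n * p + 0.5) / sigma)
        worst = max(worst, abs(running - approx))
    return worst


# Illustrative values (assumptions). Each fourfold increase of n should
# roughly halve the error if the remainder behaves like n^(-1/2).
p = 0.3
for n in (25, 100, 400):
    print(n, max_normal_error(n, p))
```

The values of `n` are kept moderate so that the binomial coefficients still convert to floating point without overflow.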
If the number of independent trials $ n $ is large, while the probability $ p $ is small, the individual probabilities $ b _ {x} (n, p) $ can be approximately expressed in terms of the Poisson distribution:
$$ b _ {x} (n, p) = \ \left ( \begin{array}{c} n \\ x \end{array} \right ) p ^ {x} (1 - p) ^ {n - x } \approx \ \frac{(np) ^ {x} }{x!} e ^ {-np } . $$
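A quick numerical comparison shows how close the two sides are when $ n $ is large and $ p $ is small. This Python sketch is illustrative only; the values `n = 1000`, `p = 0.003` and the function names are assumptions.

```python
from math import comb, exp, factorial


def binom_pmf(x, n, p):
    """b_x(n, p) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Poisson probability lam^x / x! * e^(-lam)."""
    return lam**x / factorial(x) * exp(-lam)


# Illustrative values (assumptions): large n, small p, so np = 3.
n, p = 1000, 0.003
rows = [(x, binom_pmf(x, n, p), poisson_pmf(x, n * p)) for x in range(11)]

for x, b, q in rows:
    print(x, round(b, 6), round(q, 6))

max_gap = max(abs(b - q) for _, b, q in rows)
print("largest pointwise difference:", max_gap)
```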
If $ n \rightarrow \infty $ and $ 0 < c \leq y \leq C $ (where $ c $ and $ C $ are constants), the asymptotic formula
$$ F (y) = \ \sum _ {x = 0 } ^ { [y] } \frac{\lambda ^ {x} }{x!} e ^ {- \lambda } + O (n ^ {-2} ), $$
where $ \lambda = (2n - [y])p / (2 - p) $, is uniformly valid with respect to all $ p $ in the interval $ 0 < p < 1 $.
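The effect of the modified parameter $ \lambda = (2n - [y])p / (2 - p) $ can be compared with the plain choice $ \lambda = np $. The Python sketch below is a hedged illustration; the values `n = 100`, `p = 0.05`, `y = 3` and the helper names are assumptions.

```python
from math import comb, exp, factorial, floor


def binom_cdf(y, n, p):
    """Exact F(y): sum of b_x(n, p) for x = 0..[y]."""
    k = floor(y)
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))


def poisson_cdf(y, lam):
    """Sum of lam^x / x! * e^(-lam) for x = 0..[y]."""
    k = floor(y)
    return exp(-lam) * sum(lam**x / factorial(x) for x in range(k + 1))


def refined_lambda(y, n, p):
    """Parameter lambda = (2n - [y]) p / (2 - p) from the asymptotic formula."""
    return (2 * n - floor(y)) * p / (2 - p)


# Illustrative values (assumptions, not taken from the article).
n, p, y = 100, 0.05, 3
exact = binom_cdf(y, n, p)
plain = poisson_cdf(y, n * p)
refined = poisson_cdf(y, refined_lambda(y, n, p))

print("exact F(y):          ", exact)
print("Poisson, lambda = np:", plain, "error", abs(plain - exact))
print("Poisson, refined:    ", refined, "error", abs(refined - exact))
```

For these values the refined parameter should give a noticeably smaller error than $ \lambda = np $, in line with the $ O (n ^ {-2} ) $ remainder stated above.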
The multinomial distribution is the multi-dimensional generalization of the binomial distribution.
Binomial distribution. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Binomial_distribution&oldid=25816