Difference between revisions of "Binomial distribution"

Latest revision as of 10:59, 29 May 2020

Bernoulli distribution

2020 Mathematics Subject Classification: Primary: 60E99 [MSN][ZBL]

The probability distribution of a random variable $ X $ which assumes the integer values $ x = 0, \dots, n $ with the probabilities

$$ {\mathsf P} \{ X=x \} = b _ {x} (n, p) = \ \left ( \begin{array}{c} n \\ x \end{array} \right ) p ^ {x} (1-p) ^ {n-x} , $$

where $ \binom{n}{x} $ is the binomial coefficient, and $ p $ is a parameter of the binomial distribution, called the probability of a positive outcome, which can take values in the interval $ 0 \leq p \leq 1 $. The binomial distribution is one of the fundamental probability distributions connected with a sequence of independent trials. Let $ Y _ {1} , Y _ {2} , \dots $ be a sequence of independent random variables, each of which assumes only the values 1 and 0, with respective probabilities $ p $ and $ 1 - p $ (i.e. all $ Y _ {i} $ are binomially distributed with $ n = 1 $). The values of $ Y _ {i} $ may be treated as the results of independent trials, with $ Y _ {i} = 1 $ if the result of the $ i $-th trial is "positive" and $ Y _ {i} = 0 $ if it is "negative". If the total number of independent trials $ n $ is fixed, such a scheme is known as Bernoulli trials, and the total number of positive results,

$$ X=Y _ {1} + \dots + Y _ {n} ,\ \ n \geq 1 , $$

is then binomially distributed with parameters $ n $ and $ p $.
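The construction of $ X $ as a sum of independent 0/1 trials can be checked numerically. The following is a minimal sketch, not part of the original article: the function names `binom_pmf`, `bernoulli_sum` and `empirical_pmf` are illustrative, and the simulation uses Python's standard pseudo-random generator with a fixed seed as an assumption.

```python
import math
import random


def binom_pmf(x, n, p):
    """Return b_x(n, p) = C(n, x) * p**x * (1 - p)**(n - x)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def bernoulli_sum(n, p, rng):
    """Return X = Y_1 + ... + Y_n for independent 0/1 trials with P(Y_i = 1) = p."""
    return sum(1 for _ in range(n) if rng.random() < p)


def empirical_pmf(n, p, trials, seed=0):
    """Estimate P{X = x}, x = 0..n, from `trials` simulated values of X."""
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(trials):
        counts[bernoulli_sum(n, p, rng)] += 1
    return [c / trials for c in counts]


if __name__ == "__main__":
    n, p = 5, 0.3
    exact = [binom_pmf(x, n, p) for x in range(n + 1)]
    approx = empirical_pmf(n, p, trials=20000)
    for x in range(n + 1):
        # Columns: x, exact probability, simulated relative frequency.
        print(x, round(exact[x], 4), round(approx[x], 4))
```

The simulated relative frequencies should agree with $ b _ {x} (n, p) $ up to sampling error, which shrinks roughly like the inverse square root of the number of simulated values.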

The mathematical expectation $ {\mathsf E} z ^ {X} $ (the generating function of the binomial distribution) is, for any value of $ z $, the polynomial $ [pz + (1 - p)] ^ {n} $, whose expansion by Newton's binomial formula has the form

$$ b _ {0} + b _ {1} z + \dots + b _ {n} z ^ {n} . $$

(Hence the very name "binomial distribution".) The moments (cf. Moment) of a binomial distribution are given by the formulas

$$ {\mathsf E} X = np, $$

$$ {\mathsf D} X = {\mathsf E} (X-np) ^ {2} = np (1-p), $$

$$ {\mathsf E}(X-np) ^ {3} = np (1-p) (1 - 2p). $$
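The generating function and the moment formulas can be verified directly for small $ n $. The sketch below is illustrative only (the helper names are assumptions, not taken from the article): it expands $ [pz + (1 - p)] ^ {n} $ by repeated multiplication by the binomial $ (1 - p) + pz $, and computes central moments by summing over the probabilities $ b _ {x} (n, p) $.

```python
import math


def binom_pmf(x, n, p):
    """Return b_x(n, p) = C(n, x) * p**x * (1 - p)**(n - x)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def pgf_coefficients(n, p):
    """Coefficients of [p z + (1 - p)]**n, lowest power of z first."""
    coeffs = [1.0]
    for _ in range(n):
        nxt = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += c * (1 - p)   # contribution of the constant term (1 - p)
            nxt[k + 1] += c * p     # contribution of the term p z
        coeffs = nxt
    return coeffs


def mean(n, p):
    """E X computed from the definition as a sum over x = 0..n."""
    return sum(x * binom_pmf(x, n, p) for x in range(n + 1))


def central_moment(n, p, order):
    """E (X - np)**order computed from the definition."""
    return sum((x - n * p) ** order * binom_pmf(x, n, p) for x in range(n + 1))


if __name__ == "__main__":
    n, p = 6, 0.3
    print(pgf_coefficients(n, p))                             # b_0, ..., b_n
    print(mean(n, p), n * p)                                  # E X
    print(central_moment(n, p, 2), n * p * (1 - p))           # D X
    print(central_moment(n, p, 3), n * p * (1 - p) * (1 - 2 * p))
```

Each printed pair should agree up to floating-point rounding.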

The binomial distribution function is defined, for any real $ y $, $ 0 < y < n $, by the formula

$$ F (y) = \ {\mathsf P} \{ X \leq y \} = \ \sum _ {x = 0 } ^ { [y] } \left ( \begin{array}{c} n \\ x \end{array} \right ) p ^ {x} (1 - p) ^ {n - x } , $$

where $ [y] $ is the integer part of $ y $, and

$$ F (y) \equiv \ \frac{1}{B([y] + 1, n - [y]) } \int\limits _ { p } ^ { 1 } t ^ {[y]} (1 - t) ^ {n - [y] - 1 } dt, $$

$ B(a, b) $ is Euler's beta-function, and the integral on the right-hand side is known as the incomplete beta-function.
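The equality of the finite sum and the incomplete beta-function representation can be checked numerically. The following sketch is an illustration under stated assumptions: the integral is evaluated with composite Simpson's rule (a choice made here, not a method prescribed by the article), $ B(a, b) $ is computed from the log-gamma function, and the function names are hypothetical.

```python
import math


def binom_cdf(y, n, p):
    """F(y) = sum of b_x(n, p) over x = 0, ..., [y]."""
    k = math.floor(y)
    return sum(math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))


def beta_fn(a, b):
    """Euler's beta-function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def binom_cdf_beta(y, n, p, steps=2000):
    """F(y) from the integral over [p, 1], valid for 0 < y < n.

    `steps` must be even; the integrand is a polynomial, so Simpson's
    rule converges quickly.
    """
    k = math.floor(y)

    def integrand(t):
        return t**k * (1 - t) ** (n - k - 1)

    h = (1 - p) / steps
    total = integrand(p) + integrand(1.0)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * integrand(p + i * h)
    return (total * h / 3) / beta_fn(k + 1, n - k)


if __name__ == "__main__":
    n, p = 10, 0.3
    for y in (0.5, 3.7, 9.0):
        print(y, binom_cdf(y, n, p), binom_cdf_beta(y, n, p))
```

The two columns should coincide up to the quadrature error.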

As $ n \rightarrow \infty $, the binomial distribution function is expressed in terms of the standard normal distribution function $ \Phi $ by the asymptotic formula (the de Moivre–Laplace theorem):

$$ F (y) = \Phi \left [ \frac{y - np + 0.5 }{\sqrt {np (1 - p) } } \right ] + R _ {n} (y, p), $$

where

$$ R _ {n} (y, p) = O (n ^ {-1/2 } ) $$

uniformly for all real $ y $. There also exist other, higher order, normal approximations of the binomial distribution.
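The size of the remainder $ R _ {n} (y, p) $ can be observed numerically. The sketch below is illustrative: it evaluates the approximation only at the integer points $ y = 0, \dots, n $, computes $ b _ {x} (n, p) $ in log space (an implementation choice made here to avoid overflow for large $ n $, assuming $ 0 < p < 1 $), and uses hypothetical function names.

```python
import math


def binom_pmf(x, n, p):
    """b_x(n, p) computed in log space; requires 0 < p < 1."""
    log_b = (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
             + x * math.log(p) + (n - x) * math.log1p(-p))
    return math.exp(log_b)


def std_normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_approx(y, n, p):
    """Phi[(y - np + 0.5) / sqrt(np(1 - p))]."""
    return std_normal_cdf((y - n * p + 0.5) / math.sqrt(n * p * (1 - p)))


def max_error(n, p):
    """Largest |F(y) - Phi[...]| over the integer points y = 0, ..., n."""
    worst, cdf = 0.0, 0.0
    for k in range(n + 1):
        cdf += binom_pmf(k, n, p)
        worst = max(worst, abs(cdf - normal_approx(k, n, p)))
    return worst


if __name__ == "__main__":
    for n in (25, 100, 400, 1600):
        print(n, max_error(n, 0.3))
```

For fixed $ p $ the printed errors should decrease as $ n $ grows, in line with the $ O (n ^ {-1/2 } ) $ bound.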

If the number of independent trials $ n $ is large, while the probability $ p $ is small, the individual probabilities $ b _ {x} (n, p) $ can be approximately expressed in terms of the Poisson distribution:

$$ b _ {x} (n, p) = \ \left ( \begin{array}{c} n \\ x \end{array} \right ) p ^ {x} (1 - p) ^ {n - x } \approx \ \frac{(np) ^ {x} }{x!} e ^ {-np } . $$
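A short numerical comparison illustrates the quality of the Poisson approximation. This is a sketch under assumptions: the parameter values are arbitrary examples, the function names are hypothetical, and the probabilities are computed in log space for numerical stability.

```python
import math


def binom_pmf(x, n, p):
    """b_x(n, p) computed in log space; requires 0 < p < 1."""
    log_b = (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
             + x * math.log(p) + (n - x) * math.log1p(-p))
    return math.exp(log_b)


def poisson_pmf(x, lam):
    """Poisson probability lam**x * exp(-lam) / x!, for lam > 0."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))


def max_abs_diff(n, p, x_max):
    """Largest |b_x(n, p) - Poisson_x(np)| over x = 0, ..., x_max."""
    lam = n * p
    return max(abs(binom_pmf(x, n, p) - poisson_pmf(x, lam))
               for x in range(min(x_max, n) + 1))


if __name__ == "__main__":
    n, p = 1000, 0.003
    for x in range(8):
        print(x, binom_pmf(x, n, p), poisson_pmf(x, n * p))
    print("max difference:", max_abs_diff(n, p, x_max=30))
```

For these example values the two columns should differ only in the later decimal places.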

If $ n \rightarrow \infty $ and $ 0 < c \leq y \leq C $ (where $ c $ and $ C $ are constants), the asymptotic formula

$$ F (y) = \ \sum _ {x = 0 } ^ { [y] } \frac{\lambda ^ {x} }{x!} e ^ {- \lambda } + O (n ^ {-2} ), $$

where $ \lambda = (2n - [y])p / (2 - p) $, is uniformly valid with respect to all $ p $ in the interval $ 0 < p < 1 $.
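The effect of the modified parameter $ \lambda = (2n - [y])p / (2 - p) $ can be seen by comparing it with the plain choice $ \lambda = np $. The comparison with $ np $ and the example values below are additions made here for illustration; the function names are hypothetical.

```python
import math


def binom_cdf(y, n, p):
    """F(y) = sum of b_x(n, p) over x = 0, ..., [y]."""
    k = math.floor(y)
    return sum(math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))


def poisson_cdf(y, lam):
    """Sum of lam**x * exp(-lam) / x! over x = 0, ..., [y]."""
    k = math.floor(y)
    return math.exp(-lam) * sum(lam**x / math.factorial(x) for x in range(k + 1))


def refined_lambda(y, n, p):
    """lambda = (2n - [y]) p / (2 - p)."""
    return (2 * n - math.floor(y)) * p / (2 - p)


def approximation_errors(y, n, p):
    """Return (|error| with lambda = np, |error| with the refined lambda)."""
    exact = binom_cdf(y, n, p)
    plain = abs(exact - poisson_cdf(y, n * p))
    refined = abs(exact - poisson_cdf(y, refined_lambda(y, n, p)))
    return plain, refined


if __name__ == "__main__":
    for n in (50, 100, 200):
        # p is chosen so that np = 5 in every row.
        print(n, approximation_errors(3, n, 5 / n))
```

In this example the refined parameter should give a markedly smaller error than $ \lambda = np $.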

The multinomial distribution is the multi-dimensional generalization of the binomial distribution.

References

[G]	B.V. Gnedenko, "The theory of probability", Chelsea, reprint (1962) (Translated from Russian)
[F]	W. Feller, "An introduction to probability theory and its applications", Wiley (1957–1971)
[PR]	Yu.V. Prohorov, Yu.A. Rozanov, "Probability theory, basic concepts. Limit theorems, random processes" , Springer (1969) (Translated from Russian) MR0251754
[P]	Yu.V. Prohorov, "Asymptotic behaviour of the binomial distribution" Selected Translations in Math. Stat. and Probab. , 1 , Amer. Math. Soc. (1961) MR0116370 (Translated from Russian) Uspekhi Mat. Nauk , 8 : 3 (1953) pp. 135–142 MR0056861
[BS]	L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics" , Libr. math. tables , 46 , Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova) MR0735434 Zbl 0529.62099

How to Cite This Entry:
Binomial distribution. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Binomial_distribution&oldid=46067

This article was adapted from an original article by L.N. Bol'shev (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.
