Pólya distribution

The probability distribution of a random variable $X _ {n}$ taking non-negative integer values $k$, $0 \leq k \leq n$, given by the formula

$$\tag{1 } {\mathsf P} \{ X _ {n} = k \} =$$

$$= \ \left ( \begin{array}{c} n \\ k \end{array} \right ) \frac{( b; c) _ {k-1} ( r; c) _ {n-k-1} }{( b+ r; c) _ {n-1} } ,$$

where $( b; c) _ {k-1} = b( b+ c) \dots [ b+( k- 1) c]$ (a product of $k$ factors, equal to $1$ for $k = 0$) and the integers $n > 0$, $b > 0$, $r > 0$, and $c \geq - 1$ are parameters; or, when $c > 0$, given by the equivalent formula

$$\tag{2 } {\mathsf P} \{ X _ {n} = k \} =$$

$$= \ \left ( \begin{array}{c} n \\ k \end{array} \right ) \frac{( p; \gamma ) _ {k-1} ( q; \gamma ) _ {n-k-1} }{( 1; \gamma ) _ {n-1} } =$$

$$= \ \frac{\left ( \begin{array}{c} ( p/ \gamma )+ k- 1 \\ k \end{array} \right ) \left ( \begin{array}{c} ( q/ \gamma )+ n- k- 1 \\ n- k \end{array} \right ) }{ \left ( \begin{array}{c} ( 1/ \gamma )+ n- 1 \\ n \end{array} \right ) } ,$$

where the integer $n > 0$ and the real numbers $0 < p < 1$, $q = 1- p$, and $\gamma > 0$ are parameters. The relation between (1) and (2) is given by

$$p = \frac{b}{b+ r} ,\ \ q = \frac{r}{b+ r} ,\ \ \gamma = \frac{c}{b+ r} .$$
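The equivalence of the two formulas can be checked directly; the following is a short verification sketch, not part of the original derivation. With $p$, $q$ and $\gamma$ as above, one has $b + jc = ( b+ r)( p + j \gamma )$, $r + jc = ( b+ r)( q + j \gamma )$ and $b + r + jc = ( b+ r)( 1 + j \gamma )$, so that

$$( b; c) _ {k-1} = ( b+ r) ^ {k} ( p; \gamma ) _ {k-1} ,\ \ ( r; c) _ {n-k-1} = ( b+ r) ^ {n-k} ( q; \gamma ) _ {n-k-1} ,\ \ ( b+ r; c) _ {n-1} = ( b+ r) ^ {n} ( 1; \gamma ) _ {n-1} ,$$

and the powers of $b+ r$ cancel in the quotient. The form with binomial coefficients then follows from

$$( p; \gamma ) _ {k-1} = \gamma ^ {k} \frac{p}{\gamma} \left ( \frac{p}{\gamma} + 1 \right ) \dots \left ( \frac{p}{\gamma} + k- 1 \right ) = \gamma ^ {k} \, k! \left ( \begin{array}{c} ( p/ \gamma )+ k- 1 \\ k \end{array} \right ) ,$$

applied in the same way to $( q; \gamma ) _ {n-k-1}$ and $( 1; \gamma ) _ {n-1}$: the powers of $\gamma$ cancel, and the factor $k! ( n- k)! / n!$ cancels against the binomial coefficient in front.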

The mathematical expectation and variance of the Pólya distribution are ${\mathsf E} X _ {n} = np$ and ${\mathsf D} X _ {n} = npq( 1 + \gamma n)/( 1 + \gamma )$, respectively. Special cases of the Pólya distribution are: for $c = 0$, $X _ {n}$ has a binomial distribution with parameters $n$ and $p$; for $c = - 1$, $X _ {n}$ has a hypergeometric distribution with parameters $M = b$, $N = b+ r$ and $n$. For $b \rightarrow \infty$ and $r \rightarrow \infty$ such that $p = b/( b+ r)$ is constant and $\gamma = c/( b+ r) \rightarrow 0$, the distribution tends to the binomial distribution with parameters $n$ and $p$.
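As an illustration of the first formula and of the moment formulas above, the following Python sketch evaluates the probabilities exactly in rational arithmetic. The function names `rising_product` and `polya_pmf` are hypothetical, chosen for this example only; the sketch assumes integer $b$, $r$, $c$ and, for $c = -1$, that $n \leq b + r$ so that the denominator does not vanish.

```python
from fractions import Fraction
from math import comb


def rising_product(a, c, m):
    """Return a(a+c)...(a+(m-1)c), a product of m factors (1 if m == 0)."""
    result = Fraction(1)
    for j in range(m):
        result *= a + j * c
    return result


def polya_pmf(k, n, b, r, c):
    """P{X_n = k} for integer parameters b, r, c (assumes 0 <= k <= n)."""
    numerator = rising_product(b, c, k) * rising_product(r, c, n - k)
    return comb(n, k) * numerator / rising_product(b + r, c, n)


if __name__ == "__main__":
    n, b, r, c = 4, 2, 3, 1
    probs = [polya_pmf(k, n, b, r, c) for k in range(n + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum(k * k * p for k, p in enumerate(probs)) - mean ** 2
    print(probs, mean, var)
```

With $b = 2$, $r = 3$, $c = 1$, $n = 4$ (so $p = 2/5$, $\gamma = 1/5$) the probabilities sum to $1$ and the mean and variance come out as $8/5$ and $36/25$, in agreement with $np$ and $npq( 1 + \gamma n)/( 1 + \gamma )$. Setting $c = 0$ or $c = -1$ in the same function reproduces the binomial and hypergeometric probabilities, respectively.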

The distribution was considered by G. Pólya (1923) in connection with the so-called Pólya urn scheme. From an urn containing $b$ black and $r$ red balls, balls are drawn one at a time, and each extracted ball is returned to the urn together with $c$ balls of the same colour. If $X _ {n}$ is the total number of black balls drawn in the first $n$ trials, the distribution of $X _ {n}$ is given by (1) or (2). The sequence $X _ {n}$, $n = 1, 2 , \dots$ is a discrete Markov process, where the states are defined by the numbers of black balls in the sample at time $n$, while the conditional probability of transition from a state $k$ at time $n$ to a state $k+ 1$ at time $n+ 1$ equals

$${\mathsf P} \{ X _ {n+1} = k+ 1 \mid X _ {n} = k \} = \ \frac{b + kc }{b + r + nc } = \frac{p + k \gamma }{1 + n \gamma }$$

(this depends on $n$).
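The Markov description can be turned into a direct computation of the distribution of $X _ {n}$: starting from $X _ {0} = 0$, one applies the transition probability $( b + kc)/( b + r + nc)$ step by step. The Python sketch below does this in exact rational arithmetic; the function name `polya_by_recursion` is hypothetical, and the sketch assumes integer parameters with $b + r + mc > 0$ for every step $m < n$.

```python
from fractions import Fraction


def polya_by_recursion(n, b, r, c):
    """Distribution of X_n as a list indexed by k, built from the transitions."""
    dist = [Fraction(1)]  # before any draw, zero black balls with probability 1
    for m in range(n):  # m = number of draws already made
        total = b + r + m * c  # balls in the urn before draw m + 1
        new_dist = [Fraction(0)] * (m + 2)
        for k, prob in enumerate(dist):
            up = Fraction(b + k * c, total)  # P{X_{m+1} = k+1 | X_m = k}
            new_dist[k + 1] += prob * up
            new_dist[k] += prob * (1 - up)
        dist = new_dist
    return dist


if __name__ == "__main__":
    print(polya_by_recursion(2, 2, 3, 1))
```

For $b = 2$, $r = 3$, $c = 1$ and $n = 2$ the recursion gives the probabilities $2/5$, $2/5$, $1/5$ for $k = 0, 1, 2$, which are the values obtained from the closed formula for ${\mathsf P} \{ X _ {2} = k \}$.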

By passing to the limit from the Pólya urn scheme one obtains the Pólya process, which is an inhomogeneous Markov process in continuous time and belongs to the class of pure birth ("pure multiplication") processes. Under the condition that at most one ball is extracted during an infinitesimal time interval $\Delta t$, one obtains the conditional limit probability of transition from the state $k$ to the state $k+ 1$ during the time $\Delta t$, for $n \rightarrow \infty$ with $np \rightarrow t$ and $n \gamma \rightarrow \alpha t$, as

$${\mathsf P} \{ X ( t+ \Delta t) = k + 1 \mid X( t) = k \} = \ \frac{1 + \alpha k }{1 + \alpha t } \Delta t + o( \Delta t).$$

On transition from the Pólya urn scheme to the Pólya process one obtains an important limit form for the Pólya distribution. That is, the probability ${\mathsf P} _ {k} ( t)$ of being in the state $k$ at time $t$ is

$${\mathsf P} _ {k} ( t) = \ \left ( \begin{array}{c} {( 1 / \alpha ) + k- 1 } \\ k \end{array} \right ) \left ( \frac{\alpha t }{1 + \alpha t } \right ) ^ {k} \left ( \frac{1}{1 + \alpha t } \right ) ^ {1/ \alpha }$$

$$( {\mathsf P} _ {0} ( 0) = 1).$$

This limit distribution is the negative binomial distribution with parameters $1/ \alpha$ and $1/( 1+ \alpha t)$ (the corresponding mathematical expectation is $t$, while the variance is $t( 1 + \alpha t)$).
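The limit form can be evaluated numerically for non-integer $1/ \alpha$ by writing the generalized binomial coefficient through the gamma function. The following Python sketch does so in floating point; the name `polya_limit_pmf` is hypothetical, and the index of the state is written `k`. It assumes $\alpha > 0$ and $t > 0$.

```python
from math import exp, lgamma, log


def polya_limit_pmf(k, t, alpha):
    """Negative binomial probability with parameters 1/alpha and 1/(1 + alpha*t)."""
    shape = 1.0 / alpha
    # log of the generalized binomial coefficient C(shape + k - 1, k)
    log_coeff = lgamma(shape + k) - lgamma(k + 1) - lgamma(shape)
    log_prob = (
        log_coeff
        + k * log(alpha * t / (1.0 + alpha * t))
        - shape * log(1.0 + alpha * t)
    )
    return exp(log_prob)


if __name__ == "__main__":
    t, alpha = 2.0, 0.5
    probs = [polya_limit_pmf(k, t, alpha) for k in range(400)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum(k * k * p for k, p in enumerate(probs)) - mean ** 2
    print(sum(probs), mean, var)
```

For $t = 2$ and $\alpha = 1/2$ the truncated sums reproduce, up to rounding, total mass $1$, expectation $t = 2$ and variance $t( 1 + \alpha t) = 4$. Working with logarithms avoids overflow in the gamma function for large arguments.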

The urn model and the Pólya process, in which the Pólya distribution and its limit form arise, are models with an after-effect (for $c > 0$, extracting a ball of a particular colour from the urn increases the probability of extracting a ball of the same colour in a subsequent trial).

As $\alpha$ tends to zero, the Pólya process goes over to a Poisson process, while the limit form of the Pólya distribution tends, for $\alpha \rightarrow 0$, to the Poisson distribution with parameter $t$.
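The Poisson limit can be seen term by term; the following is a brief verification sketch, with the state index written $k$. Expanding the generalized binomial coefficient in the limit form gives

$$\left ( \begin{array}{c} ( 1/ \alpha )+ k- 1 \\ k \end{array} \right ) \left ( \frac{\alpha t }{1 + \alpha t } \right ) ^ {k} = \ \frac{t ^ {k} }{k!} \cdot \frac{( 1 + \alpha )( 1 + 2 \alpha ) \dots ( 1 + ( k- 1) \alpha ) }{( 1 + \alpha t) ^ {k} } \rightarrow \frac{t ^ {k} }{k!} ,$$

while

$$\left ( \frac{1}{1 + \alpha t } \right ) ^ {1/ \alpha } \rightarrow e ^ {-t}$$

as $\alpha \rightarrow 0$, so that the product tends to $e ^ {-t} t ^ {k} / k!$. Consistently, the variance $t( 1 + \alpha t)$ tends to $t$, the variance of the Poisson distribution with parameter $t$.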

References

 [1] W. Feller, "An introduction to probability theory and its applications", 1–2, Wiley (1968)