Generating random variables
Copyright notice |
---|
This article Generating Random Variables was adapted from an original article by Christian P. Robert, George Casella, which appeared in StatProb: The Encyclopedia Sponsored by Statistics and Probability Societies. The original article ([http://statprob.com/encyclopedia/GeneratingRandomVariables.html StatProb Source], Local Files: pdf | tex) is copyrighted by the author(s), the article has been donated to Encyclopedia of Mathematics, and its further issues are under Creative Commons Attribution Share-Alike License'. All pages from StatProb are contained in the Category StatProb. |
2020 Mathematics Subject Classification: Primary: 60-XX Secondary: 62-XX62E99 [MSN][ZBL]
$ \newcommand{\etal}{ ''et al.$\;$''} $
Université Paris Dauphine and CREST, INSEE
and
George Casella [2]
University of Florida
Keywords and Phrases: Random Number Generator, Probability Integral Transform, Accept Reject Algorithm, Metropolis-Hastings Algorithm, Gibbs Sampling, Markov Chains
Introduction
\label{sec:intro} The methods described in this article mostly rely on the possibility of producing (with a computer) a supposedly endless flow of random variables (usually iid) for well-known distributions. Such a simulation is, in turn, based on the production of uniform random variables. There are many ways in which uniform pseudorandom numbers can be generated. For example there is the Kiss algorithm of Marsaglia and Zaman (1993); details on other random number generators can be found in the books of Rubinstein (1981), Ripley (1987), Fishman (1996), and Knuth (1998).
Generating Nonuniform Random Variables
\label{sec:RN.NUN} The generation of random variables that are uniform on the interval $[0,1]$, the Uniform ${[0,1]}$ distribution, provides the basic probabilistic representation of randomness. The book by Devroye (1986) is a detailed discussion of methods for generating nonuniform variates, and the subject is one of the many covered in Knuth (1998). Formally, we can generate random variables with any distribution by means of the following.
Probability Integral Transform
\label{sec:PIT} For a function $F$ on $\Re$, the generalized inverse of $F$ is $F^{-}(u) = \inf \; \{x:\; F(x) \geq u\} \;$. If $F$ is a cumulative distribution function (cdf), $F^{-}(U)$ has the cdf $F$ and to generate $X \sim F$,
- Generate $U$ according to Uniform${[0,1]}$
- Make the transformation $x=F^-(u)$.
Generating Discrete Variables
\label{sec:RN.NUN.DSCR}
For discrete distributions, if the random variable $X$ satisfies $$ P(X=k)=p_k, \quad k=0,1,2,\ldots $$ then if $U \sim$ uniform${[0,1]}$, $X$ can be generated by \begin{eqnarray*} k=0:&& U \le p_0 \Rightarrow X=0,\nonumber \\ k=1, 2, \ldots,:&&\sum_{j=0}^{k-1} p_j \le U < \sum_{j=0}^{k}p_{j} \Rightarrow X=k. \end{eqnarray*}
Mixture Distributions
In a mixture representation a density $f$ has the form \begin{eqnarray*} f(x) &=& \int_{\cal{Y}} \; f(x|y) g(y)\; dy \quad \mbox{ (continuous) }\\ f(x) &=& \sum_{i \in {\cal Y}} \; p_{i} \; f_{i}(x)\; \quad \mbox{ (discrete). } \end{eqnarray*} We then can simulate $X$ as
$ \renewcommand{\theenumi}{\arabic{enumi}} $
- $Y \sim g(y)$, $X \sim f(x|Y=y)$, or
- $P(Y=i)=p_i$, $X \sim f_i(x)$
Accept-Reject Methods
\label{sec:ar}
Another class of methods only requires the form of the density $f$ of interest - called the target density\/. We simulate from a density $g$, called the candidate density\/.
Given a target density $f$, we need a candidate density $g$ and a constant $M$ such that $$ f(x) \leq M g(x) $$ on the support of $f$.
Accept-Reject Algorithm:
\label{sec:ARalg} To produce a variable $X$ distributed according to $f$:
- Generate $Y \sim g$, $U\sim $ Uniform${[0,1]}$ ;
- Accept $X = Y$ if $U \leq f(Y) / Mg(Y)$ ;
- Otherwise, return to 1.
Notes:
$ \renewcommand{\theenumi}{(\alph{enumi})} $
- The densities $f$ and $g$ need be known only up to a multiplicative factor.
- The probability of acceptance is $1/M$, when evaluated for the normalized densities.
Envelope Accept-Reject Algorithm:
If the target density $f$ is difficult to evaluate, the Envelope Accept-Reject Algorithm (called the squeeze principle\/ by Marsaglia 1977) may be appropriate.
If there exist a density $g_{m}$, a function $g_{l}$ and a constant $M$ such that $$ g_{l}(x) \leq f(x) \leq Mg_{m}(x) \;, $$ then the algorithm
- Generate $X \sim g_{m}(x)$, $\; U \sim $ Uniform${[0,1]}$;
- Accept $X$ if $U \leq g_{l}(X) / M g_{m}(X)$;
- Otherwise, accept $X$ if $U \leq f(X) /M g_{m}(X)$
- Otherwise, return to 1.
produces $X \sim f$. The number of evaluations of $f$ is potentially decreased by a factor ${1\over M}$. If $f$ is log-concave, Gilks and Wild (1992) construct a generic accept-reject algorithm that can be quite efficient.
Markov Chain Methods
Every simulation method discussed thus far has produced independent random variables whose distribution is exactly the target distribution. In contrast, Markov chain methods produce a sequence of dependent random variables whose distribution converges to the target. Their advantage is their applicability in complex situations.
Recall that a sequence $X_0, X_1, \ldots, X_n, \ldots$ of random variables is a Markov chain if, for any $t$, the conditional distribution of $X_t$ given $x_{t-1},x_{t-2},\ldots, x_0$ is the same as the distribution of $X_t$ given $x_{t-1}$; that is, $$ P(X_{k+1} \in A |x_0, x_1, x_2, \ldots, x_k)= P(X_{k+1} \in A |x_k). $$
Metropolis - Hastings Algorithm
\label{sec:met}
The Metropolis-Hastings (M-H) algorithm (Metropolis et al. 1953, Hastings 1970) associated with the target density $f$ and the candidate density $q(y|x)$ produces a Markov chain $(X^{(t)})$ through
Metropolis -Hastings
Given $x^{(t)}$,
- Generate $\; Y_t \sim q(y|x^{(t)})$.
- Take
$$ X^{(t+1)} =\left\{ \begin{array}{ll} Y_t & \mbox{ with prob. } \rho(x^{(t)},Y_t) \\ x^{(t)} & \mbox{ with prob. } 1 - \rho(x^{(t)},Y_t)\end{array}\right. $$ where $$ \rho(x,y) = \min \left\{ {f(y)\over f(x)} \; {q(x|y)\over q(y|x)} \;, 1 \right\} \;, $$
- Then
$X^{(t)}$ converges in distribution to $X \sim f$.
Notes:
$ \renewcommand{\theenumi}{(\alph{enumi})} $
- For $q(\cdot|x)=q(\cdot)$ we have the independent M-H algorithm, and for $q(x|y) = q(y|x)$ we have a symmetric M-H algorithm, where $\rho$ does not depend on $q$. Also, $q(x|y) = q(y-x)$, symmetric around zero, is a random walk M-H algorithm.
- Like the Accept-Reject method, the Metropolis - Hastings algorithm only requires knowing $f$ and $q$ up to normalizing constants.
The Gibbs Sampler
\label{sec:gibbs}
For $p>1$, write the random variable ${\mathbf X} \in {\cal X}$ as ${\mathbf X}=(X_1,\ldots,X_p) \sim f$, where the $X_i$'s are either uni- or multidimensional, with conditional distributions $$ X_i|x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots,x_p \sim f_i(x_i|x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots,x_p) $$ for $i=1,2,\ldots,p$. The associated Gibbs sampler is given by the following transition from $X^{(t)}$ to $X^{(t+1)}$:
The Gibbs sampler
Given ${\mathbf x}^{(t)} = (x_1^{(t)},\ldots,x_p^{(t)})$, generate
- $X_1^{(t+1)} \sim f_1(x_1|x_2^{(t)},\ldots,x_p^{(t)})$;
- $X_2^{(t+1)} \sim f_2(x_2|x_1^{(t+1)},x_3^{(t)},\ldots,x_p^{(t)})$,
- $\qquad\vdots$
- $X_p^{(t+1)} \sim f_p(x_p|x_1^{(t+1)},\ldots,x_{p-1}^{(t+1)})$,
then $X^{(t)}$ converges in distribution to $X \sim f$.
Notes:
$ \renewcommand{\theenumi}{(\alph{enumi})} $
- The densities $f_1,\ldots,f_p$ are called the full univariate conditionals.
- Even in a high dimensional problem, all of the simulations can be univariate.
- The Gibbs sampler is, formally, a special case of the M-H algorithm (see Robert and Casella 2004, Section $10.2.2$)
but with acceptance rate equal to $1$.
The Slice Sampler
A particular version of the Gibbs sampler, called the slice sampler (Besag and Green 1993, Damien et al. 1999), can sometimes be useful. Write $f(x)= \prod_{i=1}^k f_i(x)$ where the $f_i$'s are positive functions, not necessarily densities
Slice sampler
Simulate
- $\omega_1^{(t+1)} \sim \mbox{ Uniform}_{[0,f_1(x^{(t)})]}$;
- $\qquad\ldots$
- $\omega_k^{(t+1)} \sim \mbox{ Uniform}_{[0,f_k(x^{(t)})]}$;
- $x^{(t+1)} \sim \mbox{ Uniform}_{A^{(t+1)}}$, with
$$ A^{(t+1)} = \{ y;\; f_i(y) \ge \omega_i^{(t+1)}, i=1,\ldots,k \}. $$
then $X^{(t)}$ converges in distribution to $X \sim f$.
Application
In this section we give some guidelines for simulating different distributions. See Robert and Casella (2004, 2010) for more detailed explanations, and Devroye (1986) for more algorithms. In what follows, $U$ is Uniform$(0,1)$ unless otherwise specified.
Arcsine
$ f(x ) = \frac{1}{\pi\sqrt{x(1-x)}},\quad 0 \leq x \leq 1, \quad \sin^2\left(\pi U/2 \right) \sim f$.
Beta$(r,s )$
$f(x) = {\Gamma(r+s) \over{\Gamma(r)\Gamma(s )}}x^{r-1}(1-x)^{s -1}$, $0 \leq x \leq 1, r > 0, s > 0$, $\frac{X_1}{X_1+X_2} \sim f$, where $X_1 \sim$ Gamma $(r,1)$ and $X_2 \sim$ Gamma $(s,1)$, independent.
Cauchy$(\mu,\sigma )$
$f(x\vert \mu,\sigma ) = {1\over{\sigma \pi }}{1\over{1+\frac{(x-\mu)^2}{\sigma^2}}}$, $-\infty < x < \infty $, $\sigma \tan\left(\frac{\pi}{2}(2U-1) \right) + \mu \sim \mbox{ Cauchy } (\mu,\sigma )$.
If $U \sim$ uniform${[-\pi/2,\pi/2]}$, then $\tan(U) \sim $ Cauchy$(0,1)$. Also, $X/Y \sim \mbox{ Cauchy }(0,1)$, where $X, Y, \sim N(0,1)$, independent.
Chi squared($p$)
$f(x\vert p) = {1\over{\Gamma (p/2)2^{p/2}}}x^{(p/2)-1}e^{-x/2}$, $0 \leq x < \infty, p = 1, 2, \dots$. $$ - 2 \sum_{j=1}^\nu \log(U_j ) \sim \chi^2_ {2 \nu}, \mbox{ or } \chi^2_ {2 \nu} + Z^2 \sim \chi^2_ {2 \nu+1}, $$ where $Z$ is an independent Normal$(0,1)$ random variable.
Double exponential$(\mu,\sigma )$
$f(x\vert \mu,\sigma ) = {1\over{2\beta }}e^{-{\left|{x-\mu }\right|}/\beta }$, $-\infty < x < \infty$, $-\infty < \mu < \infty$, $\beta > 0$. If $Y \sim \mbox{Exponential} (\beta )$, then attaching a random sign ($+$ if $U>.5$, $-$ otherwise) gives $X \sim \mbox{Double exponential}(0,1)$, and $\sigma X +\mu \sim \mbox{Double exponential}(\mu,\sigma)$. This is also known as the Laplace distribution.
Exponential$(\beta )$
$f(x\vert \beta ) = {1\over{\beta }}e^{-x/\beta },\quad 0 \leq x < \infty,\quad \beta > 0$, $-\beta \log U \sim \mbox{ Exponential}(\beta)$.
Extreme Value$(\alpha, \gamma )$
$f(x\vert \alpha, \gamma ) = e^{-\frac{x-\alpha}{\gamma}-e^{-\frac{x-\alpha}{\gamma}}},\quad \alpha \leq x < \infty, \quad \gamma >0$. If $X \sim \mbox{ Exponential}(1)$, $\alpha - \gamma \log (X) \sim \mbox{Extreme Value}(\alpha, \gamma )$. This is also known as the Gumbel distribution.
F Distribution
$f(x\vert {\nu }_1,\nu_2) = { {\Gamma \left( { {{\nu }_1+\nu_2} \over 2}\right) }\over{ \Gamma \left({ {\nu }_1 \over 2}\right) \Gamma \left({\nu_2 \over 2}\right)}}\left( { {{\nu }_1} \over{\nu_2 }}\right) ^{ {\nu}_1/2} { {x^{({\nu }_1-2)/2}} \over \left(1 + \left( { {\nu }_1 \over \nu_2 } \right)x\right) ^{({\nu }_1+{ \nu }_2)/2}}$, $ 0 \leq x < \infty.$ If $X_1 \sim {\chi }^2_{ {\nu }_1}$ and $X_2 \sim {\chi }^2_{ {\nu }_2}$, independent, $\frac{X_1/\nu_1}{X_2/\nu_2} \sim f(x\vert {\nu }_1,\nu_2).$
Gamma$(\alpha,\beta )$
$f(x\vert \alpha,\beta ) = {1\over{\Gamma (\alpha ){\beta }^ {\alpha }}}x^{\alpha -1}e^{-x/\beta }$, $0 \leq x < \infty$, $\alpha, \beta > 0$. Then $-\beta \sum^a_{j=1} \log(U_j ) \sim \mbox{Gamma}(a, \beta)$, $a \mbox{ an integer}.$
If $\alpha$ is not an integer, indirect methods can be used. For example, to generate a Gamma$(\alpha,\beta )$ use Algorithm \ref{sec:ARalg} or \ref{sec:met} with candidate distribution Gamma$(a,b)$, with $a = [\alpha]$ and $b=\beta \alpha/a$, where $[\alpha]$ is the greatest integer less than $\alpha$. For the Accept-Reject algorithm the bound on the normalized $f/g$ is $M = \frac{\Gamma(a)}{\Gamma(\alpha)}\frac{\alpha^\alpha}{a^a}e^{-(\alpha-a)}$. There are many other efficient algorithms.
Note:
If $X \sim \mbox{Gamma}(\alpha,1 )$ then $\beta X \sim \mbox{Gamma}(\alpha,\beta )$. Some special cases are Exponential $(1)$ = gamma$(1,1)$, and Chi squared $(p)$= gamma $(p/2, 2)$. Also, $ 1/X$ has the inverted (or inverse) gamma distribution.
Logistic$(\mu,\beta )$
$f(x\vert \mu,\beta ) = {1\over{\beta }} { {e^{-(x-\mu )/\beta }}\over{ {{\left[{1 + e^{-(x-\mu )/\beta }}\right]}}^2}}$, $-\infty < x < \infty$, $-\infty < \mu < \infty$, $\beta >0$. $$ -\beta \log \left(\frac{1-U}{U}\right) + \mu \sim \mbox{Logistic}(\mu,\beta ) $$
Lognormal$(\mu,\sigma^2)$
$f(x\vert \mu,\sigma^2) = {1\over{\sqrt{2\pi }\sigma }} { {e^{-(\log x-\mu )^2/(2\sigma^2)}}\over x}$, $0 \leq x < \infty$, $ -\infty < \mu < \infty $. If $ X \sim \mbox{Normal}(\mu,\sigma^2)$. $$ e^X \sim \mbox{ Lognormal}(\mu,{\sigma }^2) $$
Noncentral chi squared $(\lambda,p)$, $\lambda \ge 0$
\label{sec:ncChi}$f_p(x|\lambda)= \sum_{k=0}^\infty \frac{x^{p/2+k-1}e^{-x/2}}{\Gamma(p/2+k)2^{p/2+k}}\frac{\lambda^k e^{-\lambda}}{k!}$, $ 0<x<\infty$. $$ K \sim \mbox{ Poisson }(\lambda/2), \quad X \sim \chi_{p+2K}^{2} \Rightarrow X \sim f_p(x|\lambda) $$ where $p$ is the degrees of freedom and $\lambda$ is the noncentrality parameter. A more efficient algorithm is $$ Z\sim\chi_{p-1}^{2} \mbox{ and } Y\sim \mbox{N }(\sqrt{\lambda}, 1) \Rightarrow Z+Y^2\sim f_p(x|\lambda). $$
Normal$(\mu,\sigma^2)$
\label{sec:normal}$f(x\vert \mu,{\sigma }^2) = {1\over{\sqrt{2\pi }\sigma }}e^{-(x-\mu )^2/(2{\sigma }^2)}$, $-\infty < x < \infty$, $-\infty < \mu < \infty $, $\sigma > 0$. The Box-Muller algorithm simulates two normals from two uniforms: $$ X_{1} = \sqrt{- 2 \log (U_{1}}) \; \cos (2\pi U_{2}) \mbox{ and } X_{2} = \sqrt{- 2 \log (U_{1}}) \; \sin (2\pi U_{2}) \;, $$ then $X_1, X_2 \sim \mbox{ Normal}(\mu,{\sigma }^2)$.
There are many other ways to generate normal random variables.
$ \renewcommand{\theenumi}{(\alph{enumi})} $
- Accept Reject using Cauchy \
When $f(x) = (1/\sqrt{2\pi})\exp (-x^{2}/2)$ and $g(x) = (1/\pi)1/(1 + x^{2})$, densities of the normal and Cauchy distributions, respectively, then $f(x)/g(x) \le \sqrt{2\pi/e} = 1.520$.
- Accept Reject using double exponential When $f(x) = \frac{1}{\sqrt{2\pi}}\exp (-x^{2}/2)$ and
$g(x) = (1/2) \exp(-|x|)$, $f(x)/g(x) \le \sqrt{2 e/\pi}=1.315$.
- Slice Sampler
$$ W|x \sim \mbox{ uniform } {[0,\exp(-x^2/2)]}\;, \qquad X|w \sim \mbox{ uniform } {[-\sqrt{-2\log(w)},\sqrt{-2\log(w)}]} \;, $$ yields $X \sim N(0,1)$.
Pareto$(\alpha,\beta )$, $\alpha > 0,\quad \beta > 0$
$f(x\vert \alpha, \beta ) = { {\beta {\alpha }^{\beta }}\over{x^{\beta +1}}}$, $ \alpha < x < \infty$, $$ \frac{\alpha}{(1-U)^{1/\beta}} \sim \mbox{ Pareto }(\alpha,\beta) $$
Student's $t_\nu$
$f(x\vert \nu ) = { {\Gamma \left( { {\nu +1}\over 2}\right) } \over{\Gamma \left({\nu \vrule depth3pt width0pt \over 2}\right)}} {1\over{\sqrt{\nu \pi }}} {1\over{\left( 1 + \left({x^2\over \nu }\right)\right) ^{(\nu + 1)/2} }}$, $-\infty < x < \infty $, $ \nu = 1,2,\dots$. $$ Y\sim\chi^2_\nu \mbox{ and } X|y \sim \mbox{ N }(0, \nu/y) \Rightarrow X \sim t_{\nu }. $$ Also, if $X_1 \sim$ N$(0,1)$ and $X_2 \sim \chi^2_\nu$, then $X_1/\sqrt{X_2/\nu} \sim t_\nu$.
Uniform$(a,b)$
$f(x\vert a,b) = {1\over{b-a}}$, $ a \leq x \leq b$, $$ (b-a)U + a \sim \mbox{ Uniform}(a,b). $$
Weibull$(\gamma,\beta )$
$f(x\vert \gamma,\beta ) = { {\gamma }\over{\beta }}x^ {\gamma -1}e^{-x^{\gamma }/\beta }$, $ 0 \leq x < \infty $, $\gamma >0$, $ \beta > 0$. $$ X \sim {\rm Exponential}(\beta) \Rightarrow X^{1/\gamma } \sim {\rm Weibull}(\gamma,\beta ). $$
References
[1] | Besag, J. and Green, P.J. (1993) Spatial statistics and Bayesian computation (with discussion). J. Roy. Statist. Soc. B 55, 25--38. |
[2] | Damien, P., Wakefield, J. and Walker, S. (1999) Gibbs sampling for Bayesian non-conjugate and hierarchical models by using auxiliary variables. J. Roy. Statist. Soc. B 61(2), 331--344. |
[3] | Devroye, L. (1986) Non-Uniform Random Variate Generation\/. New York: Springer-Verlag. |
[4] | Fishman, G.S. (1996) Monte Carlo. New York: Springer-Verlag |
[5] | Gilks, W.R. and Wild, P. (1992) Adaptive rejection sampling for Gibbs sampling. Appl. Statist. 41, 337--348. |
[6] | Hastings, W. (1970). Monte Carlo Sampling Methods Using Markov Chains and their Application. Biometrika 57 97-109. |
[7] | Knuth, D. E. (1998). The Art of Computer Programming, Volume 2/ Seminumerical Algorithms, Third Edition. Addison Wesley, Reading. |
[8] | Marsaglia, G. (1977) The squeeze method for generating gamma variables. Computers and Mathematics with Applications, 3, 321--325. |
[9] | Marsaglia, G. and Zaman, A. (1991). A new class of random number generators. Annals Applied Probability~1, pp. 462--480. |
[10] | Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E. (1953) Equations of state calculations by fast computing machines. J. Chem. Phys. 21, 1087--1092. |
[11] | Ripley, B.D. (1987) Stochastic Simulation. New York: John Wiley |
[12] | Robert, C. P. and Casella, G. (2004). Monte Carlo Statistical Methods, Second Edition. New York: Springer-Verlag. |
[13] | Robert, C. P. and Casella, G. (2010). An Introduction to Monte Carlo Methods with R. New York: Springer-Verlag. |
[14] | Rubinstein, R.Y. (1981) Simulation and the Monte Carlo Method. New York: John Wiley |
Simulation. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Simulation&oldid=37934