# Monte-Carlo method

Jump to: navigation, search

method of statistical trials

A numerical method based on simulation by random variables and the construction of statistical estimators for the unknown quantities. It is usually supposed that the Monte-Carlo method originated in 1949 (see ) when, in connection with work on the construction of atomic reactors, J. von Neumann and S. Ulam suggested using the apparatus of probability theory in the computer solution of applied problems. The Monte-Carlo method is named after the town of Monte-Carlo, famous for its casinos.

## Simulation by random variables with given distributions.

As a rule such a simulation is achieved by a transformation of one or more independent values of a random number $\alpha$, uniformly distributed in the interval $( 0 , 1 )$. The sequence of "sample" values of $\alpha$ is usually obtained on a computer by number-theoretic algorithms, of which the most widely used is the so-called method of residues, for example, in the form

$$u _ {0} = 1 ,\ \ u _ {n} \equiv u _ {n-} 1 5 ^ {2p+} 1 \ ( \mathop{\rm mod} 2 ^ {m} ) ,\ \ \alpha _ {n} = u _ {n} \cdot 2 ^ {-} m .$$

Here $m$ is the order of the mantissa of the computer and

$$p = \max \{ {q } : {5 ^ {2q+} 1 < 2 ^ {m} } \} .$$

Numbers of this type are called pseudo-random numbers; they are used in statistical testing and in solving typical problems (see ). The length of the period of the above version of the method of residues is $2 ^ {m-} 2$. Physical generators, tables of random numbers and quasi-random numbers are also used in the Monte-Carlo method. There are Monte-Carlo methods with a small number of playing parameters (see ).

The standard method for simulating a discrete random variable with distribution ${\mathsf P} \{ \xi = x _ {k} \} = p _ {k}$, $k = 0 , 1 \dots$ is as follows: Put $\xi = x _ {m}$ if the chosen value of $\alpha$ satisfies

$$\sum _ { k= } 0 ^ { m- } 1 p _ {k} \leq \alpha < \ \sum _ { k= } 0 ^ { m } p _ {k} .$$

The standard method for simulating a continuous random variable (sometimes called the method of inverse functions) is to use the, easily verified, representation $\xi = F ^ { - 1 } ( \alpha )$, where $F$ is the distribution function with given density $f$. Sometimes randomization of the simulation is useful (in other words, the method of superposition), based on the expression

$$f ( x) = \sum _ { k } p _ {k} f _ {k} ( x) ;$$

here one first chooses a number $m$ with distribution ${\mathsf P} \{ m = k \} = p _ {k}$, and then obtains a sample value $\xi$ from the distribution with density $f _ {m}$. In other methods of randomization certain parameters of a deterministic method for solving the problem are considered as random variables (see ).

Another, more general, method for simulating a continuous random variable is the method of exclusion (method of selection), at the basis of which lies the following result: If $( \xi , \eta )$ is uniformly distributed in a domain $G = \{ {( x , y ) } : {0 \leq y \leq g ( x) } \}$, then $f _ \xi ( x) = g ( x) / \overline{G}\;$. In the method of exclusion, choose a point $( \xi _ {0} , \eta )$ uniformly in a domain $G _ {1} \supset G$ and put $\xi = \xi _ {0}$ if $( \xi _ {0} , \eta ) \in G$; otherwise repeat the selection of $( \xi _ {0} , \eta )$, etc. For example, if $a \leq \xi \leq b$ and $g ( x) = c f ( x) \leq R$, one can take $\xi _ {0} = a + ( b - a ) \alpha _ {1}$, $\eta = R \alpha _ {2}$. The average number of operations in the method of exclusion is proportional to $\overline{G}\; _ {1} / \overline{G}\;$.

For many random variables, special representations of the form $\xi = \phi ( \alpha _ {1} \dots \alpha _ {n} )$ have been obtained. For example, the random variables

$$\sqrt {- 2 \mathop{\rm ln} \alpha _ {1} } \cdot \cos 2 \pi \alpha _ {2} \ \ \textrm{ and } \ \ \sqrt {- 2 \mathop{\rm ln} \alpha _ {2} } \cdot \sin 2 \pi \alpha _ {2}$$

have standard normal distributions and are independent; the random variable $\mathop{\rm ln} ( \alpha _ {1} \dots \alpha _ {n} )$ has the gamma-distribution with parameter $n$; the random variable $\max ( \alpha _ {1} \dots \alpha _ {n} )$ is distributed with density $n x ^ {n-} 1$, $0 \leq x \leq 1$; the random variable $\mathop{\rm exp} \{ \sum _ {k=} 1 ^ {n} ( \mathop{\rm ln} \alpha ) / ( p + k + 1 ) \}$ has the beta-distribution with parameters $p , n$( see ).

The standard algorithm for simulating a continuous random vector $\\vec{xi} = ( \xi _ {1} \dots \xi _ {n} )$ is to successively choose the values of its components from conditional distributions corresponding to the representation

$$f {} _ {\\vec{xi} } ( x _ {1} \dots x _ {n} ) = \ f _ {1} ( x _ {1} ) f _ {2} ( x _ {2} \mid x _ {1} ) \dots f _ {n} ( x _ {n} \mid x _ {1} \dots x _ {n-} 1 ) .$$

The method of exclusion extends to the multi-dimensional case without change; it is only necessary, in its formulation, to regard $\xi$, $\xi _ {0}$ and $x$ as vectors. A multi-dimensional normal vector can be simulated by using a special linear transformation of a vector of independent standard normal random variables. Special methods have also been developed for the approximate simulation of stationary Gaussian processes (see, for example, , ).

If, in a Monte-Carlo method calculation, random variables defined by a real phenomenon are simulated, then the calculation is a direct simulation (imitation) of this phenomenon. Computer simulations have been worked out for: the processes of transport, scattering and reproduction of particles: neutrons, gamma-quanta, photons, electrons, and others (see, for example, ); simulations of the evolution of ensembles of molecules for the solution of various problems in classical and quantum statistical physics (see, for example, ); simulation of queueing and industrial processes (see, for example , , ); simulation of various random processes in technology, biology; etc. (see ). Simulation algorithms are usually carefully developed; for example, they tabulate complicated functions, modify standard procedures, etc. None-the-less, a direct simulation often cannot provide the necessary accuracy in the estimates of the required variables. Many methods have been developed for increasing the effectiveness of a simulation.

## Monte-Carlo algorithms for estimating multiple integrals.

Suppose it is required to estimate an integral $J = \int h ( x) d x$ with respect to the Lebesgue measure in an $s$- dimensional Euclidean space $X$ and let $f _ \xi ( x)$ be a probability density such that $J$ can be written as:

$$J = \int\limits _ { X } f _ \xi ( x) \frac{h ( x) }{f _ \xi ( x) } \ d x = {\mathsf E} \zeta ,$$

where $\zeta = h ( \xi ) / f _ \xi ( \xi )$. By computer simulation of $\xi$ it is possible to obtain $N$ sample values $x _ {1} \dots x _ {N}$. By the law of large numbers,

$$J \approx J _ {N} = \ \frac{1}{N} \sum _ { k= } 1 ^ { N } \frac{h ( x _ {k} ) }{f ( x _ {k} ) } .$$

Simultaneously it is possible to estimate the mean-square error in $J _ {N}$, that is, the quantity $\sigma _ {N} = ( {\mathsf D} \zeta / N ) ^ {1/2}$, and to approximately construct a suitable confidence interval for $J$. By the choice of the density $f$ it is possible to arrange for estimates with possibly smaller variance. For example, if $0 \leq m _ {1} \leq h / f \leq m _ {2} < + \infty$, then ${\mathsf D} \zeta \leq ( m _ {2} - m _ {1} ) ^ {2} / 4$ and if $f = h / J$, then ${\mathsf D} \zeta = 0$. The corresponding algorithm is called essential sampling (choice by importance). Another common modification — the method of selection of principal part — occurs when a function $h _ {0} \approx h$ is determined whose integral is known. It is sometimes useful to combine Monte-Carlo methods and classical quadratures (cf. Quadrature) in so-called random quadrature formulas, the basic idea of which is that the nodes and coefficients in any quadrature sum (for example, in interpolation) are chosen randomly from a distribution, providing an unbiased estimator of the integral . Particular cases of these formulas are: the so-called method of stratified sampling, in which the nodes are chosen one in each part of a fixed partition of the domain of integration and the coefficients are proportional to the corresponding volumes; and the so-called method of symmetric sampling, which, in the case of integration over $( 0 , 1 )$, is defined by the expression (see )

$$2 J = {\mathsf E} \left [ \frac{h ( \xi ) + h ( 1 - \xi ) }{f _ \xi ( \xi ) } \right ] .$$

Here the order of the speed of convergence of the Monte-Carlo method is increased and, in certain cases, becomes best possible in the class of problems being considered.

In general, the domain of integration is partitioned into parallelopipeds. In each parallelopiped the value of the integral is calculated as the average of the value at a random point and the point symmetric to it relative to the centre of the parallelopiped.

A number of modifications of Monte-Carlo methods are based on the (perhaps formal) representation of the required value as a double integral

$$J = \int\limits _ { X } \int\limits _ { Y } f ( x , y ) h ( x , y ) d x d y = \ {\mathsf E} \zeta ,$$

where $\zeta = h ( \xi , \eta )$ and the vector $( \xi , \eta )$ is distributed with density $f ( x , y )$. It is known that ${\mathsf E} ( \zeta ) = {\mathsf E} {\mathsf E} ( \zeta \mid \xi )$ and that

$$\tag{1 } {\mathsf D} \zeta = {\mathsf D} {\mathsf E} ( \zeta \mid \xi ) + {\mathsf E} {\mathsf D} ( \zeta \mid \xi ) = A _ {1} + A _ {2} ,$$

where ${\mathsf E} ( \zeta \mid \xi )$ is the conditional mathematical expectation and ${\mathsf D} ( \zeta \mid \xi )$ is the conditional variance of $\zeta$ given a fixed value of $\xi$. Formula (1) is widely used in Monte-Carlo methods. In particular, it shows that ${\mathsf D} {\mathsf E} ( \zeta \mid \xi ) < {\mathsf D} \zeta$, that is, analytic averaging over any variable increases the accuracy of the Monte-Carlo method. However, in this connection, the amount of computation may be significantly increased. The computer time necessary to achieve a given accuracy is proportional to $t {\mathsf D} \zeta$, where $t$ is the average time it takes to obtain one value of $\zeta$. The method of splitting is optimal with respect to this criterion. Its simplest version uses the "unbiased" estimator

$$\zeta _ {n} = \ \frac{1}{n} \sum _ { k= } 1 ^ { n } h ( \xi , \eta _ {k} ) ,$$

where $\eta _ {1} \dots \eta _ {n}$ are conditionally-independent and distributed as $\eta$ for a fixed value of $\xi$. Using (1) it is possible to obtain an optimal value

$$n = \left [ \frac{A _ {2} }{A _ {1} } \frac{t _ {1} }{t _ {2} } \right ] ^ {1/2} ,$$

where $t _ {1} , t _ {2}$ are the average computing times corresponding to the samples $\xi , \eta$( see, for example, ).

If the integrand depends on a parameter, it is expedient to use the method of dependent trials, that is, to estimate the integral for various values of the parameter at one and the same random points . An important property of the Monte-Carlo method is the comparatively relatively weak dependence of the mean-square error $\sigma _ {N}$ on the number of measurements; moreover, the order of convergence relative to the number of points $N$ is always one and the same: $N ^ {-} 1/2$. This allows one to estimate (after a preliminary transformation of the problem) integrals of very high, and even infinite, multiplicity. For example, methods have been worked out for the estimation of Wiener integrals .

## Monte-Carlo algorithms for solving integral equations of the second kind.

Suppose it is required to estimate a linear functional $J _ {h} = ( \phi , h )$, where $\phi = K \phi + f$, where the integral operator $K$ with kernel $k ( x ^ \prime , x )$ satisfies a condition providing convergence of the Neumann series: $\| K ^ {n _ {0} } \| < 1$. A Markov chain $\{ x _ {n} \}$ is defined by an initial density $\pi ( x)$ and a transition density $p ( x ^ \prime , x ) = p ( x ^ \prime \rightarrow x )$; the probability of termination of the chain at $x ^ \prime$ is equal to

$$g ( x ^ \prime ) = 1 - \int\limits p ( x ^ \prime , x ) d x .$$

Let $N$ be the random index of the last state. Further, let a functional of the trajectories of the chain with expectation $J _ {h}$ be defined. Most often the so-called collision estimator

$$\xi = \sum _ { n= } 0 ^ { N } Q _ {n} h ( x _ {n} )$$

is used, where

$$Q _ {0} = \ \frac{f ( x _ {0} ) }{\pi ( x _ {0} ) } ,\ \ Q _ {n} = Q _ {n-} 1 \frac{k ( x _ {n-} 1 , x _ {n} ) }{p ( x _ {n-} 1 , x _ {n} ) } .$$

If $p ( x ^ \prime , x ) \neq 0$ for $k ( x ^ \prime , x ) \neq 0$ and $\pi ( x) \neq 0$ for $f ( x) \neq 0$, then under certain additional conditions

$${\mathsf E} \xi = \ \sum _ { n= } 0 ^ \infty ( K ^ {n} f , h ) = \ ( \phi , h ) = \ \int\limits _ { X } \phi ( x) h ( x) d x$$

(see ). The possibility of attaining a small variance in the case of constant sign is ensured by the following result: If

$$\pi ( x) = \ \frac{f ( x) \phi ^ {*} ( x) }{( f , \phi ^ {*} ) } \ \ \textrm{ and } \ \ p ( x ^ \prime , x ) = \ \frac{k ( x ^ \prime , x ) \phi ^ {*} ( x) }{[ K ^ {*} \phi ^ {*} ] ( x ^ \prime ) } ,$$

where $\phi ^ {*} = K ^ {*} \phi ^ {*} + h$, then ${\mathsf D} \xi = 0$ and ${\mathsf E} \xi = J _ {h}$( see ). If a suitable Markov chain is simulated on a computer, statistical estimates of linear functionals in the solution of an integral equation of the second kind can be obtained. This gives the possibility of locally estimating the solution on the basis of the representation $\phi ( x) = ( \phi , h _ {x} ) + f ( x)$, where $h _ {x} ( x ^ \prime ) = k ( x ^ \prime , x )$. In a number of cases, alongside Monte-Carlo methods, number-theoretic methods are applied in order to solve these problems (see ). A Monte-Carlo method for estimating the first eigen value of an integral operator has been realized by an iteration method on the basis of the relation 

$${\mathsf E} [ Q _ {n} h ( x _ {n} ) ] = \ ( K ^ {n} f , h ) .$$

All these results can be almost automatically extended to systems of linear algebraic equations of the form $x + H x = h$( see ).

## Modifications of Monte-Carlo methods in radiative transport theory.

(See .) The density of the average number of particle collisions in a phase space with coordinates $\mathbf r$ and velocities $\omega$ satisfies an integral equation of the second kind; its kernel in the single-velocity case has the form

$$\frac{\sigma _ {s} ( \mathbf r ^ \prime ) g ( \mu ) \mathop{\rm exp} ( - \tau ( \mathbf r ^ \prime , \mathbf r ) ) \sigma ( \mathbf r ) }{\sigma ( \mathbf r ^ \prime ) 2 \pi | \mathbf r ^ \prime - \mathbf r | ^ {2} } \delta \left ( \omega - \frac{\mathbf r - \mathbf r ^ \prime }{| \mathbf r - \mathbf r ^ \prime | } \right ) .$$

Here $\sigma _ {s} ( \mathbf r )$ is the scattering coefficient (section), $\sigma ( \mathbf r )$ is the relaxation coefficient, $g ( \mu )$ is the scattering indicatrix, and $\tau ( \mathbf r ^ \prime , \mathbf r )$ is the optical length of a path from $\mathbf r ^ \prime$ to $\mathbf r$( see , ). For the construction of estimates with small variances one uses, for example, asymptotic solutions of the adjoint transport equation ; the simplest algorithm of this type is the so-called exponential transformation (see , ). A modification of a local estimate of the flow of particles has been developed (see , , , , ). Using a simulation of a Markov chain (for example, the physical transport process in a certain medium) it is possible to simultaneously obtain dependent estimators of functionals for different values of the parameters; by differentiating the "weight" $Q _ {n}$ it is sometimes possible to obtain unbiased estimators of the corresponding derivatives (see , ). This provides an opportunity to use Monte-Carlo methods in solving certain inverse problems . For the solution of problems in transport theory "bifurcation" of the trajectory and analytic averaging is effective . The simulation of trajectories of particles in compound media can sometimes be essentially simplified by the method of a maximal section (see ).

These are constructed on the basis of the corresponding integral relations. For example, the standard five-point difference approximation for the Laplace equation has the form of the formula for the complete mathematical expectation of a symmetric random walk over a grid with absorption at the boundary (see, for example, , ). A continuous analogue of this formula is the relation

$$\tag{2 } u ( P) = \ \frac{\int\limits _ {S ( P) } u ( \mathbf r ( s) ) d s }{\int\limits _ {S ( P) } d s } ,$$

where the integral is taken over the surface of a sphere lying entirely within the given domain and with centre at $P$. Formula (2) and other similar relations provide an opportunity to use isotropic "random walk on spheres" when solving elliptic and parabolic equations (see , ). Monte-Carlo methods are effective, for example, for estimating the solution of multi-dimensional boundary value problems at a point.

A simulation of Markov branching processes allows one to construct estimates of the solution of certain non-linear equations, for example, the Boltzmann equation in the theory of rarefied gases .

How to Cite This Entry:
Monte-Carlo method. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Monte-Carlo_method&oldid=47895
This article was adapted from an original article by G.A. Mikhailov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article