# Statistical experiments, method of

A method of numerical calculation that interprets required unknown values as characteristics of a convenient (related) random phenomenon $\phi$; this phenomenon is simulated numerically, whereafter the required values are estimated using the simulation of observations of $\phi$. As a rule, an unknown $z$ is sought in the form of the mathematical expectation $z = {\mathsf E} Z( \omega )$ of some random variable $Z( \omega )$ on a probability space $( \Omega , {\mathcal A} , {\mathsf P})$ that describes $\phi$, and the independent observations $\phi$ are simulated (see Independence). Then, by the law of large numbers

$$z \approx \zeta _ {N} = \ \frac{Z( \omega ^ {( 1)} ) + \dots + Z( \omega ^ {( N)} ) }{N} .$$

When ${\mathsf E} | Z( \omega ) | ^ {2} < \infty$, the random error of this formula can be roughly estimated in probability using the Chebyshev inequality or, asymptotically, by the central limit theorem

$$\tag{1 } {\mathsf P} \left \{ | z - \zeta _ {N} | < \frac{a \sigma ( Z) }{N ^ {1/2} } \right \} \approx \ \mathop{\rm erf} ( a) ,\ \$$

$$\sigma ^ {2} = {\mathsf E} | Z( \omega ) | ^ {2} - [ {\mathsf E} Z( \omega )] ^ {2} .$$

The mathematical expectation ${\mathsf E} | Z( \omega ) | ^ {2}$ can also be estimated "by simulation" , which enables one to make an a posteriori confidence estimation of the accuracy of the calculation. The random phenomenon is usually simulated by means of a sequence of independent random numbers, uniformly distributed on the interval $\{ {x } : {0 \leq x \leq 1 } \}$ (cf. Uniform distribution). For this purpose a measurable mapping $f$ from the unit hypercube of countable dimension

$$H = \mathop{\times} _ { k= 1} ^ \infty \{ {x _ {k} } : {0 \leq x _ {k} \leq 1 } \} \ \ \left ( \textrm{ with Lebesgue volume } d V = \prod _ { k } dx _ {k} \right )$$

onto $( \Omega , {\mathcal A} , {\mathsf P})$ is used; $\Omega = f( H)$, $P = Vf ^ { - 1 }$, where the function $f$ depends essentially only on coordinates with small indices. The problem thus formally reduces to the calculation of the integral

$$\int\limits _ { H } Z( f( x)) dV$$

using the simplest quadrature formula with equal weights and random abscissae $\mathbf x ^ {( n)}$. It follows from (1) that the amount of calculation needed to achieve the desired accuracy $\epsilon$ of the calculation of $z$ is determined, given a fixed confidence level $\mathop{\rm erf} ( a)$, by the product $N \tau = \epsilon ^ {- 2} a ^ {2} \sigma ^ {2} ( Z) \tau ( Z)$, where $\tau$ is the mathematical expectation of the amount of calculation needed to construct a single realization of $Z( \omega )$; it increases rapidly as $\epsilon$ diminishes. A successful choice of a model with sufficiently small $\sigma ^ {2} \tau$ is therefore of great value. In particular, it may prove more useful in the original integral representation to make a priori an analytic integration over some of the variables $x _ {i}$, change some other variables, break the integration cube down into parts, separate the main part of the integral, use groups of dependent points $\mathbf x ^ {n}$ which give the exact quadrature formula for any desired class of functions, etc. The most advantageous "model" can be chosen by estimating roughly the values of $\sigma ^ {2}$ and $\tau$ in small preliminary numerical experiments. In making a series of calculations, a noticeable higher degree of accuracy can be obtained by appropriate statistical processing of "observations" and by choosing a corresponding program of "experiments" .

A large class of models used in the method of statistical experiments is related to the scheme of random walks. In the simplest case, $B$ is a square matrix of order $m$, $b _ {ij} = r _ {ij} \cdot p _ {ij}$, where $| r _ {ij} | < 1$, $p _ {ij} \geq 0$, $1 \leq i, j \leq m$; $p _ {ij} + \dots + p _ {im} < 1$, $i = 1 \dots m$. Consider a Markov random walk $w = \{ \theta ( 0), \dots \theta ( \nu ) \}$ through $m$ states $\theta _ {1} \dots \theta _ {m}$, with transition probabilities $p _ {ij}$ from $\theta _ {i}$ to $\theta _ {j}$, up to the transition at a random $( \nu + 1 )$-st step to an extra absorbing state $\theta _ {0}$, with absorption probability $p _ {i0} = 1 - ( p _ {i1} + \dots + p _ {im} )$, $p _ {00} = 1$. Under the assumption that the moving particle changes its weight according to the rule $\rho _ {k} = \rho _ {k-} 1 r _ {ij}$, if the $k$-th random transition was from $\theta _ {i}$ to $\theta _ {j}$, $\rho _ {0} = 1$, the solution of the equation $( \mathbf 1 - B) y = g$ using a Neumann series can be interpreted coordinatewise as

$$\tag{2 } y _ {e} = g _ {e} + ( Bg) _ {e} + ( B ^ {2} g) _ {e} + \dots =$$

$$= \ {\mathsf E} [ g( \theta ( 0)) + \rho _ {1} g ( \theta ( 1)) + \dots + \rho _ \nu g( \theta ( \nu ))],$$

where $\theta ( 0) = \theta _ {e}$, $g( \theta _ {s} ) = g _ {s}$, $s = 1 \dots m$. Every "trajectory" $\omega ^ {( k)}$ is simulated by its sequence of random numbers $x _ {n} ^ {( k)}$; the transition to $\theta _ {j}$ is completed at the $n$-th step from $\theta ( n- 1) = \theta _ {i}$ when $p _ {i0} + \dots + p _ {i,j-} 1 \leq x _ {n} ^ {( j)} \leq p _ {i0} + \dots + p _ {ij}$. The amount of work involved in constructing the trajectory and calculating the functional from it is proportional to its "length" $\nu$; in this scheme ${\mathsf E} \nu < \infty$.

When simulating random walks in continuous time, the motion must be made discrete. Suppose it is necessary to calculate the fraction $b$ of radiation emanating from a sphere of radius $R$, at the centre of which a source is situated. The motion of the radiated particles is rectilinear; on a path $ds$ with probability $a ds$ a particle interacts with the medium, so that it is absorbed with probability $1- q$, and is spherically-symmetrically dispersed with probability $q$. The problem is solved by simulating the "particle" trajectories corresponding to the given stochastic differential description of the motion. Instead of breaking down the approximate path of the particle into steps $\Delta s$ and testing at each step whether interaction has taken place, it is possible, by means of the exponential distribution with density $p( s) = a \mathop{\rm exp} (- as)$, $s\geq 0$, to generate the length of an $n$-th random run $s _ {n}$ by means of a single random number, and find the next point of interaction $\mathbf r _ {n}$. Moreover, it is possible not to perform a type of interaction with the medium, but to allow for absorption by a weight factor according to the rule $\rho _ {n} = q \rho _ {n-} 1$. The polar and azimuthal angles $( \theta , \phi )$ of the new direction of the motion are then looked for; $\cos \theta$ is distributed with uniform probability on the interval $[- 1, 1]$, and $\phi$ is distributed with uniform probability on the semi-interval $[ 0, 2 \pi )$. They define the unit vector $\mathbf e _ {n}$ of the new direction of the motion. The simulation continues until the "particle" leaves the sphere, i.e. until the first event $s _ {n} \geq l _ {n}$, where $l _ {n}$ is the length of the path up to the boundary of the sphere, $| \mathbf r _ {n} + l _ {n} \mathbf e _ {n} | = R$. The average weight of the "particles" that have left the sphere provides an estimate of $b$. The integral expression obtained for the required quantity (which also follows from the integral transport equation) can be transformed into an integral along those trajectories $\omega$ that do not leave the sphere. The run $s _ {n}$ must then be performed according to the conditional distribution with density $a[ 1- \mathop{\rm exp} (- al _ {n} )] ^ {- 1} \mathop{\rm exp} (- as)$; the new weight is defined by the rule $\rho _ {n+} 1 = q \rho _ {n} [ 1- \mathop{\rm exp} (- al _ {n} )]$, and on every trajectory the functional $\beta = \sum \rho _ {n} \mathop{\rm exp} (- al _ {n} )$ is calculated. Then $b = {\mathsf E} \beta$, where $\beta ( \omega ( \mathbf x ))$ is a continuous function within $H$. In this model, the trajectories are infinite, but the contribution of the later segments (those with a high number, if the segments are numbered beginning with the first one starting at the origin) is small; their simulation can therefore be stopped as soon as $\rho _ {n} \leq \delta$ by introducing a small systematic error into $b$. The described scheme gives quite good results when $aR \sim 1$. However, for large $R$ its use may lead to false conclusions. When $aR \gg 1$, departure from the sphere is rare, and is generally only achieved by trajectories all segments of which are long "on the average". If $N$ is not sufficiently large, then it is highly probable that these a-typical trajectories with a relatively large value of $\beta$ will not occur, and this may lead to underrated (though not to zero) estimates both of the required average $b$ and the variance ${\mathsf D} \beta$, i.e. an a posteriori measure of the error. Accuracy can be increased here, if an exponential transformation is used, by simulating trajectories by means of the exponential distribution with increased mean run and by compensating this by an extra exponential factor in the weight.

It follows from formula (2) that by solving a system of linear equations via the method of statistical experiments, it is possible to find approximately only one unknown variable without calculating the others. This important property justifies the use of the method of statistical experiments, in spite of its slow convergence, for example, in solving boundary value problems for elliptic differential equations of the second order, when a solution has to be found at only one given point. In particular, for the Laplace equation the solution is written in the form of an integral over Wiener trajectories, i.e. the trajectories of a Brownian motion. The solution of certain boundary value problems for the meta-harmonic (including biharmonic) equations can be written in the form of integrals over the space of random trajectories of a Brownian particle with matrix weight. The simulation of the Brownian trajectories themselves, which undergo an infinitely large number of collisions for any interval of time, can be constructed in large sections by explicit specific methods.

In solving non-linear equations by the method of statistical experiments, more complex models are used of flows of many particles that interact stochastically with the medium and with each other, including cascades of multiplying particles.

Apart from its slow convergence, this method has other shortcomings, including the inadequate reliability of the a posteriori estimation (1) of the random error. It can become less reliable as a result of both "poor quality" (e.g., correlation) of the random numbers used and "non-typicality" (e.g., low probability) of the results of $\omega$ making a main contribution to the integral.

Another name for the method of statistical experiments — the Monte-Carlo method — relates largely to the theory of modifying the method of statistical experiments.

For references, see Monte-Carlo method.