# Limit theorems

in probability theory

2010 Mathematics Subject Classification: Primary: 60Fxx

A general name for a number of theorems in probability theory that give conditions for the appearance of some regularity as the result of the action of a large number of random sources. The first limit theorems, established by J. Bernoulli (1713) and P. Laplace (1812), are related to the distribution of the deviation of the frequency $\mu _ {n} /n$ of appearance of some event $E$ in $n$ independent trials from its probability $p$, $0 < p < 1$ (exact statements can be found in the articles Bernoulli theorem; Laplace theorem). S. Poisson (1837) generalized these theorems to the case when the probability $p _ {k}$ of appearance of $E$ in the $k$-th trial depends on $k$, by writing down the limiting behaviour, as $n \rightarrow \infty$, of the distribution of the deviation of $\mu _ {n} /n$ from the arithmetic mean $\overline{p}\; = ( \sum _ {k = 1 } ^ {n} p _ {k} )/n$ of the probabilities $p _ {k}$, $1 \leq k \leq n$ (cf. Poisson theorem). If $X _ {k}$ denotes the random variable that takes the value 1 if $E$ appears in the $k$-th trial and the value 0 when the opposite event appears, then $\mu _ {n}$ can be expressed as the sum

$$\mu _ {n} = \ X _ {1} + \dots + X _ {n} ,$$

which makes it possible to regard the theorems mentioned above as particular cases of two more general statements related to sums of independent random variables — the law of large numbers and the central limit theorem (these are given in their classical forms below).
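As an illustration of Poisson's setting, the following minimal sketch computes the exact distribution of $\mu _ {n}$ for independent trials with varying $p _ {k}$ and evaluates the probability that $\mu _ {n} /n$ deviates from $\overline{p}\;$ by more than $\epsilon$. The alternating scheme $p _ {k} \in \{ 0.7, 0.3 \}$, the value $\epsilon = 1/30$ and all function names are assumptions chosen for the example, not part of the classical statements.

```python
def success_count_distribution(probs):
    """Exact distribution of mu_n = X_1 + ... + X_n for independent 0/1 trials.

    probs[k-1] is p_k, the probability that E appears in the k-th trial.
    Returns a list dist with dist[j] = P(mu_n = j), j = 0..n.
    """
    dist = [1.0]  # before any trial, mu_0 = 0 with probability 1
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            new[j] += mass * (1.0 - p)  # E does not appear in this trial
            new[j + 1] += mass * p      # E appears in this trial
        dist = new
    return dist


def deviation_probability(probs, eps):
    """P(|mu_n/n - p_bar| > eps), with p_bar the arithmetic mean of the p_k."""
    n = len(probs)
    p_bar = sum(probs) / n
    dist = success_count_distribution(probs)
    return sum(mass for j, mass in enumerate(dist) if abs(j / n - p_bar) > eps)


def alternating_probs(n):
    """Hypothetical Poisson scheme: p_k = 0.7 for odd k, 0.3 for even k."""
    return [0.7 if k % 2 == 1 else 0.3 for k in range(1, n + 1)]


if __name__ == "__main__":
    eps = 1 / 30  # chosen so that n * eps is not an integer for the n below
    for n in (100, 400, 1600):
        print(n, deviation_probability(alternating_probs(n), eps))
```

The printed probabilities should decrease as $n$ grows, in line with the law of large numbers stated below.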

## The law of large numbers.

Let

$$\tag{1 } X _ {1} , X _ {2} , \dots$$

be a sequence of independent random variables, let $s _ {n}$ be the sum of the first $n$ elements of this sequence,

$$\tag{2 } s _ {n} = \ X _ {1} + \dots + X _ {n} ,$$

let $A _ {n}$ and $B _ {n} ^ {2}$ be, respectively, the mathematical expectation,

$$A _ {n} = \ {\mathsf E} s _ {n} = \ {\mathsf E} X _ {1} + \dots + {\mathsf E} X _ {n} ,$$

and variance (cf. Dispersion),

$$B _ {n} ^ {2} = \ {\mathsf D} s _ {n} = \ {\mathsf D} X _ {1} + \dots + {\mathsf D} X _ {n} ,$$

of the sum $s _ {n}$. One says that the sequence (1) is subject to the law of large numbers if, for any $\epsilon > 0$, the probability of the inequality

$$\left | { \frac{s _ {n} }{n} } - { \frac{A _ {n} }{n} } \ \right | > \epsilon$$

tends to zero as $n \rightarrow \infty$.

Very general conditions for the law of large numbers to be applicable were found first by P.L. Chebyshev (1867) and were later generalized by A.A. Markov (1906). The problem of necessary and sufficient conditions for the law of large numbers to be applicable was exhaustively treated by A.N. Kolmogorov (1928). If all random variables have the same distribution function, then these conditions reduce to one: the $X _ {n}$ must have finite mathematical expectation (this was shown by A.Ya. Khinchin in 1929).
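The role of the finite expectation in Khinchin's condition can be seen in a small simulation. The sketch below estimates the probability of the inequality $| s _ {n} /n - A _ {n} /n | > \epsilon$ by Monte Carlo for two choices of identically distributed summands: exponential ones (expectation 1) and standard Cauchy ones (no finite expectation). For the Cauchy case $A _ {n}$ does not exist, so the centre 0 is used instead; this centring, the sample sizes and the function names are assumptions made for the illustration.

```python
import math
import random


def exceedance_frequency(sample, center, n, eps, trials, rng):
    """Fraction of trials in which |s_n/n - center| > eps.

    sample(rng) draws one summand X_k; center plays the role of A_n/n.
    """
    count = 0
    for _ in range(trials):
        s_n = sum(sample(rng) for _ in range(n))
        if abs(s_n / n - center) > eps:
            count += 1
    return count / trials


def exponential(rng):
    """Exponential summand with E X = 1 (finite expectation)."""
    return rng.expovariate(1.0)


def cauchy(rng):
    """Standard Cauchy summand via the inverse distribution function."""
    return math.tan(math.pi * (rng.random() - 0.5))


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 100, 1000):
        f_exp = exceedance_frequency(exponential, 1.0, n, 0.1, 500, rng)
        f_cauchy = exceedance_frequency(cauchy, 0.0, n, 0.1, 500, rng)
        print(n, f_exp, f_cauchy)
```

For the exponential summands the estimated frequency should shrink towards zero as $n$ grows, whereas for the Cauchy summands it should stay roughly constant, since the arithmetic mean of $n$ standard Cauchy variables is again standard Cauchy.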

## The central limit theorem.

One says that the central limit theorem holds for a sequence (1) if for arbitrary $z _ {1}$ and $z _ {2}$ the probability of the inequality

$$z _ {1} B _ {n} < \ s _ {n} - A _ {n} < \ z _ {2} B _ {n}$$

has as limit, as $n \rightarrow \infty$, the quantity

$$\Phi ( z _ {2} ) - \Phi ( z _ {1} ),$$

where

$$\Phi ( z) = \ { \frac{1}{\sqrt {2 \pi } } } \int\limits _ {- \infty } ^ { z } e ^ {- x ^ {2} /2 } dx$$

(cf. Normal distribution). Rather general sufficient conditions for the central limit theorem to hold were indicated by Chebyshev (1887); however, his proofs contained gaps, which were filled in somewhat later by Markov (1898). A solution of the problem which is nearly final was obtained by A.M. Lyapunov (1901). The exact formulation of Lyapunov's theorem is: Suppose

$$c _ {k} = {\mathsf E} | X _ {k} - {\mathsf E} X _ {k} | ^ {2 + \delta } ,\ \ \delta > 0,$$

$$C _ {n} = c _ {1} + \dots + c _ {n} .$$

If the ratio $L _ {n} = C _ {n} /B _ {n} ^ {2 + \delta }$ tends to zero as $n \rightarrow \infty$, then the central limit theorem holds for (1). The final solution to the problem of conditions of applicability of the central limit theorem was obtained, in general outline, by S.N. Bernstein [S.N. Bernshtein] (1926) and was completed by W. Feller (1935).

Under the conditions of the central limit theorem the relative accuracy of approximation of the probability of an inequality of the form $s _ {n} - A _ {n} > z _ {n} B _ {n}$, where $z _ {n}$ grows unboundedly as $n$ tends to infinity, by $1 - \Phi ( z _ {n} )$ can be very low. Correction factors necessary in order to increase the accuracy are indicated in limit theorems for probabilities of large deviations (cf. Probability of large deviations; Cramér theorem). This question was studied, following H. Cramér and Feller, by Yu.V. Linnik and others. Typical results related to this branch of the subject are most conveniently explained using the example of the sums (2) of independent identically-distributed random variables $X _ {1} , X _ {2} , \dots$ with ${\mathsf E} X _ {j} = 0$ and ${\mathsf D} X _ {j} = 1$. In this case $A _ {n} = 0$, $B _ {n} = \sqrt n$.

Consider, e.g., the probability of the inequality

$$s _ {n} \geq \ z _ {n} \sqrt n ,$$

which equals $1 - F _ {n} ( z _ {n} )$, where $F _ {n} ( z _ {n} )$ is the distribution function of the variable $s _ {n} / \sqrt n$, and for fixed $z _ {n} = z$ as $n \rightarrow \infty$,

$$\tag{3 } 1 - F _ {n} ( z) \rightarrow \ 1 - \Phi ( z).$$

If $z _ {n}$ depends on $n$ and is moreover such that $z _ {n} \rightarrow \infty$ as $n \rightarrow \infty$, then

$$1 - F _ {n} ( z _ {n} ) \rightarrow 0 \ \ \textrm{ and } \ \ 1 - \Phi ( z _ {n} ) \rightarrow 0$$

and formula (3) is useless. It is necessary to obtain bounds on the relative accuracy of approximation, i.e. for the ratio of $1 - F _ {n} ( z _ {n} )$ to $1 - \Phi ( z _ {n} )$. In particular, there naturally arises the question of conditions under which

$$\tag{4 } \frac{1 - F _ {n} ( z _ {n} ) }{1 - \Phi ( z _ {n} ) } \rightarrow 1$$

as $z _ {n} \rightarrow \infty$.
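The difference between the absolute accuracy in (3) and the relative accuracy in (4) can be checked on a case where $1 - F _ {n}$ is computable exactly. The sketch below assumes summands taking the values $\pm 1$ with probability $1/2$ each (so that ${\mathsf E} X _ {j} = 0$, ${\mathsf D} X _ {j} = 1$); this choice of distribution and the function names are illustrative assumptions.

```python
import math


def normal_tail(z):
    """1 - Phi(z) for the standard normal distribution function Phi."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def exact_tail(n, z):
    """P(s_n >= z*sqrt(n)) for s_n a sum of n independent +/-1 summands.

    s_n = 2K - n with K binomial(n, 1/2), so the event is K >= (n + z*sqrt(n))/2.
    Integer arithmetic keeps the binomial sum exact until the final division.
    """
    k_min = max(0, math.ceil((n + z * math.sqrt(n)) / 2.0))
    favourable = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return favourable / 2 ** n


def relative_accuracy(n, z):
    """The ratio on the left-hand side of (4)."""
    return exact_tail(n, z) / normal_tail(z)


if __name__ == "__main__":
    n = 400
    for z in (1.0, 2.0, 4.0, 8.0):
        print(z, exact_tail(n, z), normal_tail(z), relative_accuracy(n, z))
```

For moderate $z$ both the difference and the ratio are close to their ideal values, while for $z$ comparable to $\sqrt n$ the two tails are both tiny but their ratio moves noticeably away from 1.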

Relation (4) holds for $z _ {n}$ of arbitrary growth only if the summands have a normal distribution (this conclusion is valid as soon as $z _ {n}$ is of order greater than $\sqrt n$). If the summands are not normal, then relation (4) can hold in certain zones, whose orders do not exceed $\sqrt n$. "Smaller" zones (of logarithmic order) are obtained under the condition that a certain number of moments is finite. Moreover, under definite "regularity" conditions on the densities of the summands the "normality" asymptotics imply power asymptotics. E.g., if the density of the summands is

$${ \frac{2} \pi } { \frac{1}{( 1 + z ^ {2} ) ^ {2} } } ,$$

then, uniformly in $z$, as $n \rightarrow \infty$,

$${\mathsf P} \left \{ \frac{s _ {n} }{\sqrt n } \geq z \right \} \sim \ 1 - \Phi ( z) + { \frac{2}{3 \pi } } { \frac{1}{\sqrt n } } { \frac{1}{z ^ {3} } } .$$

Taking into account that as $z \rightarrow \infty$,

$$1 - \Phi ( z) \sim \ { \frac{1}{\sqrt {2 \pi } z } } e ^ {- z ^ {2} /2 } ,$$

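it is easy to convince oneself that (4) holds for $z = z _ {n} \leq \sqrt {\mathop{\rm ln} n }$, since in this zone the power term is $O ( 1/ \mathop{\rm ln} n )$ relative to $1 - \Phi ( z _ {n} )$. The extension to zones of power order (of the form $n ^ \alpha$, $\alpha < 1/2$) requires that the condition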

$$\tag{5 } {\mathsf E} \mathop{\rm exp} \left \{ | X _ {j} | ^ {4 \alpha / ( 2 \alpha + 1 ) } \right \} < \infty$$

should hold, as well as that a certain number (depending on $\alpha$) of moments of $X _ {j}$ coincide with corresponding moments of the normal distribution. If the latter condition (on the coincidence of moments) is not satisfied, then the ratio at the left-hand side of (4) can be described by a Cramér series (under the so-called Cramér condition, cf. Cramér theorem) or initial partial sums of it (under a condition of the type (5)).
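For the example with density $2/ ( \pi ( 1 + z ^ {2} ) ^ {2} )$ considered above, the competition between the normal term $1 - \Phi ( z)$ and the power term $2/ ( 3 \pi \sqrt n z ^ {3} )$ can be tabulated directly. The following sketch does this; it also evaluates the tail of the density in closed form, whose leading order $2/ ( 3 \pi x ^ {3} )$, taken at $x = z \sqrt n$ and multiplied by $n$, reproduces the power term. The value $n = 10 ^ {6}$ and the function names are assumptions made for the illustration.

```python
import math


def density_tail(x):
    """P(X > x) for the density (2/pi) / (1 + x^2)^2, in closed form."""
    return (math.pi / 2.0 - math.atan(x) - x / (1.0 + x * x)) / math.pi


def normal_term(z):
    """1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def power_term(n, z):
    """The correction 2 / (3 pi sqrt(n) z^3) from the asymptotic formula."""
    return 2.0 / (3.0 * math.pi * math.sqrt(n) * z ** 3)


if __name__ == "__main__":
    n = 10 ** 6
    print("sqrt(ln n) =", math.sqrt(math.log(n)))
    for z in (2.0, 3.0, 4.0, 5.0, 6.0):
        ratio = power_term(n, z) / normal_term(z)
        print(z, normal_term(z), power_term(n, z), ratio)
```

In this table the ratio of the power term to the normal term should be small for $z$ up to about $\sqrt { \mathop{\rm ln} n }$ and large well beyond it, which is the sense in which the zone of normal convergence here is only of logarithmic order.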

Estimates of probabilities of large deviations are used in mathematical statistics, statistical physics, etc.

The following may be distinguished among the other directions of research in the domain of limit theorems.

1) Research, initiated by Markov and continued by Bernstein and others, on conditions under which the law of large numbers and the central limit theorem hold for sums of dependent random variables.

2) Even in the case of sequences of identically-distributed random variables one can exhibit simple examples in which the "normalized" (i.e. subjected to a certain linear transformation) sums $( s _ {n} - a _ {n} ) / b _ {n}$, where $a _ {n}$ and $b _ {n} > 0$ are constants, have a non-degenerate limit distribution (i.e. one not concentrated at a single point) different from a normal one (cf. Stable distribution). In the work of Khinchin, B.V. Gnedenko, P. Lévy, W. Doeblin, and others, both the class of possible limit distributions for sums of independent random variables and the conditions for convergence of the distributions of sums to some limit distribution have been completely studied; in triangular arrays (cf. Triangular array) of random variables one imposes here the condition of asymptotic negligibility of the summands. (Cf. Infinitely-divisible distribution; Stochastic process with independent increments.)

3) Local limit theorems have been given considerable attention. E.g., suppose that the random variables $X _ {n}$ take only integer values. Then the sums $s _ {n}$ also take integer values only, and it is natural to pose the question of the limiting behaviour of the probabilities $P _ {n} ( m)$ that $s _ {n} = m$, where $m$ is an integer. The simplest example of a local limit theorem is the local Laplace theorem. Another type of local limit theorem describes the limiting distribution of the densities of the distributions of sums.

4) Limit theorems in their classical formulation describe the behaviour of individual sums $s _ {n}$ as $n$ grows. Sufficiently general limit theorems for probabilities of events depending on several sums at once were first obtained by Kolmogorov (1931). His results imply, e.g., that under fairly general conditions the probability of the inequality

$$\max _ {1 \leq k \leq n } \ | s _ {k} | < zB _ {n}$$

has as limit the quantity

$$\frac{4}{\pi } \sum _ {k = 0 } ^ \infty (- 1) ^ {k} \frac{1}{2k + 1 } \mathop{\rm exp} \left \{ - \frac{( 2k + 1) ^ {2} \pi ^ {2} }{8 z ^ {2} } \right \} ,\ \ z > 0.$$

The most general method for proving analogous limit theorems is passage to the limit from discrete to continuous processes.

5) The limit theorems given above are related to sums of random variables. An example of a limit theorem of a different kind is given by limit theorems for order statistics. These theorems have been studied in detail by Gnedenko, N.V. Smirnov and others.

6) Finally, theorems establishing properties of sequences of random variables occurring with probability one are called strong limit theorems. (Cf. Strong law of large numbers; Law of the iterated logarithm.)
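Kolmogorov's limit for $\max _ {1 \leq k \leq n } | s _ {k} |$ in item 4) can be compared with a simulation. The sketch below evaluates the series with the exponent taken as $- ( 2k + 1) ^ {2} \pi ^ {2} / ( 8 z ^ {2} )$, the form for which the limit tends to 1 as $z \rightarrow \infty$, and estimates the probability for a walk with $\pm 1$ steps, for which $B _ {n} = \sqrt n$. The step distribution, the truncation at 200 terms and the function names are assumptions made for the illustration.

```python
import math
import random


def kolmogorov_limit(z, terms=200):
    """(4/pi) * sum_k (-1)^k / (2k+1) * exp(-(2k+1)^2 pi^2 / (8 z^2)), z > 0."""
    total = 0.0
    for k in range(terms):
        odd = 2 * k + 1
        total += (-1) ** k / odd * math.exp(-(odd * math.pi) ** 2 / (8.0 * z * z))
    return 4.0 / math.pi * total


def simulated_probability(n, z, trials, rng):
    """Monte Carlo estimate of P(max_{k<=n} |s_k| < z*sqrt(n)) for +/-1 steps."""
    bound = z * math.sqrt(n)
    inside = 0
    for _ in range(trials):
        s = 0
        stayed = True
        for _ in range(n):
            s += 1 if rng.random() < 0.5 else -1
            if abs(s) >= bound:  # the path has left the strip
                stayed = False
                break
        if stayed:
            inside += 1
    return inside / trials


if __name__ == "__main__":
    rng = random.Random(1)
    for z in (0.5, 1.0, 2.0):
        print(z, kolmogorov_limit(z), simulated_probability(400, z, 2000, rng))
```

The simulated frequencies should agree with the series up to Monte Carlo error and a small discretization effect from the finite $n$.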

For methods of proof of limit theorems see Characteristic function; Distributions, convergence of.