Central limit theorem

From Encyclopedia of Mathematics

{{TEX|done}}

{{MSC|60F05}}

A common name for a number of limit theorems in probability theory stating conditions under which sums or other functions of a large number of independent or weakly-dependent random variables have a probability distribution close to the normal distribution.

The classical version of the central limit theorem is concerned with a sequence

$$ \tag{1 }
X _ {1} \dots X _ {n} \dots
$$

of independent random variables having finite (mathematical) expectations $ {\mathsf E} X _ {k} = a _ {k} $, and finite variances $ {\mathsf D} X _ {k} = b _ {k} $, and with the sums

$$ \tag{2 }
S _ {n}  =  X _ {1} + \dots + X _ {n} .
$$

Let $ A _ {n} = {\mathsf E} S _ {n} = a _ {1} + \dots + a _ {n} $ and $ B _ {n} = {\mathsf D} S _ {n} = b _ {1} + \dots + b _ {n} $. The distribution functions

$$
F _ {n} ( x)  =  {\mathsf P} \{ Z _ {n} < x \} ,
$$

of the "normalized" sums

$$ \tag{3 }
Z _ {n}  =  \frac{S _ {n} - A _ {n} }{\sqrt {B _ {n} } } ,
$$

which have expectation 0 and variance 1, are compared with the "standard" normal distribution function

$$
\Phi ( x)  =  \frac{1}{\sqrt {2 \pi } } \int\limits _ {- \infty } ^ { x } e ^ {- z  ^ {2} /2 }  dz
$$

corresponding to the normal distribution with expectation 0 and variance 1. In this case the central limit theorem asserts that under certain conditions, as $ n \rightarrow \infty $, for any $ x \in \mathbf R $,

$$
F _ {n} ( x)  \rightarrow  \Phi ( x),
$$

or, what is the same, for any interval $ ( \alpha , \beta ) $:

$$
{\mathsf P} \{ \alpha < Z _ {n} < \beta \}  =  {\mathsf P} \{ A _ {n} + \alpha \sqrt {B _ {n} } < S _ {n} < A _ {n} + \beta \sqrt {B _ {n} } \}  \rightarrow  \Phi ( \beta ) - \Phi ( \alpha )
$$

(see [[Laplace theorem|Laplace theorem]]; [[Lyapunov theorem|Lyapunov theorem]]).
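
The convergence $ F _ {n} ( x) \rightarrow \Phi ( x) $ can be observed directly in a small simulation. The following sketch is only a hypothetical illustration (the choice of i.i.d. Exp(1) terms, for which $ a _ {k} = b _ {k} = 1 $, and the use of NumPy/SciPy are assumptions made for the example); it estimates $ \sup _ {x} | F _ {n} ( x) - \Phi ( x) | $ for several values of $ n $.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def normalized_sums(n, n_samples=20000):
    """Simulate Z_n = (S_n - A_n)/sqrt(B_n) for i.i.d. Exp(1) terms (a_k = b_k = 1)."""
    s = rng.exponential(scale=1.0, size=(n_samples, n)).sum(axis=1)
    return (s - n) / np.sqrt(n)          # A_n = n, B_n = n

grid = np.linspace(-4.0, 4.0, 801)
for n in (1, 5, 30, 200):
    z = np.sort(normalized_sums(n))
    f_n = np.searchsorted(z, grid, side="right") / z.size   # empirical d.f. of Z_n
    print(n, np.max(np.abs(f_n - norm.cdf(grid))))          # sup-distance to Phi
</syntaxhighlight>

The printed distances decrease as $ n $ grows, in agreement with the theorem.
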
  
A clearer understanding of conditions for the emergence of a normal distribution as the limit of distributions of sums of independent random variables comes about by considering a [[Triangular array|triangular array]] of random variables instead of a sequence (see {{Cite|GK}}). In this case one considers for every $ n = 1, 2 \dots $ a sequence of variables

$$
X _ {n,1} \dots X _ {n,n} ,
$$

putting

$$
X _ {n,k}  =  \frac{X _ {k} - a _ {k} }{\sqrt {B _ {n} } } ,\ \ 1 \leq  k \leq  n.
$$

Then the random variables inside each sequence (row) are independent, and

$$
Z _ {n}  =  X _ {n,1} + \dots + X _ {n,n} .
$$

The usual conditions for applicability of the central limit theorem (such as Lyapunov's condition or the condition of the [[Lindeberg–Feller theorem|Lindeberg–Feller theorem]]) imply that $ X _ {n,k} $ is asymptotically negligible. For example, from Lyapunov's condition with third moments, that is, from the condition that as $ n \rightarrow \infty $,

$$ \tag{4 }
L _ {n}  =  \frac{1}{B _ {n}  ^ {3/2} } \sum _ {k = 1 } ^ { n } {\mathsf E} | X _ {k} - a _ {k} |  ^ {3}  \rightarrow  0 ,
$$
  
for any $ \epsilon > 0 $ the inequality

$$
\max _ {1 \leq  k \leq  n }  {\mathsf P} \{ | X _ {n,k} | > \epsilon \}  =  \max _ {1 \leq  k \leq  n }  {\mathsf P} \{ | X _ {k} - a _ {k} | > \epsilon \sqrt {B _ {n} } \}  \leq  \max _ {1 \leq  k \leq  n }  \frac{1}{\epsilon  ^ {3} B _ {n}  ^ {3/2} } {\mathsf E} | X _ {k} - a _ {k} |  ^ {3}  \leq  \frac{L _ {n} }{\epsilon  ^ {3} }  \rightarrow  0
$$

follows as $ n \rightarrow \infty $, and the fact that the quantities at the left-hand side of this chain of inequalities tend to zero indicates the asymptotic negligibility of the random variables forming the array.
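
For identically distributed terms the behaviour of $ L _ {n} $ in (4) is easy to tabulate. The sketch below is a hypothetical illustration (the i.i.d. Exp(1) terms, for which $ a _ {k} = b _ {k} = 1 $ and $ B _ {n} = n $, are an assumption made for the example); it shows that $ L _ {n} $, and with it the Markov-type bound on $ \max _ {k} {\mathsf P} \{ | X _ {n,k} | > \epsilon \} $ from the chain above, decays like $ n ^ {-1/2} $.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)

# E|X_k - a_k|^3 for Exp(1) terms, estimated by Monte Carlo
m3 = np.mean(np.abs(rng.exponential(1.0, size=1_000_000) - 1.0) ** 3)

eps = 0.1
for n in (10, 100, 1000, 10000):
    b_n = float(n)                        # B_n = sum of the unit variances
    l_n = n * m3 / b_n ** 1.5             # L_n of (4): here m3 / sqrt(n)
    bound = m3 / (eps ** 3 * b_n ** 1.5)  # bound on max_k P{|X_{n,k}| > eps}
    print(n, l_n, bound)
</syntaxhighlight>
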
  
 
Suppose now that

$$ \tag{5 }
X _ {n,1} \dots X _ {n,k _ {n}  } ,
$$

$ n = 1, 2 \dots $ is an arbitrary triangular array of asymptotically-negligible random variables that are independent within each sequence. If the limit distribution for the sums $ Z _ {n} = X _ {n,1} + \dots + X _ {n,k _ {n}  } $ exists and is non-degenerate, then it is normal if and only if, as $ n \rightarrow \infty $, for any $ \epsilon > 0 $,

$$ \tag{6 }
{\mathsf P} \left \{ \max _ {1 \leq  k \leq  k _ {n} } | X _ {n,k} | > \epsilon \right \}  \rightarrow  0,
$$

that is, if the maximal term in $ Z _ {n} $ becomes vanishingly small in comparison with the whole sum. (Without condition (6) one can only assert that the limit law for $ Z _ {n} $ belongs to the class of infinitely-divisible distributions, cf. [[Infinitely-divisible distribution|Infinitely-divisible distribution]].) Two additional conditions that together with (6) are necessary and sufficient for the convergence of the distributions of the sums $ Z _ {n} $ to a limit can be found in the article [[Triangular array|Triangular array]].
  
When the condition of asymptotic negligibility of the variables in the triangular array considered above does not hold, the situation becomes complicated. The well-known theorem of H. Cramér that the sum of several independent random variables is normally distributed if and only if each of the summands is, makes it possible to assume (as P. Lévy did, see {{Cite|Le}}, Chapt. 5, Theor. 38) that the sum of independent random variables has a distribution close to normal if the "large" terms are almost normal and if the collection of "small" terms is subject to the condition of "normality" of the distributions of the sums of the asymptotically-negligible terms. A precise form of an argument of this kind was first obtained for the triangular array (5) with $ {\mathsf E} X _ {n,k} = 0 $, $ \sum _ {k = 1 }  ^ {k _ {n} } {\mathsf D} X _ {n,k} = 1 $ (see {{Cite|Z}}). Here, for the convergence of the distribution functions $ F _ {n} ( x) = {\mathsf P} \{ Z _ {n} < x \} $ to the normal distribution function $ \Phi ( x) $ it is necessary and sufficient that the following two conditions hold simultaneously:

1) as $ n \rightarrow \infty $,

$$
\alpha _ {n}  =  \max _ {1 \leq  k \leq  k _ {n} } L ( F _ {n,k} , \Phi _ {n,k} )  \rightarrow  0,
$$

where $ L ( F _ {n,k} , \Phi _ {n,k} ) $ is the Lévy distance (see [[Lévy metric|Lévy metric]]) between the distribution function $ F _ {n,k} ( x) $ of the random variable $ X _ {n,k} $ and the normal distribution function $ \Phi _ {n,k} ( x) $ with the same expectation and variance as $ F _ {n,k} ( x) $; and

2) for any $ \epsilon > 0 $, as $ n \rightarrow \infty $,

$$
\Delta _ {n} ( \epsilon )  =  \sum _ {k = 1 } ^ { k _ {n} } \int\limits _ {| x | > \epsilon } x  ^ {2}  dF _ {n,k} ( x)  \rightarrow  0,
$$

where the sum is over those $ k $, $ 1 \leq  k \leq  k _ {n} $, for which $ {\mathsf D} X _ {n,k} < \sqrt {\alpha _ {n} } $.

This form of the statement is quite close to the one originally proposed by Lévy. Other formulations are possible (see, for example, {{Cite|R}}), which in a certain sense are more reminiscent of the Lindeberg–Feller theorem.
  
 
Nowadays this form of the central limit theorem can be obtained as a special case of a more general summation theorem on a triangular array without the condition of asymptotic negligibility.
  
In practical respects it is important to have an idea of the rate of convergence of the distributions of the sums to the normal distribution. For this purpose there are inequalities and asymptotic expansions (and also the theory of [[Probability of large deviations|probability of large deviations]]; see also [[Cramér theorem|Cramér theorem]]; [[Limit theorems|Limit theorems]]). In what follows, for simplicity of the exposition, a triangular array is considered, and the variables participating in (1) are assumed to be identically distributed. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c021/c021180/c02118053.png" />. A typical example of inequalities for the deviation of the distribution functions <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c021/c021180/c02118054.png" /> of the normalized sum (2) from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c021/c021180/c02118055.png" /> is the Berry–Esseen inequality: For all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c021/c021180/c02118056.png" />,
+
In practical respects it is important to have an idea of the rate of convergence of the distributions of the sums to the normal distribution. For this purpose there are inequalities and asymptotic expansions (and also the theory of [[Probability of large deviations|probability of large deviations]]; see also [[Cramér theorem|Cramér theorem]]; [[Limit theorems|Limit theorems]]). In what follows, for simplicity of the exposition, a triangular array is considered, and the variables participating in (1) are assumed to be identically distributed. Let $  F ( x) = {\mathsf P} \{ X _ {k} < x \} $.  
 +
A typical example of inequalities for the deviation of the distribution functions $  F _ {n} ( x) $
 +
of the normalized sum (2) from $  \Phi ( x) $
 +
is the Berry–Esseen inequality: For all $  x $,
 +
 
 +
$$ \tag{7 }
 +
| F _ {n} ( x) - \Phi ( x) |  \leq  \
 +
C
 +
\frac{ {\mathsf E} | X _ {1} - a _ {1} |  ^ {3} }{\sigma _ {1}  ^ {3} }
 +
\cdot
 +
 
 +
\frac{1}{\sqrt n }
 +
,
 +
$$
 +
 
 +
where  $  C $
 +
is an absolute constant. (The best possible value of  $  C $
 +
is not known at present (1984); however, it does not exceed 0.7655.) Inequalities like (7) become less informative if the terms  $  X _ {k} $
 +
themselves are "almost normal" . Thus, if they are actually normal, then the left-hand side of (7) is zero, while the right-hand side is  $  C/ \sqrt {2 \pi } $.  
 +
Therefore, from the beginning of the 1960's onwards one proposed analogues of (7) in which on the right-hand side instead of the moments of the random variables  $  X _ {k} $
 +
other characteristics stand, similar to the moments but determined by the difference
 +
 
 +
$$
 +
F ( x) - \Phi \left (
 +
 
 +
\frac{x - a _ {1} }{\sigma _ {1} }
 +
\right )
 +
$$
 +
 
 +
in such a way that they become smaller, the smaller this difference. On the right-hand side of (7) and its generalizations one can put a function of  $  x $
 +
that decreases unboundedly as  $  | x | \rightarrow \infty $(
 +
so-called inhomogeneous estimators). One considers (see {{Cite|P}}) also other methods of measuring the "proximity" of  $  F _ {n} ( x) $
 +
to  $  \Phi ( x) $,
 +
for example, in the sense of the space  $  L _ {p} $(
 +
in so-called global versions of the central limit theorem) or methods based on a comparison of local characteristics of the distributions (see [[Local limit theorems|Local limit theorems]]).
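
For identically distributed terms both sides of (7) can be evaluated numerically. The sketch below is a hypothetical illustration (the i.i.d. Exp(1) terms, for which $ a _ {1} = \sigma _ {1} = 1 $, are an assumption, and the value 0.7655 quoted above is used for $ C $); it compares a simulated value of $ \sup _ {x} | F _ {n} ( x) - \Phi ( x) | $ with the Berry–Esseen bound.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# i.i.d. Exp(1) terms: a_1 = 1, sigma_1 = 1
rho = np.mean(np.abs(rng.exponential(1.0, size=1_000_000) - 1.0) ** 3)  # E|X_1 - a_1|^3
C = 0.7655                                   # upper bound on C quoted in the text

def sup_distance(n, n_samples=100_000):
    s = rng.gamma(shape=n, scale=1.0, size=n_samples)   # S_n for Exp(1) terms
    z = np.sort((s - n) / np.sqrt(n))
    grid = np.linspace(-4.0, 4.0, 801)
    f_n = np.searchsorted(z, grid, side="right") / n_samples
    return np.max(np.abs(f_n - norm.cdf(grid)))

for n in (10, 50, 200):
    print(n, sup_distance(n), C * rho / np.sqrt(n))      # observed vs. bound (7)
</syntaxhighlight>
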
The asymptotic expansion for the difference $ F _ {n} ( x) - \Phi ( x) $ has the form (see {{Cite|GK}}, {{Cite|Cr}}) for $ \sigma _ {1} = 1 $:

$$
F _ {n} ( x) - \Phi ( x)  =  \frac{e ^ {- x  ^ {2} /2 } }{\sqrt {2 \pi } }
\left ( \frac{Q _ {1} ( x) }{n  ^ {1/2} } + \frac{Q _ {2} ( x) }{n} + \frac{Q _ {3} ( x) }{n  ^ {3/2} } + \dots \right ) .
$$
  
Here $ Q _ {k} ( x) $ are polynomials of degree $ 3k - 1 $ in $ x $ with coefficients depending only on the first $ k + 2 $ moments of the terms. For the binomial distribution, the first term of the asymptotic expansion was indicated by P. Laplace in 1812, and, completely, but without a rigorous justification, the expansion was described by P.L. Chebyshev in 1887. The first estimate of the remainder, under the assumption that the $ s $-th moment $ \beta _ {s} = {\mathsf E} | X _ {k} |  ^ {s} $, $ s \geq  3 $, is finite and that

$$
\overline{\lim\limits}\; _ {| t | \rightarrow \infty }  | {\mathsf E} e ^ {it X _ {k} } |  <  1,
$$

the so-called Cramér condition, was given by Cramér in 1928. This result, in a somewhat stronger form, asserts that
  
$$
F _ {n} ( x) - \Phi ( x)  =  \frac{e ^ {- x  ^ {2} /2 } }{\sqrt {2 \pi } }
\left ( \frac{Q _ {1} ( x) }{\sqrt n } + \dots +
\frac{Q _ {s - 2 }  ( x) }{n ^ {( s - 2)/2 } } +
o \left ( \frac{1}{n ^ {( s - 2)/2 } } \right ) \right ) ,
$$

uniformly in $ x $. This asymptotic expansion serves as the basis for the construction of a broad class of transformations of random variables (cf. [[Random variables, transformations of|Random variables, transformations of]]).
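
The leading term of the expansion can be checked against a simulation. The sketch below is a hypothetical illustration (the i.i.d. Exp(1) terms, for which the standardized third moment is $ \lambda _ {3} = 2 $, and the classical form $ Q _ {1} ( x) = - \frac{\lambda _ {3} }{6} ( x ^ {2} - 1) $ of the first polynomial are assumptions made for the example); it compares the simulated difference $ F _ {n} ( x) - \Phi ( x) $ with $ \frac{e ^ {- x ^ {2} /2 } }{\sqrt {2 \pi } } \frac{Q _ {1} ( x) }{\sqrt n } $.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

lam3 = 2.0                                   # E(X - a)^3 / sigma^3 for Exp(1)
def q1(x):                                   # assumed first Edgeworth polynomial (sigma_1 = 1)
    return -(lam3 / 6.0) * (x ** 2 - 1.0)

n, n_samples = 100, 1_000_000
s = rng.gamma(shape=n, scale=1.0, size=n_samples)         # S_n for Exp(1) terms
z = np.sort((s - n) / np.sqrt(n))

for x in (-1.5, -0.5, 0.0, 0.5, 1.5):
    f_n = np.searchsorted(z, x, side="right") / n_samples  # F_n(x)
    print(x, f_n - norm.cdf(x), norm.pdf(x) * q1(x) / np.sqrt(n))
</syntaxhighlight>
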
  
The central limit theorem can be extended to the case when (1) (or the triangular array generalizing it) is formed by vectors in the $ m $-dimensional Euclidean space $ \mathbf R  ^ {m} $. Suppose, for example, that the random vectors (1) are independent, identically distributed and with probability 1 do not lie in some hypersurface, and that $ {\mathsf E} X _ {k} = 0 $ and $ {\mathsf E} \| X _ {k} \|  ^ {2} < \infty $ with the usual Euclidean norm in $ \mathbf R  ^ {m} $. Under these conditions, as $ n \rightarrow \infty $, the probability distributions of the normalized sums

$$
Z _ {n} ^ { \prime }  =  \frac{X _ {1} + \dots + X _ {n} }{\sqrt n }
$$

converge weakly (see [[Convergence of distributions|Convergence of distributions]]) to the normal distribution $ \Phi _  \Lambda  $ in $ \mathbf R  ^ {m} $ with expectation equal to the zero vector and covariance matrix $ \Lambda $ equal to that of the $ X _ {k} $. Moreover, this convergence is uniform on broad classes of subsets of $ \mathbf R  ^ {m} $ (see {{Cite|BR}}). For example, it is uniform on the class $ \mathfrak C $ of all convex Borel subsets of $ \mathbf R  ^ {m} $: As $ n \rightarrow \infty $,
  
$$ \tag{8 }
\sup _ {A \in \mathfrak C }  | P _ {n} ( A) - \Phi _  \Lambda  ( A) |  \rightarrow  0.
$$

Under additional assumptions the rate of the convergence (8) can be estimated.
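
A small simulation illustrates the convergence (8) on a single convex set. The sketch below is a hypothetical illustration (it assumes i.i.d. centred vectors with independent $ \mathrm{Exp} ( 1) - 1 $ components, so that $ \Lambda $ is the identity matrix and $ \Phi _  \Lambda  $ of a centred ball can be evaluated with the chi-squared distribution).

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(4)
m, n, n_samples = 3, 100, 20_000
r = 1.5                                           # the ball A = {x : ||x|| <= r}

# i.i.d. vectors with independent Exp(1) - 1 components: E X_k = 0, Lambda = I
x = rng.exponential(1.0, size=(n_samples, n, m)) - 1.0
z = x.sum(axis=1) / np.sqrt(n)                    # normalized sums Z_n'

p_n = np.mean((z ** 2).sum(axis=1) <= r ** 2)     # P_n(A)
print(p_n, chi2.cdf(r ** 2, df=m))                # vs. Phi_Lambda(A)
</syntaxhighlight>
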
  
The central limit theorem can also be extended to sequences (and arrays) of independent random vectors with values in infinite-dimensional spaces. The central limit theorem in the "customary" form need not hold. (Here the influence of the "geometry" of the space manifests itself, see [[Random element|Random element]].) Of special interest is the case when the terms of (1) take values in a separable Hilbert space $ H $. The assertion quoted above on the weak convergence in $ \mathbf R  ^ {m} $ of the distributions of the normalized sums $ Z _ {n} ^ { \prime } $ to the normal distribution remains verbally true in $ H $. The convergence is uniform on comparatively narrow classes (for example, on the class of all balls with centre at the origin, or balls the centres of which lie in some fixed ball; the convergence on the class of all balls need not be uniform). Let $ S _ {r} $ be the ball in $ H $ of radius $ r $ with centre at the origin. Here an analogue of (7) is an inequality of the following type. Suppose that

$$
{\mathsf E} X _ {k}  =  0,\ \
{\mathsf E} \| X _ {k} \|  ^ {2}  <  \infty ,
$$

and that the distribution of the $ X _ {k} $ is not concentrated on any finite-dimensional subspace of $ H $; then in special cases (similar to the one analyzed in the example below)

$$
\Delta _ {n}  =  \sup _ { r }  | {\mathsf P} \{ Z _ {n} ^ { \prime } \in S _ {r} \} - \Phi _  \Lambda  ( S _ {r} ) |  =  O \left ( \frac{1}{n} \right ) .
$$
  
Under the condition $ {\mathsf E} \| X _ {k} \| ^ {3 + \alpha } < \infty $, where $ \alpha $ is a fixed and not too small number, it can be asserted that for any $ \epsilon > 0 $,

$$
\Delta _ {n}  =  O \left ( \frac{1}{n ^ {1 - \epsilon } } \right )
$$

(this is true, for example, when $ \alpha = 1 $).

Quite specific problems, e.g. of mathematical statistics, may lead to a central limit theorem in infinite-dimensional spaces, in particular, in $ H $.

Example. Let $ \theta _ {1} , \theta _ {2} \dots $ be a sequence of independent random variables that are uniformly distributed on the interval $ [ 0, 1] $. Let $ X _ {k} ( t) $, $ k = 1, 2 \dots $ be random elements in the space $ L _ {2} [ 0, 1] $ (the space of functions with integrable squares with respect to the Lebesgue measure on $ [ 0, 1] $) given as follows:

$$
X _ {k} ( t)  =  \left \{
\begin{array}{ll}
- t  & \textrm{ for }  0 \leq  t \leq  \theta _ {k} , \\
1 - t  & \textrm{ for }  \theta _ {k} < t \leq  1 . \\
\end{array}
\right .
$$
  
Then $ {\mathsf E} X _ {k} ( t) = 0 $, $ 0 \leq  t \leq  1 $, and

$$
Z _ {n} ^ { \prime } ( t)  =  \frac{X _ {1} ( t) + \dots + X _ {n} ( t) }{\sqrt n }  =  \sqrt n ( G _ {n} ( t) - t),
$$

where $ G _ {n} ( t) $ is the empirical distribution function constructed from the sample $ \theta _ {1} \dots \theta _ {n} $ of size $ n $ from a uniform distribution on $ [ 0, 1] $. Here the square of the norm,
  
$$
\| Z _ {n} ^ { \prime } \|  ^ {2}  =  \int\limits _ { 0 } ^ { 1 }
( Z _ {n} ^ { \prime } ( t))  ^ {2}  dt  =  n \int\limits _ { 0 } ^ { 1 }
( G _ {n} ( t) - t)  ^ {2}  dt ,
$$

coincides with the statistic $ \omega _ {n}  ^ {2} $ of the Cramér–von Mises–Smirnov test (see [[Cramér–von Mises test|Cramér–von Mises test]]). In accordance with the central limit theorem there exists a limit distribution for the $ \omega _ {n}  ^ {2} $ as $ n \rightarrow \infty $. It coincides with the distribution of the square of the norm of a certain normally-distributed vector in $ H $ and is known as the [[Omega-squared distribution| "omega-squared" distribution]]. Thus, the central limit theorem justifies the replacement for large $ n $ of the distribution of $ \omega _ {n}  ^ {2} $ by the "omega-squared" distribution, and this is at the basis of applications of the statistical tests mentioned above.
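
The statistic in this example is easy to compute from a sample. The sketch below is a hypothetical illustration; it uses the standard closed-form expression $ n \int _ {0} ^ {1} ( G _ {n} ( t) - t) ^ {2}  dt = \frac{1}{12n} + \sum _ {i = 1} ^ {n} \left ( \theta _ {(i)} - \frac{2i - 1}{2n} \right ) ^ {2} $, with $ \theta _ {(i)} $ the order statistics of the sample, and tabulates a few empirical quantiles of $ \omega _ {n} ^ {2} $, which for large $ n $ approximate quantiles of the "omega-squared" distribution.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(5)

def omega2(theta):
    """n * integral_0^1 (G_n(t) - t)^2 dt, via the standard closed form."""
    n = theta.size
    u = np.sort(theta)
    i = np.arange(1, n + 1)
    return 1.0 / (12 * n) + np.sum((u - (2 * i - 1) / (2 * n)) ** 2)

n, n_rep = 200, 5000
stats = np.array([omega2(rng.uniform(size=n)) for _ in range(n_rep)])
print(np.quantile(stats, [0.5, 0.9, 0.95, 0.99]))   # empirical quantiles of omega_n^2
</syntaxhighlight>
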
  
Numerous generalizations of the central limit theorem to sums of dependent variables are known. (In the case of homogeneous finite Markov chains, the simplest non-homogeneous chains with two states, and certain other schemes, this was done by Markov himself in 1907–1911; subsequent generalizations are connected in the first instance with the name of S.N. Bernshtein {{Cite|B}}.) A basic feature peculiar to all generalizations of this kind of the central limit theorem (if one is concerned with a triangular array) is that the dependence between the events determined by $ X _ {1} \dots X _ {k} $ and those determined by $ X _ {k + p }  , X _ {k + p + 1 }  \dots $ becomes vanishingly small when $ p $ grows indefinitely.

As regards the methods of proof of the central limit theorem, in the case of independent terms the most powerful is, generally, the method of characteristic functions; it completes and occasionally replaces the so-called "method of compositions" (see {{Cite|Sa}}) (and also the method known as that of "metric distances"). In the case of dependent variables the most effective method, on the whole, is the method of semi-invariants (see, for example, {{Cite|St}}). This method is suitable for the study of functions of random variables more general than sums or linear functions (for example, for quadratic and other forms).
  
 
Concerning the central limit theorem in number theory see [[Number theory, probabilistic methods in|Number theory, probabilistic methods in]]. The central limit theorem is also applicable in certain problems in function theory and in the theory of dynamical systems.
  
 
====References====
{|
|valign="top"|{{Ref|G}}|| B.V. Gnedenko, [[Gnedenko, "A course in the theory of probability"|"A course of probability theory"]], Moscow (1969) (In Russian)
|-
|valign="top"|{{Ref|F}}|| W. Feller, [[Feller, "An introduction to probability theory and its applications"|"An introduction to probability theory and its applications"]], '''1–2''', Wiley (1957–1971)
|-
|valign="top"|{{Ref|Cr}}|| H. Cramér, "Mathematical methods of statistics", Princeton Univ. Press (1946) {{MR|0016588}} {{ZBL|0063.01014}}
|-
|valign="top"|{{Ref|GK}}|| B.V. Gnedenko, A.N. Kolmogorov, "Limit distributions for sums of independent random variables", Addison-Wesley (1954) (Translated from Russian) {{MR|0062975}} {{ZBL|0056.36001}}
|-
|valign="top"|{{Ref|IL}}|| I.A. Ibragimov, Yu.V. Linnik, "Independent and stationary sequences of random variables", Wolters-Noordhoff (1971) (Translated from Russian) {{MR|0322926}} {{ZBL|0219.60027}}
|-
|valign="top"|{{Ref|P}}|| V.V. Petrov, "Sums of independent random variables", Springer (1975) (Translated from Russian) {{MR|0388499}} {{ZBL|0322.60043}} {{ZBL|0322.60042}}
|-
|valign="top"|{{Ref|Z}}|| V.M. Zolotarev, "A generalization of the Lindeberg–Feller theorem" ''Theory Probab. Appl.'', '''12''' (1967) pp. 608–618; ''Teor. Veroyatnost. i Primenen.'', '''12''' : 4 (1967) pp. 666–677 {{MR|0225367}} {{ZBL|0234.60031}}
|-
|valign="top"|{{Ref|R}}|| V.I. Rotar', "An extension of the Lindeberg–Feller theorem" ''Math. Notes'', '''18''' (1975) pp. 123–128; ''Mat. Zametki'', '''18''' : 1 (1975) pp. 129–135 {{ZBL|0348.60025}}
|-
|valign="top"|{{Ref|Ch}}|| P.L. Chebyshev, "Selected works", Moscow (1955) (In Russian)
|-
|valign="top"|{{Ref|BR}}|| R.N. Bhattacharya, R. Ranga Rao, "Normal approximations and asymptotic expansions", Wiley (1976) {{MR|0436272}}
|-
|valign="top"|{{Ref|Sa}}|| V.V. Sazonov, "Normal approximation: some recent advances", Springer (1981) (Translated from Russian)
|-
|valign="top"|{{Ref|B}}|| S.N. Bernshtein, "Collected works", '''4''', Moscow (1964) (In Russian)
|-
|valign="top"|{{Ref|M}}|| A.A. Markov, "Selected works", Moscow-Leningrad (1951) (In Russian) {{MR|0050525}} {{ZBL|0054.00305}}
|-
|valign="top"|{{Ref|St}}|| V.A. Statulyavichus, ''Teor. Veroyatnost. i Primenen.'', '''5''' : 2 (1960) {{MR|2222750}}
|-
|valign="top"|{{Ref|Le}}|| P. Lévy, "Théorie de l'addition des variables aléatoires", Gauthier-Villars (1937)
|}
  
 
====Comments====
 
====Comments====
 
  
 
====References====
 
====References====
<table><TR><TD valign="top">[a1]</TD> <TD valign="top"> M. Loève, "Probability theory" , v. Nostrand (1963) {{MR|0203748}} {{ZBL|0108.14202}} </TD></TR></table>
+
{|
 +
|valign="top"|{{Ref|Lo}}|| M. Loève, "Probability theory" , v. Nostrand (1963) {{MR|0203748}} {{ZBL|0108.14202}}
 +
|}

Revision as of 16:43, 4 June 2020


2020 Mathematics Subject Classification: Primary: 60F05 [MSN][ZBL]

A common name for a number of limit theorems in probability theory stating conditions under which sums or other functions of a large number of independent or weakly-dependent random variables have a probability distribution close to the normal distribution.

The classical version of the central limit theorem is concerned with a sequence

$$ \tag{1 } X _ {1} \dots X _ {n} \dots $$

of independent random variables having finite (mathematical) expectations $ {\mathsf E} X _ {k} = a _ {k} $, and finite variances $ {\mathsf D} X _ {k} = b _ {k} $, and with the sums

$$ \tag{2 } S _ {n} = \ X _ {1} + \dots + X _ {n} . $$

Suppose that $ A _ {n} = {\mathsf E} S _ {n} = a _ {1} + \dots + a _ {n} $, $ B _ {n} = {\mathsf D} S _ {n} = b _ {1} + \dots + b _ {n} $. The distribution functions

$$ F _ {n} ( x) = \ {\mathsf P} \{ Z _ {n} < x \} , $$

that is, the "normalized" sums

$$ \tag{3 } Z _ {n} = \ \frac{S _ {n} - A _ {n} }{\sqrt {B _ {n} } } , $$

which have expectation 0 and variance 1, are compared with the "standard" normal distribution function

$$ \Phi ( x) = \ \frac{1}{\sqrt {2 \pi } } \int\limits _ {- \infty } ^ { x } e ^ {- z ^ {2} /2 } dz $$

corresponding to the normal distribution with expectation 0 and variance 1. In this case the central limit theorem asserts that under certain conditions, as $ n \rightarrow \infty $, for any $ x \in \mathbf R $,

$$ F _ {n} ( x) \rightarrow \Phi ( x), $$

or, what is the same, for any interval $ ( \alpha , \beta ) $:

$$ {\mathsf P} \{ \alpha < Z _ {n} < \beta \} = \ {\mathsf P} \{ A _ {n} + \alpha \sqrt {B _ {n} } < S _ {n} < A _ {n} + \beta \sqrt {B _ {n} } \} \rightarrow $$

$$ \rightarrow \ \Phi ( \beta ) - \Phi ( \alpha ), $$

(see Laplace theorem; Lyapunov theorem).
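
As a purely illustrative aside (not part of the original article; NumPy and SciPy are assumed to be available), the convergence $ F _ {n} ( x) \rightarrow \Phi ( x) $ can be observed numerically. The sketch below uses i.i.d. exponential terms with parameter 1, for which $ a _ {k} = b _ {k} = 1 $, so that $ A _ {n} = B _ {n} = n $.

```python
# Illustrative sketch only: empirical check that F_n(x) -> Phi(x) for
# normalized sums of i.i.d. Exp(1) variables (a_k = b_k = 1).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, trials = 100, 50_000

X = rng.exponential(scale=1.0, size=(trials, n))   # X_1, ..., X_n in each trial
S_n = X.sum(axis=1)                                # the sums (2)
Z_n = (S_n - n) / np.sqrt(n)                       # the normalized sums (3)

for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    F_n = np.mean(Z_n < x)                         # empirical F_n(x) = P{Z_n < x}
    print(f"x = {x:+.1f}   F_n(x) = {F_n:.4f}   Phi(x) = {norm.cdf(x):.4f}")
```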

A clearer understanding of conditions for the emergence of a normal distribution as the limit of distributions of sums of independent random variables comes about by considering a triangular array of random variables instead of a sequence (see [GK]). In this case one considers for every $ n = 1, 2 \dots $ a sequence of variables

$$ X _ {n,1} \dots X _ {n,n} , $$

putting

$$ X _ {n,k} = \ \frac{X _ {k} - a _ {k} }{\sqrt {B _ {n} } } ,\ \ 1 \leq k \leq n. $$

Then the random variables inside each sequence (row) are independent, and

$$ Z _ {n} = \ X _ {n,1} + \dots + X _ {n,n} . $$

The usual conditions for applicability of the central limit theorem (such as Lyapunov's condition or the condition of the Lindeberg–Feller theorem) imply that $ X _ {n,k} $ is asymptotically negligible. For example, from Lyapunov's condition with third moments, that is, from the condition that as $ n \rightarrow \infty $,

$$ \tag{4 } L _ {n} = \ \frac{1}{B _ {n} ^ {3/2} } \sum _ {k = 1 } ^ { n } {\mathsf E} | X _ {k} - a _ {k} | ^ {3} \rightarrow 0 , $$

for any $ \epsilon > 0 $ the inequality

$$ \max _ {1 \leq k \leq n } \ {\mathsf P} \{ | X _ {n,k} | > \epsilon \} = \ \max _ {1 \leq k \leq n } \ {\mathsf P} \{ | X _ {k} - a _ {k} | > \epsilon \sqrt {B _ {n} } \} \leq $$

$$ \leq \ \max _ {1 \leq k \leq n } \ \frac{1}{\epsilon ^ {3} B _ {n} ^ {3/2} } {\mathsf E} | X _ {k} - a _ {k} | ^ {3} \leq \frac{L _ {n} }{\epsilon ^ {3} } \rightarrow 0 $$

follows as $ n \rightarrow \infty $, and the fact that the quantities at the left-hand side of this chain of inequalities tend to zero indicates the asymptotic negligibility of the random variables forming the array.
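
A hedged numerical aside (not part of the original article): for i.i.d. terms with a finite third absolute moment the Lyapunov ratio (4) is of order $ n ^ {-1/2} $. The short sketch below evaluates it for terms that are uniformly distributed on $ [ 0, 1] $, where $ b _ {k} = 1/12 $ and $ {\mathsf E} | X _ {k} - a _ {k} | ^ {3} = 1/32 $.

```python
# Illustrative sketch: the Lyapunov ratio (4) for i.i.d. Uniform(0, 1) terms.
# Here a_k = 1/2, b_k = 1/12 and E|X_k - a_k|^3 = 1/32, so
# L_n = n * (1/32) / (n/12)**1.5 = (12**1.5 / 32) / sqrt(n) -> 0.
third_abs_moment = 1 / 32   # E|X_k - 1/2|^3 for Uniform(0, 1)
b = 1 / 12                  # variance of Uniform(0, 1)

for n in (10, 100, 1_000, 10_000):
    B_n = n * b
    L_n = n * third_abs_moment / B_n ** 1.5
    print(f"n = {n:6d}   L_n = {L_n:.5f}")
```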

Suppose now that

$$ \tag{5 } X _ {n,1} \dots X _ {n,k _ {n} } , $$

$ n = 1, 2 \dots $ is an arbitrary triangular array of asymptotically-negligible random variables that are independent within each sequence. If the limit distribution for the sums $ Z _ {n} = X _ {n,1} + \dots + X _ {n,k _ {n} } $ exists and is non-degenerate, then it is normal if and only if, as $ n \rightarrow \infty $, for any $ \epsilon > 0 $,

$$ \tag{6 } {\mathsf P} \left \{ \max _ {1 \leq k \leq k _ {n} } \ | X _ {n,k} | > \epsilon \right \} \rightarrow 0, $$

that is, if the maximal term in $ Z _ {n} $ becomes vanishingly small in comparison with the whole sum. (Without condition (6) one can only assert that the limit law for $ Z _ {n} $ belongs to the class of infinitely-divisible distributions, cf. Infinitely-divisible distribution.) Two additional conditions that together with (6) are necessary and sufficient for the convergence of the distributions of the sums $ Z _ {n} $ to a limit can be found in the article Triangular array.

When the condition of asymptotic negligibility of the variables in the triangular array considered above does not hold, the situation becomes complicated. The well-known theorem of H. Cramér, that the sum of several independent random variables is normally distributed if and only if each of the summands is, makes it possible to assume (as P. Lévy did, see [Le], Chapt. 5, Theor. 38) that the sum of independent random variables has a distribution close to normal if the "large" terms are almost normal and if the collection of "small" terms is subject to the condition of "normality" of the distributions of the sums of the asymptotically-negligible terms. A precise form of an argument of this kind was first obtained for the triangular array (5) with $ {\mathsf E} X _ {n,k} = 0 $, $ \sum _ {k = 1 } ^ {k _ {n} } {\mathsf D} X _ {n,k} = 1 $ (see [Z]). Here, for the convergence of the distribution functions $ F _ {n} ( x) = {\mathsf P} \{ Z _ {n} < x \} $ to the normal distribution function $ \Phi ( x) $ it is necessary and sufficient that the following two conditions hold simultaneously:

1) as $ n \rightarrow \infty $,

$$ \alpha _ {n} = \ \max _ {1 \leq k \leq k _ {n} } \ L ( F _ {n,k} , \Phi _ {n,k} ) \rightarrow 0, $$

where $ L ( F _ {n,k} , \Phi _ {n,k} ) $ is the Lévy distance (see Lévy metric) between the distribution functions $ F _ {n,k} ( x) $ of the random variables $ X _ {n,k} $ and the normal distribution functions $ \Phi _ {n,k} ( x) $ with the same expectation and variance as $ F _ {n,k} ( x) $; and

2) for any $ \epsilon > 0 $, as $ n \rightarrow \infty $,

$$ \Delta _ {n} ( \epsilon ) = \ \sum _ { k = 1 } ^ { {k _ n } } \int\limits _ {| x | > \epsilon } x ^ {2} dF _ {n,k} ( x) \rightarrow 0, $$

where the sum is over those $ k $, $ 1 \leq k \leq k _ {n} $, for which $ {\mathsf D} X _ {n,k} < \sqrt {\alpha _ {n} } $.

This form of the statement is quite close to the one originally proposed by Lévy. Other formulations are possible (see, for example, [R]), which in a certain sense are more reminiscent of the Lindeberg–Feller theorem.

Nowadays this form of the central limit theorem can be obtained as a special case of a more general summation theorem on a triangular array without the condition of asymptotic negligibility.

For practical purposes it is important to have an idea of the rate of convergence of the distributions of the sums to the normal distribution. For this purpose there are inequalities and asymptotic expansions (and also the theory of probabilities of large deviations; see also Cramér theorem; Limit theorems). In what follows, for simplicity of the exposition, no triangular array is considered, and the variables participating in (1) are assumed to be identically distributed. Let $ F ( x) = {\mathsf P} \{ X _ {k} < x \} $. A typical example of inequalities for the deviation of the distribution function $ F _ {n} ( x) $ of the normalized sum (3) from $ \Phi ( x) $ is the Berry–Esseen inequality: for all $ x $,

$$ \tag{7 } | F _ {n} ( x) - \Phi ( x) | \leq \ C \frac{ {\mathsf E} | X _ {1} - a _ {1} | ^ {3} }{\sigma _ {1} ^ {3} } \cdot \frac{1}{\sqrt n } , $$

where $ C $ is an absolute constant and $ \sigma _ {1} ^ {2} = b _ {1} = {\mathsf D} X _ {1} $. (The best possible value of $ C $ is not known at present (1984); however, it does not exceed 0.7655.) Inequalities like (7) become less informative if the terms $ X _ {k} $ themselves are "almost normal". Thus, if they are actually normal, then the left-hand side of (7) is zero, while the right-hand side is $ C/ \sqrt {2 \pi } $. Therefore, from the beginning of the 1960's onwards, analogues of (7) have been proposed in which the moments of the random variables $ X _ {k} $ on the right-hand side are replaced by other characteristics, similar to the moments but determined by the difference

$$ F ( x) - \Phi \left ( \frac{x - a _ {1} }{\sigma _ {1} } \right ) $$

in such a way that they become smaller, the smaller this difference is. On the right-hand side of (7) and its generalizations one can also put a function of $ x $ that decreases unboundedly as $ | x | \rightarrow \infty $ (so-called inhomogeneous estimators). Other methods of measuring the "proximity" of $ F _ {n} ( x) $ to $ \Phi ( x) $ are also considered (see [P]), for example, in the sense of the space $ L _ {p} $ (so-called global versions of the central limit theorem), or methods based on a comparison of local characteristics of the distributions (see Local limit theorems).
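
An illustrative sketch (not from the article; NumPy and SciPy assumed) comparing an empirical estimate of $ \sup _ {x} | F _ {n} ( x) - \Phi ( x) | $ with the right-hand side of (7), using the value $ C = 0.7655 $ quoted above as an upper bound, for i.i.d. Bernoulli terms.

```python
# Illustrative sketch: empirical check of the Berry-Esseen bound (7)
# for sums of i.i.d. Bernoulli(p) terms.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
p, n, trials = 0.2, 50, 200_000

a1 = p
sigma1 = np.sqrt(p * (1 - p))
rho = p * (1 - p) ** 3 + (1 - p) * p ** 3           # E|X_1 - a_1|^3

S_n = rng.binomial(n, p, size=trials)               # each entry is a sum of n terms
Z_n = np.sort((S_n - n * a1) / (sigma1 * np.sqrt(n)))

grid = np.linspace(-4.0, 4.0, 801)
F_n = np.searchsorted(Z_n, grid, side="left") / trials   # empirical P{Z_n < x}
sup_dist = np.abs(F_n - norm.cdf(grid)).max()            # sup taken over the grid only

bound = 0.7655 * rho / sigma1 ** 3 / np.sqrt(n)          # right-hand side of (7)
print(f"sup |F_n - Phi| ~ {sup_dist:.4f}   bound (7) = {bound:.4f}")
```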

The asymptotic expansion of the difference $ F _ {n} ( x) - \Phi ( x) $ has the following form (see [GK], [Cr]), stated here for $ \sigma _ {1} = 1 $:

$$ F _ {n} ( x) - \Phi ( x) = \ \frac{e ^ {- x ^ {2} /2 } }{\sqrt {2 \pi } } \left ( \frac{Q _ {1} ( x) }{n ^ {1/2} } + \frac{Q _ {2} ( x) }{n} + \frac{Q _ {3} ( x) }{n ^ {3/2} } + \dots \right ) . $$

Here $ Q _ {k} ( x) $ are polynomials of degree $ 3k - 1 $ in $ x $ with coefficients depending only on the first $ k + 2 $ moments of the terms. For the binomial distribution, the first term of the asymptotic expansion was indicated by P. Laplace in 1812, and the full expansion, though without a rigorous justification, was described by P.L. Chebyshev in 1887. The first estimate of the remainder, under the assumption that the $ s $-th absolute moment $ \beta _ {s} = {\mathsf E} | X _ {k} | ^ {s} $, $ s \geq 3 $, is finite and that

$$ \overline{\lim\limits}\; _ {| t | \rightarrow \infty } \ | {\mathsf E} e ^ {it X _ {k} } | < 1, $$

the so-called Cramér condition, was given by Cramér in 1928. This result, in a somewhat stronger form, asserts that

$$ F _ {n} ( x) - \Phi ( x) = \ \frac{e ^ {- x ^ {2} /2 } }{\sqrt {2 \pi } } \left ( \frac{Q _ {1} ( x) }{\sqrt n } + \dots + \frac{Q _ {s - 2 } ( x) }{n ^ {( s - 2)/2 } } \right . + $$

$$ + \left . o \left ( \frac{1}{n ^ {( s - 2)/2 } } \right ) \right ) , $$

uniformly in $ x $. This asymptotic expansion serves as the basis for the construction of a broad class of transformations of random variables (cf. Random variables, transformations of).
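
An illustrative aside (not part of the article; NumPy and SciPy assumed): in the standard form of the expansion the first polynomial is $ Q _ {1} ( x) = - \gamma _ {1} ( x ^ {2} - 1)/6 $, where $ \gamma _ {1} $ is the skewness of the terms. The sketch below compares the one-term correction with the plain normal approximation for exponential terms, for which $ \gamma _ {1} = 2 $.

```python
# Illustrative sketch: first-order Edgeworth correction
# F_n(x) ~ Phi(x) + phi(x) * Q_1(x) / sqrt(n), with Q_1(x) = -gamma1*(x^2 - 1)/6,
# for normalized sums of i.i.d. Exp(1) variables (skewness gamma1 = 2).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n, trials, gamma1 = 20, 200_000, 2.0

Z_n = (rng.exponential(size=(trials, n)).sum(axis=1) - n) / np.sqrt(n)

for x in (-1.0, 0.0, 1.0, 2.0):
    F_n = np.mean(Z_n < x)                                   # empirical F_n(x)
    plain = norm.cdf(x)                                      # normal approximation
    corrected = plain + norm.pdf(x) * (-gamma1 * (x**2 - 1) / 6) / np.sqrt(n)
    print(f"x = {x:+.1f}   F_n = {F_n:.4f}   Phi = {plain:.4f}   corrected = {corrected:.4f}")
```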

The central limit theorem can be extended to the case when (1) (or the triangular array generalizing it) is formed by vectors in the $ m $- dimensional Euclidean space $ \mathbf R ^ {m} $. Suppose, for example, that the random vectors (1) are independent, identically distributed and with probability 1 do not lie in some hypersurface, and that $ {\mathsf E} X _ {k} = 0 $ and $ {\mathsf E} \| X _ {k} \| ^ {2} < \infty $ with the usual Euclidean norm in $ \mathbf R ^ {m} $. Under these conditions, as $ n \rightarrow \infty $, the probability distributions of the normalized sums

$$ Z _ {n} ^ { \prime } = \ \frac{X _ {1} + \dots + X _ {n} }{\sqrt n } $$

converge weakly (see Convergence of distributions) to the normal distribution $ \Phi _ \Lambda $ in $ \mathbf R ^ {m} $ with expectation equal to the zero vector and covariance matrix $ \Lambda $ equal to that of the $ X _ {k} $. Moreover, this convergence is uniform on broad classes of subsets of $ \mathbf R ^ {m} $ (see [BR]). For example, it is uniform on the class $ \mathfrak C $ of all convex Borel subsets of $ \mathbf R ^ {m} $: as $ n \rightarrow \infty $,

$$ \tag{8 } \sup _ {A \in \mathfrak C } \ | P _ {n} ( A) - \Phi _ \Lambda ( A) | \rightarrow 0, $$

where $ P _ {n} $ denotes the distribution of $ Z _ {n} ^ { \prime } $.

Under additional assumptions the rate of the convergence (8) can be estimated.
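
A small illustrative simulation (not part of the article; NumPy and SciPy assumed) of the statement on uniform convergence over convex sets: for i.i.d. two-dimensional vectors with zero mean and identity covariance, $ P _ {n} ( A) $ is compared with $ \Phi _ \Lambda ( A) $ for centred balls $ A = \{ \| y \| \leq r \} $, for which $ \Phi _ \Lambda ( A) $ is given by the chi-squared distribution with two degrees of freedom.

```python
# Illustrative sketch: two-dimensional CLT, comparing P_n(A) with Phi_Lambda(A)
# on the convex sets A = {||y|| <= r}.  The coordinates are i.i.d. centred
# uniform variables scaled to unit variance, so Lambda is the identity matrix
# and Phi_Lambda(A) = P(chi^2_2 <= r^2).
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
n, trials, m = 50, 50_000, 2

X = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(trials, n, m))  # E X = 0, Cov X = I
Z = X.sum(axis=1) / np.sqrt(n)                                     # normalized sums Z_n'

for r in (0.5, 1.0, 2.0):
    P_n = np.mean((Z ** 2).sum(axis=1) <= r ** 2)                  # P_n(A)
    print(f"r = {r}:   P_n(A) = {P_n:.4f}   Phi_Lambda(A) = {chi2.cdf(r**2, df=m):.4f}")
```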

The central limit theorem can also be extended to sequences (and arrays) of independent random vectors with values in infinite-dimensional spaces. The central limit theorem in the "customary" form need not hold. (Here the influence of the "geometry" of the space manifests itself, see Random element.) Of special interest is the case when the terms of (1) take values in a separable Hilbert space $ H $. The assertion quoted above on the weak convergence in $ \mathbf R ^ {m} $ of the distributions of the normalized sums $ Z _ {n} ^ { \prime } $ to the normal distribution remains word for word true in $ H $. The convergence is uniform on comparatively narrow classes (for example, on the class of all balls with centre at the origin, or balls the centres of which lie in some fixed ball; the convergence on the class of all balls need not be uniform). Let $ S _ {r} $ be the ball in $ H $ of radius $ r $ with centre at the origin. Here an analogue of (7) is an inequality of the following type. Suppose that

$$ {\mathsf E} X _ {k} = 0,\ \ {\mathsf E} \| X _ {k} \| ^ {2} < \infty , $$

and that the distribution of the $ X _ {k} $ is not concentrated on any finite-dimensional subspace of $ H $; then in special cases (similar to the one analyzed in the example below)

$$ \Delta _ {n} = \ \sup _ { r } \ | {\mathsf P} \{ Z _ {n} ^ { \prime } \in S _ {r} \} - \Phi _ \Lambda ( S _ {r} ) | = \ O \left ( { \frac{1}{n} } \right ) . $$

Under the condition $ {\mathsf E} \| X _ {k} \| ^ {3 + \alpha } < \infty $, where $ \alpha $ is a fixed, not too small, number, it can be asserted that for any $ \epsilon > 0 $,

$$ \Delta _ {n} = O \left ( \frac{1}{n ^ {1 - \epsilon } } \right ) $$

(this is true, for example, when $ \alpha = 1 $).

Quite specific problems, e.g. of mathematical statistics, may lead to a central limit theorem in infinite-dimensional spaces, in particular, in $ H $.

Example. Let $ \theta _ {1} , \theta _ {2} \dots $ be a sequence of independent random variables that are uniformly distributed on the interval $ [ 0, 1] $. Let $ X _ {k} ( t) $, $ k = 1, 2 \dots $ be random elements in the space $ L _ {2} [ 0, 1] $ (the space of functions that are square-integrable with respect to the Lebesgue measure on $ [ 0, 1] $), given as follows:

$$ X _ {k} ( t) = \left \{ \begin{array}{ll} - t & \textrm{ for } 0 \leq t \leq \theta _ {k} , \\ 1 - t & \textrm{ for } \theta _ {k} < t \leq 1 . \\ \end{array} \right . $$

Then $ {\mathsf E} X _ {k} ( t) = 0 $, $ 0 \leq t \leq 1 $, and

$$ Z _ {n} ^ { \prime } ( t) = \ \frac{X _ {1} ( t) + \dots + X _ {n} ( t) }{\sqrt n } = \ \sqrt n ( G _ {n} ( t) - t), $$

where $ G _ {n} ( t) $ is the empirical distribution function constructed from the sample $ \theta _ {1} \dots \theta _ {n} $ of size $ n $ from a uniform distribution on $ [ 0, 1] $. Here the square of the norm,

$$ \| Z _ {n} ^ { \prime } \| ^ {2} = \ \int\limits _ { 0 } ^ { 1 } ( Z _ {n} ^ { \prime } ( t)) ^ {2} \ dt = n \int\limits _ { 0 } ^ { 1 } ( G _ {n} ( t) - t) ^ {2} dt , $$

coincides with the statistic $ \omega _ {n} ^ {2} $ of the Cramér–von Mises–Smirnov test (see Cramér–von Mises test). In accordance with the central limit theorem there exists a limit distribution for $ \omega _ {n} ^ {2} $ as $ n \rightarrow \infty $. It coincides with the distribution of the square of the norm of a certain normally-distributed vector in $ H $ and is known as the "omega-squared" distribution $ \omega ^ {2} $. Thus, the central limit theorem justifies the replacement, for large $ n $, of the distribution of $ \omega _ {n} ^ {2} $ by the $ \omega ^ {2} $-distribution, and this is at the basis of applications of the statistical tests mentioned above.
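
An illustrative computation (not part of the article; NumPy assumed): the statistic $ \omega _ {n} ^ {2} = n \int _ {0} ^ {1} ( G _ {n} ( t) - t) ^ {2} dt $ can be evaluated either by direct numerical integration or by the standard closed form $ \omega _ {n} ^ {2} = \frac{1}{12n} + \sum _ {k=1} ^ {n} ( \theta _ {(k)} - \frac{2k-1}{2n} ) ^ {2} $, where $ \theta _ {(1)} \leq \dots \leq \theta _ {(n)} $ are the order statistics of the sample.

```python
# Illustrative sketch: the omega-squared (Cramer-von Mises) statistic
# omega_n^2 = n * integral_0^1 (G_n(t) - t)^2 dt for a uniform sample,
# computed by the standard closed form and by direct numerical integration.
import numpy as np

rng = np.random.default_rng(4)
n = 1_000
theta = np.sort(rng.uniform(size=n))               # order statistics theta_(1) <= ... <= theta_(n)

# closed form
k = np.arange(1, n + 1)
omega2_closed = 1 / (12 * n) + np.sum((theta - (2 * k - 1) / (2 * n)) ** 2)

# direct integration of n * (G_n(t) - t)^2 on a fine grid
t = np.linspace(0.0, 1.0, 100_001)
G_n = np.searchsorted(theta, t, side="right") / n  # empirical distribution function G_n(t)
omega2_direct = n * np.mean((G_n - t) ** 2)        # Riemann approximation of the integral

print(f"closed form: {omega2_closed:.5f}   direct integration: {omega2_direct:.5f}")
```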

Numerous generalizations of the central limit theorem to sums of dependent variables are known. (For homogeneous finite Markov chains, the simplest non-homogeneous chains with two states, and certain other schemes, this was done by Markov himself in 1907–1911; subsequent generalizations are connected in the first instance with the name of S.N. Bernshtein [B].) A basic feature peculiar to all generalizations of this kind of the central limit theorem (if one is concerned with a triangular array) is that the dependence between the events determined by $ X _ {1} \dots X _ {k} $ and those determined by $ X _ {k + p } , X _ {k + p + 1 } \dots $ becomes vanishingly small as $ p $ grows indefinitely.

As regards methods of proof of the central limit theorem, in the case of independent terms the most powerful is, generally speaking, the method of characteristic functions; it is complemented, and occasionally replaced, by the so-called "method of compositions" (see [Sa]) and by the method known as that of "metric distances". In the case of dependent variables the most effective method, on the whole, is the method of semi-invariants (see, for example, [St]). This method is suitable for the study of functions of random variables more general than sums or linear functions (for example, quadratic and other forms).

Concerning the central limit theorem in number theory see Number theory, probabilistic methods in. The central limit theorem is also applicable in certain problems in function theory and in the theory of dynamical systems.

References

[G] B.V. Gnedenko, "A course of probability theory", Moscow (1969) (In Russian)
[F] W. Feller, "An introduction to probability theory and its applications", 1–2, Wiley (1957–1971)
[Cr] H. Cramér, "Mathematical methods of statistics", Princeton Univ. Press (1946) MR0016588 Zbl 0063.01014
[GK] B.V. Gnedenko, A.N. Kolmogorov, "Limit distributions for sums of independent random variables", Addison-Wesley (1954) (Translated from Russian) MR0062975 Zbl 0056.36001
[IL] I.A. Ibragimov, Yu.V. Linnik, "Independent and stationary sequences of random variables", Wolters-Noordhoff (1971) (Translated from Russian) MR0322926 Zbl 0219.60027
[P] V.V. Petrov, "Sums of independent random variables", Springer (1975) (Translated from Russian) MR0388499 Zbl 0322.60043 Zbl 0322.60042
[Z] V.M. Zolotarev, "A generalization of the Lindeberg–Feller theorem", Theory Probab. Appl., 12 (1967) pp. 608–618; Teor. Veroyatnost. i Primenen., 12 : 4 (1967) pp. 666–677 MR0225367 Zbl 0234.60031
[R] V.I. Rotar', "An extension of the Lindeberg–Feller theorem", Math. Notes, 18 (1975) pp. 123–128; Mat. Zametki, 18 : 1 (1975) pp. 129–135 Zbl 0348.60025
[Ch] P.L. Chebyshev, "Selected works", Moscow (1955) (In Russian)
[BR] R.N. Bhattacharya, R. Ranga Rao, "Normal approximations and asymptotic expansions", Wiley (1976) MR0436272
[Sa] V.V. Sazonov, "Normal approximation: some recent advances", Springer (1981) (Translated from Russian)
[B] S.N. Bernshtein, "Collected works", 4, Moscow (1964) (In Russian)
[M] A.A. Markov, "Selected works", Moscow-Leningrad (1951) (In Russian) MR0050525 Zbl 0054.00305
[St] V.A. Statulyavichus, "?", Teor. Veroyatnost. i Primenen., 5 : 2 (1960) MR2222750
[Le] P. Lévy, "Théorie de l'addition des variables aléatoires", Gauthier-Villars (1937)

How to Cite This Entry:
Central limit theorem. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Central_limit_theorem&oldid=23592
This article was adapted from an original article by Yu.V. Prokhorov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article