Difference between revisions of "Empirical process"

Latest revision as of 19:37, 5 June 2020

A stochastic process constructed from a sample and the corresponding probability measure. Let $ X _ {1} \dots X _ {n} , \dots $ be a sequence of independent random elements with common law $ P $, taking values in a measurable space $ ( S, {\mathcal S} ) $. The empirical measure $ P _ {n} $ of the first $ n $ $ X _ {i} $ s is the discrete random measure that places mass $ {1 / n } $ on each such $ X _ {i} $:

$$ P _ {n} ( C ) = { \frac{1}{n} } \# \left \{ {1 \leq i \leq n } : {X _ {i} \in C } \right \} , \quad C \in {\mathcal S}. $$

Obviously, $ n P _ {n} ( C ) $ is binomially distributed with parameters $ n $ and $ P ( C ) $( cf. Binomial distribution). Hence $ {\mathsf E} P _ {n} ( C ) = P ( C ) $, $ {\mathsf P} ( {\lim\limits } _ {n \rightarrow \infty } P _ {n} ( C ) = P ( C ) ) = 1 $, and $ \sqrt n ( P _ {n} ( C ) - P ( C ) ) $ converges in distribution, as $ n \rightarrow \infty $, to a centred normal random variable with variance $ P ( C ) ( 1 - P ( C ) ) $( cf. Convergence in distribution). Therefore it is natural to define an empirical process indexed by sets by

$$ \tag{a1 } \alpha _ {n} ( C ) = \sqrt n ( P _ {n} ( C ) - P ( C ) ) , \quad C \in {\mathcal C}, $$

where $ {\mathcal C} \subset {\mathcal S} $. If $ ( S, {\mathcal S} ) = ( \mathbf R, {\mathcal B} ) $ and $ {\mathcal C} = \{ {( - \infty,x ] } : {x \in \mathbf R } \} $, one writes $ F _ {n} ( x ) = P _ {n} ( ( - \infty,x ] ) $ for the empirical distribution function, and the empirical process specializes to the classical empirical process

$$ \tag{a2 } \alpha _ {n} ( x ) = \sqrt n ( F _ {n} ( x ) - F ( x ) ) , \quad x \in \mathbf R, $$

where $ F ( x ) = {\mathsf P} ( X _ {i} \leq x ) $, $ x \in \mathbf R $, is the distribution function of the elements $ X _ {i} $. Replacing sets by their indicator functions leads, more generally, to the definition of an empirical process indexed by functions:

$$ \tag{a3 } \alpha _ {n} ( f ) = \sqrt n ( P _ {n} ( f ) - P ( f ) ) , \quad f \in {\mathcal F}, $$

where

$$ P _ {n} ( f ) = \int\limits _ { S } f {d P _ {n} } = { \frac{1}{n} } \sum _ {i = 1 } ^ { n } f ( X _ {i} ) , $$

$$ P ( f ) = \int\limits _ { S } f {d P } = {\mathsf E} f ( X _ {i} ) , $$

and $ {\mathcal F} $ is a suitable class of measurable functions from $ S $ to $ \mathbf R $.

The main aim of the theory of empirical processes is to obtain results uniformly in $ C $, $ x $ or $ f $; in particular, Glivenko–Cantelli-type theorems, central limit theorems, laws of the iterated logarithm, and probability inequalities (cf., e.g., Empirical distribution; Central limit theorem; Law of the iterated logarithm). (Measurability issues will be disregarded in the sequel.) The concept of a Vapnik–Chervonenkis class plays an important role in set-indexed situations. E.g., if $ {\mathcal C} $ is a Vapnik–Chervonenkis class, then for every probability measure $ P $ on $ ( S, {\mathcal S} ) $,

$$ \tag{a4 } \sup _ {C \in {\mathcal C} } \left | {P _ {n} ( C ) - P ( C ) } \right | \rightarrow 0 \textrm{ a.s. } , $$

and $ \alpha _ {n} ( C ) $, $ C \in {\mathcal C} $, converges weakly (see [a10] and Weak topology) to $ B _ {P} ( C ) $, $ C \in {\mathcal C} $, a centred, bounded Gaussian process, which is uniformly continuous (with respect to the pseudometric $ d $ defined by $ d ( C _ {1} ,C _ {2} ) = P ( C _ {1} \Delta C _ {2} ) $) and has covariance structure

$$ {\mathsf E} B _ {P} ( C _ {1} ) B _ {P} ( C _ {2} ) = P ( C _ {1} \cap C _ {2} ) - P ( C _ {1} ) P ( C _ {2} ) , $$

$$ C _ {1} ,C _ {2} \in {\mathcal C}. $$

For the classical empirical process in (a2), this limiting process specializes to $ B \circ F $, where $ B $ is a Brownian bridge (cf. Non-parametric methods in statistics). A sharp version of the first result is the following: (a4) holds if and only if

$$ \tag{a5 } {\mathsf P} roman \AAh {\lim\limits } { \frac{ { \mathop{\rm log} } \Delta ^ {\mathcal C} ( X _ {1} \dots X _ {n} ) }{n} } = 0, $$

where

$$ \Delta ^ {\mathcal C} ( X _ {1} \dots X _ {n} ) = \# \left \{ {C \cap \{ X _ {1} \dots X _ {n} \} } : {C \in {\mathcal C} } \right \} $$

(see Vapnik–Chervonenkis class). A corresponding sharp version of the central limit theorem exists too; essentially the only change is that the $ n $ in the denominator of (a5) has to be replaced by $ \sqrt n $ to obtain an "if and only if" condition for the central limit theorem. Other useful concepts in connection with empirical processes are various notions of entropy, see [a12], [a13], [a9], [a10]. Also, for the function-indexed process in (a3), the analogues of (a4) and the central limit theorem above have been studied thoroughly, see [a5], [a9], [a10].

For the classical empirical process in (a2), approximation theorems which yield a rate of convergence in the central limit theorem are extremely useful: A sequence of Brownian bridges $ \{ {B _ {n} ( t ) } : {t \in [ 0,1 ] } \} $, $ n = 2,3, \dots $, can be constructed such that for all $ \lambda > 0 $

$$ {\mathsf P} \left ( \sup _ {x \in \mathbf R } \left | {\alpha _ {n} ( x ) - B _ {n} ( F ( x ) ) } \right | > { \frac{12 { \mathop{\rm log} } n + \lambda }{\sqrt n } } \right ) \leq 2e ^ {- {\lambda / 6 } } . $$

A similar, only slightly less sharp, result can be obtained for the situation where the joint distribution of the $ B _ {n} $ s is known, i.e., the $ B _ {n} $ s are defined by means of one single Kiefer process, see [a3].

Empirical and related processes have many applications in many different subfields of probability theory and (non-parametric) statistics.

References

[a1]	K.S. Alexander, "Rates of growth and sample moduli for weighted empirical processes indexed by sets" Probab. Th. Rel. Fields , 75 (1987) pp. 379–423
[a2]	M. Csörgő, S. Csörgő, L. Horváth, D.M. Mason, "Weighted empirical and quantile processes" Ann. of Probab. , 14 (1986) pp. 31–85
[a3]	M. Csörgő, P. Révész, "Strong approximations in probability and statistics" , Acad. Press (1981)
[a4]	P. Deheuvels, D.M. Mason, "Functional laws of the iterated logarithm for the increments of empirical and quantile processes" Ann. of Probab. , 20 (1992) pp. 1248–1287
[a5]	R.M. Dudley, "Universal Donsker classes and metric entropy" Ann. of Probab. , 15 (1987) pp. 1306–1326
[a6]	J.H.J. Einmahl, "The a.s. behavior of the weighted empirical process and the LIL for the weighted tail empirical process" Ann. of Probab. , 20 (1992) pp. 681–695
[a7]	E. Giné, "Empirical processes and applications: an overview" Bernoulli , 2 (1996) pp. 1–28
[a8]	P. Massart, "The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality" Ann. of Probab. , 18 (1990) pp. 1269–1283
[a9]	D. Pollard, "Convergence of stochastic processes" , Springer (1984)
[a10]	A. Sheehy, J.A. Wellner, "Uniform Donsker classes of functions" Ann. of Probab. , 20 (1992) pp. 1983–2030
[a11]	G.R. Shorack, J.A. Wellner, "Empirical processes with applications to statistics" , Wiley (1986)
[a12]	K.S. Alexander, "Probability inequalities for empirical processes and a law of the iterated logarithm" Ann. of Probab. , 12 (1984) pp. 1041–1067
[a13]	K.S. Alexander, "Correction: Probability inequalities for empirical processes and a law of the iterated logarithm" Ann. of Probab. , 15 (1987) pp. 428–430

How to Cite This Entry:
Empirical process. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Empirical_process&oldid=12746

This article was adapted from an original article by J.H.J. Einmahl (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article

Navigation

Tools

Namespaces

Variants

Views

Actions

Difference between revisions of "Empirical process"

Latest revision as of 19:37, 5 June 2020

References

@@ Line 1: / Line 1: @@
-A [[Stochastic process|stochastic process]] constructed from a [[Sample|sample]] and the corresponding [[Probability measure|probability measure]]. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e1100801.png" /> be a sequence of independent random elements with common law <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e1100802.png" />, taking values in a [[Measurable space|measurable space]] <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e1100803.png" />. The empirical measure <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e1100804.png" /> of the first <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e1100805.png" /> <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e1100806.png" />s is the discrete random measure that places mass <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e1100807.png" /> on each such <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e1100808.png" />:
+<!--
+e1100801.png
+$#A+1 = 59 n = 0
+$#C+1 = 59 : ~/encyclopedia/old_files/data/E110/E.1100080 Empirical process
+Automatically converted into TeX, above some diagnostics.
+Please remove this comment and the {{TEX|auto}} line below,
+if TeX found to be correct.
+-->
-<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e1100809.png" /></td> </tr></table>
+{{TEX|auto}}
+{{TEX|done}}
-Obviously, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008010.png" /> is binomially distributed with parameters <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008011.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008012.png" /> (cf. [[Binomial distribution|Binomial distribution]]). Hence <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008013.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008014.png" />, and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008015.png" /> converges in distribution, as <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008016.png" />, to a centred normal random variable with variance <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008017.png" /> (cf. [[Convergence in distribution|Convergence in distribution]]). Therefore it is natural to define an empirical process indexed by sets by
+A [[Stochastic process|stochastic process]] constructed from a [[Sample|sample]] and the corresponding [[Probability measure|probability measure]]. Let  $  X _ {1} \dots X _ {n} , \dots $
+be a sequence of independent random elements with common law  $  P $,
+taking values in a [[Measurable space|measurable space]]  $  ( S, {\mathcal S} ) $.
+The empirical measure  $  P _ {n} $
+of the first  $  n $
+$  X _ {i} $
+s is the discrete random measure that places mass  $  {1 / n } $
+on each such  $  X _ {i} $:
-<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008018.png" /></td> <td valign="top" style="width:5%;text-align:right;">(a1)</td></tr></table>
+$$
+P _ {n} ( C ) = {
+\frac{1}{n}
+ } \# \left \{ {1 \leq  i \leq  n } : {X _ {i} \in C } \right \} , \quad C \in {\mathcal S}.
+$$
-where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008019.png" />. If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008020.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008021.png" />, one writes <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008022.png" /> for the empirical distribution function, and the empirical process specializes to the classical empirical process
+Obviously,  $  n P _ {n} ( C ) $
+is binomially distributed with parameters  $  n $
+and  $  P ( C ) $(
+cf. [[Binomial distribution|Binomial distribution]]). Hence  $  {\mathsf E} P _ {n} ( C ) = P ( C ) $,
+$  {\mathsf P} ( {\lim\limits } _ {n \rightarrow \infty }  P _ {n} ( C ) = P ( C ) ) = 1 $,
+and  $  \sqrt n ( P _ {n} ( C ) - P ( C ) ) $
+converges in distribution, as  $  n \rightarrow \infty $,
+to a centred normal random variable with variance  $  P ( C ) ( 1 - P ( C ) ) $(
+cf. [[Convergence in distribution|Convergence in distribution]]). Therefore it is natural to define an empirical process indexed by sets by
-<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008023.png" /></td> <td valign="top" style="width:5%;text-align:right;">(a2)</td></tr></table>
+$$ \tag{a1 }
+\alpha _ {n} ( C ) = \sqrt n ( P _ {n} ( C ) - P ( C ) ) , \quad C \in {\mathcal C},
+$$
-where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008024.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008025.png" />, is the distribution function of the elements <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008026.png" />. Replacing sets by their indicator functions leads, more generally, to the definition of an empirical process indexed by functions:
+where  $  {\mathcal C} \subset  {\mathcal S} $.
+If  $  ( S, {\mathcal S} ) = ( \mathbf R, {\mathcal B} ) $
+and  $  {\mathcal C} = \{ {( - \infty,x ] } : {x \in \mathbf R } \} $,
+one writes  $  F _ {n} ( x ) = P _ {n} ( ( - \infty,x ] ) $
+for the empirical distribution function, and the empirical process specializes to the classical empirical process
-<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008027.png" /></td> <td valign="top" style="width:5%;text-align:right;">(a3)</td></tr></table>
+$$ \tag{a2 }
+\alpha _ {n} ( x ) = \sqrt n ( F _ {n} ( x ) - F ( x ) ) , \quad x \in \mathbf R,
+$$
+where  $  F ( x ) = {\mathsf P} ( X _ {i} \leq  x ) $,
+$  x \in \mathbf R $,
+is the distribution function of the elements  $  X _ {i} $.
+Replacing sets by their indicator functions leads, more generally, to the definition of an empirical process indexed by functions:
+$$ \tag{a3 }
+\alpha _ {n} ( f ) = \sqrt n ( P _ {n} ( f ) - P ( f ) ) , \quad f \in {\mathcal F},
+$$
 where
-<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008028.png" /></td> </tr></table>
+$$
+P _ {n} ( f ) = \int\limits _ { S } f  {d P _ {n} } = {
+\frac{1}{n}
+ } \sum _ {i = 1 } ^ { n }  f ( X _ {i} ) ,
+$$
-<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008029.png" /></td> </tr></table>
+$$
+P ( f ) = \int\limits _ { S } f  {d P } = {\mathsf E} f ( X _ {i} ) ,
+$$
-and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008030.png" /> is a suitable class of measurable functions from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008031.png" /> to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008032.png" />.
+and  $  {\mathcal F} $
+is a suitable class of measurable functions from  $  S $
+to  $  \mathbf R $.
-The main aim of the theory of empirical processes is to obtain results uniformly in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008033.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008034.png" /> or <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008035.png" />; in particular, Glivenko–Cantelli-type theorems, central limit theorems, laws of the iterated logarithm, and probability inequalities (cf., e.g., [[Empirical distribution|Empirical distribution]]; [[Central limit theorem|Central limit theorem]]; [[Law of the iterated logarithm|Law of the iterated logarithm]]). (Measurability issues will be disregarded in the sequel.) The concept of a [[Vapnik–Chervonenkis class|Vapnik–Chervonenkis class]] plays an important role in set-indexed situations. E.g., if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008036.png" /> is a Vapnik–Chervonenkis class, then for every probability measure <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008037.png" /> on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008038.png" />,
+The main aim of the theory of empirical processes is to obtain results uniformly in  $  C $,
+$  x $
+or  $  f $;
+in particular, Glivenko–Cantelli-type theorems, central limit theorems, laws of the iterated logarithm, and probability inequalities (cf., e.g., [[Empirical distribution|Empirical distribution]]; [[Central limit theorem|Central limit theorem]]; [[Law of the iterated logarithm|Law of the iterated logarithm]]). (Measurability issues will be disregarded in the sequel.) The concept of a [[Vapnik–Chervonenkis class|Vapnik–Chervonenkis class]] plays an important role in set-indexed situations. E.g., if  $  {\mathcal C} $
+is a Vapnik–Chervonenkis class, then for every probability measure  $  P $
+on  $  ( S, {\mathcal S} ) $,
-<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008039.png" /></td> <td valign="top" style="width:5%;text-align:right;">(a4)</td></tr></table>
+$$ \tag{a4 }
+ \sup  _ {C \in {\mathcal C} } \left | {P _ {n} ( C ) - P ( C ) } \right | \rightarrow 0  \textrm{ a.s.  } ,
+$$
-and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008040.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008041.png" />, converges weakly (see [[#References|[a10]]] and [[Weak topology|Weak topology]]) to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008042.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008043.png" />, a centred, bounded [[Gaussian process|Gaussian process]], which is uniformly continuous (with respect to the pseudometric <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008044.png" /> defined by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008045.png" />) and has covariance structure
+and  $  \alpha _ {n} ( C ) $,
+$  C \in {\mathcal C} $,
+converges weakly (see [[#References|[a10]]] and [[Weak topology|Weak topology]]) to  $  B _ {P} ( C ) $,
+$  C \in {\mathcal C} $,
+a centred, bounded [[Gaussian process|Gaussian process]], which is uniformly continuous (with respect to the pseudometric  $  d $
+defined by  $  d ( C _ {1} ,C _ {2} ) = P ( C _ {1} \Delta C _ {2} ) $)
+and has covariance structure
-<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008046.png" /></td> </tr></table>
+$$
+{\mathsf E} B _ {P} ( C _ {1} ) B _ {P} ( C _ {2} ) = P ( C _ {1} \cap C _ {2} ) - P ( C _ {1} ) P ( C _ {2} ) ,
+$$
-<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008047.png" /></td> </tr></table>
+$$
+C _ {1} ,C _ {2} \in {\mathcal C}.
+$$
-For the classical empirical process in (a2), this limiting process specializes to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008048.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008049.png" /> is a Brownian bridge (cf. [[Non-parametric methods in statistics|Non-parametric methods in statistics]]). A sharp version of the first result is the following: (a4) holds if and only if
+For the classical empirical process in (a2), this limiting process specializes to  $  B \circ F $,
+where  $  B $
+is a Brownian bridge (cf. [[Non-parametric methods in statistics|Non-parametric methods in statistics]]). A sharp version of the first result is the following: (a4) holds if and only if
-<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008050.png" /></td> <td valign="top" style="width:5%;text-align:right;">(a5)</td></tr></table>
+$$ \tag{a5 }
+{\mathsf P} roman \AAh {\lim\limits } {
+\frac{ { \mathop{\rm log} } \Delta  ^  {\mathcal C}  ( X _ {1} \dots X _ {n} ) }{n}
+ } = 0,
+$$
 where
-<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008051.png" /></td> </tr></table>
+$$
+\Delta  ^  {\mathcal C}  ( X _ {1} \dots X _ {n} ) = \# \left \{ {C \cap \{ X _ {1} \dots X _ {n} \} } : {C \in {\mathcal C} } \right \}
+$$
-(see [[Vapnik–Chervonenkis class|Vapnik–Chervonenkis class]]). A corresponding sharp version of the [[Central limit theorem|central limit theorem]] exists too; essentially the only change is that the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008052.png" /> in the denominator of (a5) has to be replaced by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008053.png" /> to obtain an  "if and only if"  condition for the central limit theorem. Other useful concepts in connection with empirical processes are various notions of [[Entropy|entropy]], see [[#References|[a12]]], [[#References|[a13]]], [[#References|[a9]]], [[#References|[a10]]]. Also, for the function-indexed process in (a3), the analogues of (a4) and the central limit theorem above have been studied thoroughly, see [[#References|[a5]]], [[#References|[a9]]], [[#References|[a10]]].
+(see [[Vapnik–Chervonenkis class|Vapnik–Chervonenkis class]]). A corresponding sharp version of the [[Central limit theorem|central limit theorem]] exists too; essentially the only change is that the  $  n $
+in the denominator of (a5) has to be replaced by  $  \sqrt n $
+to obtain an  "if and only if"  condition for the central limit theorem. Other useful concepts in connection with empirical processes are various notions of [[Entropy|entropy]], see [[#References|[a12]]], [[#References|[a13]]], [[#References|[a9]]], [[#References|[a10]]]. Also, for the function-indexed process in (a3), the analogues of (a4) and the central limit theorem above have been studied thoroughly, see [[#References|[a5]]], [[#References|[a9]]], [[#References|[a10]]].
-For the classical empirical process in (a2), approximation theorems which yield a rate of convergence in the central limit theorem are extremely useful: A sequence of Brownian bridges <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008054.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008055.png" />, can be constructed such that for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008056.png" />
+For the classical empirical process in (a2), approximation theorems which yield a rate of convergence in the central limit theorem are extremely useful: A sequence of Brownian bridges  $  \{ {B _ {n} ( t ) } : {t \in [ 0,1 ] } \} $,
+$  n = 2,3, \dots $,
+can be constructed such that for all  $  \lambda > 0 $
-<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008057.png" /></td> </tr></table>
+$$
+{\mathsf P} \left (  \sup  _ {x \in \mathbf R } \left | {\alpha _ {n} ( x ) - B _ {n} ( F ( x ) ) } \right | > {
+\frac{12 { \mathop{\rm log} } n + \lambda }{\sqrt n }
+ } \right ) \leq  2e ^ {- {\lambda / 6 } } .
+$$
-A similar, only slightly less sharp, result can be obtained for the situation where the joint distribution of the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008058.png" />s is known, i.e., the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/e/e110/e110080/e11008059.png" />s are defined by means of one single Kiefer process, see [[#References|[a3]]].
+A similar, only slightly less sharp, result can be obtained for the situation where the joint distribution of the  $  B _ {n} $
+s is known, i.e., the  $  B _ {n} $
+s are defined by means of one single Kiefer process, see [[#References|[a3]]].
 Empirical and related processes have many applications in many different subfields of probability theory and (non-parametric) statistics.