Empirical process

From Encyclopedia of Mathematics
Latest revision as of 19:37, 5 June 2020


A stochastic process constructed from a sample and the corresponding probability measure. Let $ X _ {1} , \dots, X _ {n} , \dots $ be a sequence of independent random elements with common law $ P $, taking values in a measurable space $ ( S, {\mathcal S} ) $. The empirical measure $ P _ {n} $ of the first $ n $ elements $ X _ {1} , \dots, X _ {n} $ is the discrete random measure that places mass $ {1 / n } $ on each such $ X _ {i} $:

$$ P _ {n} ( C ) = { \frac{1}{n} } \# \left \{ {1 \leq i \leq n } : {X _ {i} \in C } \right \} , \quad C \in {\mathcal S}. $$

Obviously, $ n P _ {n} ( C ) $ is binomially distributed with parameters $ n $ and $ P ( C ) $ (cf. Binomial distribution). Hence $ {\mathsf E} P _ {n} ( C ) = P ( C ) $, $ {\mathsf P} ( \lim\limits _ {n \rightarrow \infty } P _ {n} ( C ) = P ( C ) ) = 1 $, and $ \sqrt n ( P _ {n} ( C ) - P ( C ) ) $ converges in distribution, as $ n \rightarrow \infty $, to a centred normal random variable with variance $ P ( C ) ( 1 - P ( C ) ) $ (cf. Convergence in distribution). Therefore it is natural to define an empirical process indexed by sets by

$$ \tag{a1 } \alpha _ {n} ( C ) = \sqrt n ( P _ {n} ( C ) - P ( C ) ) , \quad C \in {\mathcal C}, $$

where $ {\mathcal C} \subset {\mathcal S} $. If $ ( S, {\mathcal S} ) = ( \mathbf R, {\mathcal B} ) $ and $ {\mathcal C} = \{ {( - \infty,x ] } : {x \in \mathbf R } \} $, one writes $ F _ {n} ( x ) = P _ {n} ( ( - \infty,x ] ) $ for the empirical distribution function, and the empirical process specializes to the classical empirical process

$$ \tag{a2 } \alpha _ {n} ( x ) = \sqrt n ( F _ {n} ( x ) - F ( x ) ) , \quad x \in \mathbf R, $$

where $ F ( x ) = {\mathsf P} ( X _ {i} \leq x ) $, $ x \in \mathbf R $, is the distribution function of the elements $ X _ {i} $. Replacing sets by their indicator functions leads, more generally, to the definition of an empirical process indexed by functions:

$$ \tag{a3 } \alpha _ {n} ( f ) = \sqrt n ( P _ {n} ( f ) - P ( f ) ) , \quad f \in {\mathcal F}, $$

where

$$ P _ {n} ( f ) = \int\limits _ { S } f {d P _ {n} } = { \frac{1}{n} } \sum _ {i = 1 } ^ { n } f ( X _ {i} ) , $$

$$ P ( f ) = \int\limits _ { S } f {d P } = {\mathsf E} f ( X _ {i} ) , $$

and $ {\mathcal F} $ is a suitable class of measurable functions from $ S $ to $ \mathbf R $.
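
The definitions above translate directly into a few lines of code. The following is a minimal sketch, not a reference implementation: the function names are hypothetical, and the uniform law on $ [ 0,1 ] $ is chosen for $ P $ purely for illustration.

```python
import bisect
import math
import random


def empirical_measure(sample, indicator):
    """P_n(C): fraction of sample points x with indicator(x) true (x in C)."""
    return sum(1 for x in sample if indicator(x)) / len(sample)


def empirical_cdf(sample):
    """Return F_n, where F_n(x) = P_n((-inf, x])."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect.bisect_right(ordered, x) / n


def empirical_process(sample, cdf):
    """Return alpha_n, where alpha_n(x) = sqrt(n) * (F_n(x) - F(x))."""
    f_n = empirical_cdf(sample)
    root_n = math.sqrt(len(sample))
    return lambda x: root_n * (f_n(x) - cdf(x))


def empirical_integral(sample, f):
    """P_n(f) = (1/n) * sum of f(X_i) over the sample."""
    return sum(f(x) for x in sample) / len(sample)


def uniform_cdf(x):
    """F for the uniform law on [0, 1] (an assumption of this sketch)."""
    return min(max(x, 0.0), 1.0)


rng = random.Random(0)  # fixed seed so that the run is reproducible
sample = [rng.random() for _ in range(1000)]
alpha_n = empirical_process(sample, uniform_cdf)
print(alpha_n(0.5))                                 # one value of (a2)
print(empirical_integral(sample, lambda x: x * x))  # estimates P(f) = 1/3
```

Taking `f` to be the indicator of a set $ C $ in `empirical_integral` gives the same value as `empirical_measure`, which is the sense in which (a3) generalizes (a1).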

The main aim of the theory of empirical processes is to obtain results uniformly in $ C $, $ x $ or $ f $; in particular, Glivenko–Cantelli-type theorems, central limit theorems, laws of the iterated logarithm, and probability inequalities (cf., e.g., Empirical distribution; Central limit theorem; Law of the iterated logarithm). (Measurability issues will be disregarded in the sequel.) The concept of a Vapnik–Chervonenkis class plays an important role in set-indexed situations. E.g., if $ {\mathcal C} $ is a Vapnik–Chervonenkis class, then for every probability measure $ P $ on $ ( S, {\mathcal S} ) $,

$$ \tag{a4 } \sup _ {C \in {\mathcal C} } \left | {P _ {n} ( C ) - P ( C ) } \right | \rightarrow 0 \textrm{ a.s. } , $$

and $ \alpha _ {n} ( C ) $, $ C \in {\mathcal C} $, converges weakly (see [a10] and Weak topology) to $ B _ {P} ( C ) $, $ C \in {\mathcal C} $, a centred, bounded Gaussian process, which is uniformly continuous (with respect to the pseudometric $ d $ defined by $ d ( C _ {1} ,C _ {2} ) = P ( C _ {1} \Delta C _ {2} ) $) and has covariance structure

$$ {\mathsf E} B _ {P} ( C _ {1} ) B _ {P} ( C _ {2} ) = P ( C _ {1} \cap C _ {2} ) - P ( C _ {1} ) P ( C _ {2} ) , $$

$$ C _ {1} ,C _ {2} \in {\mathcal C}. $$
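
The half-lines $ ( - \infty,x ] $ form a Vapnik–Chervonenkis class, so (a4) applies to the empirical distribution function. As a hedged illustration, the sketch below computes $ \sup _ {x} | F _ {n} ( x ) - F ( x ) | $ exactly for a continuous $ F $; the helper names and the choice of the uniform law on $ [ 0,1 ] $ are assumptions made for the example.

```python
import random


def sup_distance(sample, cdf):
    """sup_x |F_n(x) - F(x)| for a continuous distribution function F.

    F_n is a step function that jumps only at the order statistics, so it
    suffices to compare F with the value of F_n just before and at each jump.
    """
    ordered = sorted(sample)
    n = len(ordered)
    worst = 0.0
    for i, x in enumerate(ordered, start=1):
        fx = cdf(x)
        worst = max(worst, i / n - fx, fx - (i - 1) / n)
    return worst


def uniform_cdf(x):
    """F for the uniform law on [0, 1] (an assumption of this sketch)."""
    return min(max(x, 0.0), 1.0)


rng = random.Random(1)  # fixed seed so that the run is reproducible
for n in (100, 1000, 10000):
    sample = [rng.random() for _ in range(n)]
    print(n, sup_distance(sample, uniform_cdf))
```

The printed distances should shrink as $ n $ grows, roughly at the rate $ 1 / \sqrt n $ suggested by the weak convergence of $ \alpha _ {n} $.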

For the classical empirical process in (a2), this limiting process specializes to $ B \circ F $, where $ B $ is a Brownian bridge (cf. Non-parametric methods in statistics). A sharp version of the first result is the following: (a4) holds if and only if

$$ \tag{a5 } {\mathsf P} \textrm{-} \lim\limits _ {n \rightarrow \infty } { \frac{ { \mathop{\rm log} } \Delta ^ {\mathcal C} ( X _ {1} , \dots, X _ {n} ) }{n} } = 0, $$

where

$$ \Delta ^ {\mathcal C} ( X _ {1} , \dots, X _ {n} ) = \# \left \{ {C \cap \{ X _ {1} , \dots, X _ {n} \} } : {C \in {\mathcal C} } \right \} $$

(see Vapnik–Chervonenkis class). A corresponding sharp version of the central limit theorem exists too; essentially the only change is that the $ n $ in the denominator of (a5) has to be replaced by $ \sqrt n $ to obtain an "if and only if" condition for the central limit theorem. Other useful concepts in connection with empirical processes are various notions of entropy, see [a12], [a13], [a9], [a10]. Also, for the function-indexed process in (a3), the analogues of (a4) and the central limit theorem above have been studied thoroughly, see [a5], [a9], [a10].
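
To make the quantity $ \Delta ^ {\mathcal C} $ in (a5) concrete, the sketch below counts the distinct traces $ C \cap \{ X _ {1} , \dots, X _ {n} \} $ by brute force for two simple classes on the real line, closed half-lines and closed intervals. The function names are hypothetical, and the enumeration relies on the fact that for these two classes only endpoints located at sample points (plus one threshold below the sample) can produce new traces.

```python
import math
import random


def half_line_traces(points):
    """Distinct sets of the form (-inf, t] ∩ points, t ranging over R."""
    thresholds = [min(points) - 1.0] + list(points)
    return {frozenset(p for p in points if p <= t) for t in thresholds}


def interval_traces(points):
    """Distinct sets of the form [a, b] ∩ points, a <= b ranging over R."""
    result = {frozenset()}  # an interval lying between sample points
    for a in points:
        for b in points:
            if a <= b:
                result.add(frozenset(p for p in points if a <= p <= b))
    return result


rng = random.Random(2)  # fixed seed so that the run is reproducible
for n in (10, 40, 160):
    points = [rng.random() for _ in range(n)]
    for name, count in (("half-lines", len(half_line_traces(points))),
                        ("intervals", len(interval_traces(points)))):
        print(n, name, count, math.log(count) / n)
```

For $ n $ distinct points the counts are $ n + 1 $ and $ n ( n + 1 ) / 2 + 1 $ respectively, so $ \Delta ^ {\mathcal C} $ grows only polynomially and the ratio in (a5) tends to zero.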

For the classical empirical process in (a2), approximation theorems which yield a rate of convergence in the central limit theorem are extremely useful: A sequence of Brownian bridges $ \{ {B _ {n} ( t ) } : {t \in [ 0,1 ] } \} $, $ n = 2,3, \dots $, can be constructed such that for all $ \lambda > 0 $

$$ {\mathsf P} \left ( \sup _ {x \in \mathbf R } \left | {\alpha _ {n} ( x ) - B _ {n} ( F ( x ) ) } \right | > { \frac{12 { \mathop{\rm log} } n + \lambda }{\sqrt n } } \right ) \leq 2e ^ {- {\lambda / 6 } } . $$

A similar, only slightly less sharp, result can be obtained for the situation where the joint distribution of the Brownian bridges $ B _ {n} $ is known, i.e., the $ B _ {n} $ are all defined by means of one single Kiefer process, see [a3].
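
The approximation inequality above is easy to evaluate numerically. The sketch below assumes that "log" denotes the natural logarithm; the function names are hypothetical. It solves $ 2e ^ {- {\lambda / 6 } } = \delta $ for $ \lambda $ and reports the resulting approximation error that is exceeded with probability at most $ \delta $.

```python
import math


def approximation_threshold(n, lam):
    """The error level (12 log n + lam) / sqrt(n) in the inequality."""
    return (12.0 * math.log(n) + lam) / math.sqrt(n)


def tail_bound(lam):
    """The right-hand side 2 * exp(-lam / 6) of the inequality."""
    return 2.0 * math.exp(-lam / 6.0)


def lam_for_level(delta):
    """The lam solving 2 * exp(-lam / 6) = delta, i.e. lam = 6 log(2 / delta)."""
    return 6.0 * math.log(2.0 / delta)


delta = 0.05
lam = lam_for_level(delta)
for n in (100, 10_000, 1_000_000):
    print(n, approximation_threshold(n, lam))
```

Because of the constant 12, the threshold is informative only for fairly large $ n $; its order of magnitude is $ { \mathop{\rm log} } n / \sqrt n $.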

Empirical and related processes have applications in many subfields of probability theory and (non-parametric) statistics.

References

[a1] K.S. Alexander, "Rates of growth and sample moduli for weighted empirical processes indexed by sets", Probab. Th. Rel. Fields, 75 (1987) pp. 379–423
[a2] M. Csörgő, S. Csörgő, L. Horváth, D.M. Mason, "Weighted empirical and quantile processes", Ann. of Probab., 14 (1986) pp. 31–85
[a3] M. Csörgő, P. Révész, "Strong approximations in probability and statistics", Acad. Press (1981)
[a4] P. Deheuvels, D.M. Mason, "Functional laws of the iterated logarithm for the increments of empirical and quantile processes", Ann. of Probab., 20 (1992) pp. 1248–1287
[a5] R.M. Dudley, "Universal Donsker classes and metric entropy", Ann. of Probab., 15 (1987) pp. 1306–1326
[a6] J.H.J. Einmahl, "The a.s. behavior of the weighted empirical process and the LIL for the weighted tail empirical process", Ann. of Probab., 20 (1992) pp. 681–695
[a7] E. Giné, "Empirical processes and applications: an overview", Bernoulli, 2 (1996) pp. 1–28
[a8] P. Massart, "The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality", Ann. of Probab., 18 (1990) pp. 1269–1283
[a9] D. Pollard, "Convergence of stochastic processes", Springer (1984)
[a10] A. Sheehy, J.A. Wellner, "Uniform Donsker classes of functions", Ann. of Probab., 20 (1992) pp. 1983–2030
[a11] G.R. Shorack, J.A. Wellner, "Empirical processes with applications to statistics", Wiley (1986)
[a12] K.S. Alexander, "Probability inequalities for empirical processes and a law of the iterated logarithm", Ann. of Probab., 12 (1984) pp. 1041–1067
[a13] K.S. Alexander, "Correction: Probability inequalities for empirical processes and a law of the iterated logarithm", Ann. of Probab., 15 (1987) pp. 428–430
How to Cite This Entry:
Empirical process. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Empirical_process&oldid=12746
This article was adapted from an original article by J.H.J. Einmahl (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.