Empirical process
A stochastic process constructed from a sample and the corresponding probability measure. Let $ X _ {1} \dots X _ {n} , \dots $
be a sequence of independent random elements with common law $ P $,
taking values in a measurable space $ ( S, {\mathcal S} ) $.
The empirical measure $ P _ {n} $
of the first $ n $
$ X _ {i} $
s is the discrete random measure that places mass $ {1 / n } $
on each such $ X _ {i} $:
$$ P _ {n} ( C ) = { \frac{1}{n} } \# \left \{ {1 \leq i \leq n } : {X _ {i} \in C } \right \} , \quad C \in {\mathcal S}. $$
Obviously, $ n P _ {n} ( C ) $ is binomially distributed with parameters $ n $ and $ P ( C ) $( cf. Binomial distribution). Hence $ {\mathsf E} P _ {n} ( C ) = P ( C ) $, $ {\mathsf P} ( {\lim\limits } _ {n \rightarrow \infty } P _ {n} ( C ) = P ( C ) ) = 1 $, and $ \sqrt n ( P _ {n} ( C ) - P ( C ) ) $ converges in distribution, as $ n \rightarrow \infty $, to a centred normal random variable with variance $ P ( C ) ( 1 - P ( C ) ) $( cf. Convergence in distribution). Therefore it is natural to define an empirical process indexed by sets by
$$ \tag{a1 } \alpha _ {n} ( C ) = \sqrt n ( P _ {n} ( C ) - P ( C ) ) , \quad C \in {\mathcal C}, $$
where $ {\mathcal C} \subset {\mathcal S} $. If $ ( S, {\mathcal S} ) = ( \mathbf R, {\mathcal B} ) $ and $ {\mathcal C} = \{ {( - \infty,x ] } : {x \in \mathbf R } \} $, one writes $ F _ {n} ( x ) = P _ {n} ( ( - \infty,x ] ) $ for the empirical distribution function, and the empirical process specializes to the classical empirical process
$$ \tag{a2 } \alpha _ {n} ( x ) = \sqrt n ( F _ {n} ( x ) - F ( x ) ) , \quad x \in \mathbf R, $$
where $ F ( x ) = {\mathsf P} ( X _ {i} \leq x ) $, $ x \in \mathbf R $, is the distribution function of the elements $ X _ {i} $. Replacing sets by their indicator functions leads, more generally, to the definition of an empirical process indexed by functions:
$$ \tag{a3 } \alpha _ {n} ( f ) = \sqrt n ( P _ {n} ( f ) - P ( f ) ) , \quad f \in {\mathcal F}, $$
where
$$ P _ {n} ( f ) = \int\limits _ { S } f {d P _ {n} } = { \frac{1}{n} } \sum _ {i = 1 } ^ { n } f ( X _ {i} ) , $$
$$ P ( f ) = \int\limits _ { S } f {d P } = {\mathsf E} f ( X _ {i} ) , $$
and $ {\mathcal F} $ is a suitable class of measurable functions from $ S $ to $ \mathbf R $.
The main aim of the theory of empirical processes is to obtain results uniformly in $ C $, $ x $ or $ f $; in particular, Glivenko–Cantelli-type theorems, central limit theorems, laws of the iterated logarithm, and probability inequalities (cf., e.g., Empirical distribution; Central limit theorem; Law of the iterated logarithm). (Measurability issues will be disregarded in the sequel.) The concept of a Vapnik–Chervonenkis class plays an important role in set-indexed situations. E.g., if $ {\mathcal C} $ is a Vapnik–Chervonenkis class, then for every probability measure $ P $ on $ ( S, {\mathcal S} ) $,
$$ \tag{a4 } \sup _ {C \in {\mathcal C} } \left | {P _ {n} ( C ) - P ( C ) } \right | \rightarrow 0 \textrm{ a.s. } , $$
and $ \alpha _ {n} ( C ) $, $ C \in {\mathcal C} $, converges weakly (see [a10] and Weak topology) to $ B _ {P} ( C ) $, $ C \in {\mathcal C} $, a centred, bounded Gaussian process, which is uniformly continuous (with respect to the pseudometric $ d $ defined by $ d ( C _ {1} ,C _ {2} ) = P ( C _ {1} \Delta C _ {2} ) $) and has covariance structure
$$ {\mathsf E} B _ {P} ( C _ {1} ) B _ {P} ( C _ {2} ) = P ( C _ {1} \cap C _ {2} ) - P ( C _ {1} ) P ( C _ {2} ) , $$
$$ C _ {1} ,C _ {2} \in {\mathcal C}. $$
For the classical empirical process in (a2), this limiting process specializes to $ B \circ F $, where $ B $ is a Brownian bridge (cf. Non-parametric methods in statistics). A sharp version of the first result is the following: (a4) holds if and only if
$$ \tag{a5 } {\mathsf P} roman \AAh {\lim\limits } { \frac{ { \mathop{\rm log} } \Delta ^ {\mathcal C} ( X _ {1} \dots X _ {n} ) }{n} } = 0, $$
where
$$ \Delta ^ {\mathcal C} ( X _ {1} \dots X _ {n} ) = \# \left \{ {C \cap \{ X _ {1} \dots X _ {n} \} } : {C \in {\mathcal C} } \right \} $$
(see Vapnik–Chervonenkis class). A corresponding sharp version of the central limit theorem exists too; essentially the only change is that the $ n $ in the denominator of (a5) has to be replaced by $ \sqrt n $ to obtain an "if and only if" condition for the central limit theorem. Other useful concepts in connection with empirical processes are various notions of entropy, see [a12], [a13], [a9], [a10]. Also, for the function-indexed process in (a3), the analogues of (a4) and the central limit theorem above have been studied thoroughly, see [a5], [a9], [a10].
For the classical empirical process in (a2), approximation theorems which yield a rate of convergence in the central limit theorem are extremely useful: A sequence of Brownian bridges $ \{ {B _ {n} ( t ) } : {t \in [ 0,1 ] } \} $, $ n = 2,3, \dots $, can be constructed such that for all $ \lambda > 0 $
$$ {\mathsf P} \left ( \sup _ {x \in \mathbf R } \left | {\alpha _ {n} ( x ) - B _ {n} ( F ( x ) ) } \right | > { \frac{12 { \mathop{\rm log} } n + \lambda }{\sqrt n } } \right ) \leq 2e ^ {- {\lambda / 6 } } . $$
A similar, only slightly less sharp, result can be obtained for the situation where the joint distribution of the $ B _ {n} $ s is known, i.e., the $ B _ {n} $ s are defined by means of one single Kiefer process, see [a3].
Empirical and related processes have many applications in many different subfields of probability theory and (non-parametric) statistics.
References
[a1] | K.S. Alexander, "Rates of growth and sample moduli for weighted empirical processes indexed by sets" Probab. Th. Rel. Fields , 75 (1987) pp. 379–423 |
[a2] | M. Csörgő, S. Csörgő, L. Horváth, D.M. Mason, "Weighted empirical and quantile processes" Ann. of Probab. , 14 (1986) pp. 31–85 |
[a3] | M. Csörgő, P. Révész, "Strong approximations in probability and statistics" , Acad. Press (1981) |
[a4] | P. Deheuvels, D.M. Mason, "Functional laws of the iterated logarithm for the increments of empirical and quantile processes" Ann. of Probab. , 20 (1992) pp. 1248–1287 |
[a5] | R.M. Dudley, "Universal Donsker classes and metric entropy" Ann. of Probab. , 15 (1987) pp. 1306–1326 |
[a6] | J.H.J. Einmahl, "The a.s. behavior of the weighted empirical process and the LIL for the weighted tail empirical process" Ann. of Probab. , 20 (1992) pp. 681–695 |
[a7] | E. Giné, "Empirical processes and applications: an overview" Bernoulli , 2 (1996) pp. 1–28 |
[a8] | P. Massart, "The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality" Ann. of Probab. , 18 (1990) pp. 1269–1283 |
[a9] | D. Pollard, "Convergence of stochastic processes" , Springer (1984) |
[a10] | A. Sheehy, J.A. Wellner, "Uniform Donsker classes of functions" Ann. of Probab. , 20 (1992) pp. 1983–2030 |
[a11] | G.R. Shorack, J.A. Wellner, "Empirical processes with applications to statistics" , Wiley (1986) |
[a12] | K.S. Alexander, "Probability inequalities for empirical processes and a law of the iterated logarithm" Ann. of Probab. , 12 (1984) pp. 1041–1067 |
[a13] | K.S. Alexander, "Correction: Probability inequalities for empirical processes and a law of the iterated logarithm" Ann. of Probab. , 15 (1987) pp. 428–430 |
Empirical process. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Empirical_process&oldid=46819