# U-statistic

A statistic given by a sum of the form

$$U _ {n} ^ {m} ( \Phi ) = \left ( \begin{array}{c} n \\ m \end{array} \right ) ^ {- 1 } \sum _ {1 \leq i _ {1} < \dots < i _ {m} \leq n } \Phi ( X _ {i _ {1} } \dots X _ {i _ {m} } ) .$$

Hoeffding's form of a $U$-statistic is [a1]:

$$U _ {n} ^ {m} ( \Phi ) : = n ^ {- [ m ] } \sum _ {1 \leq j _ {1} \neq \dots \neq j _ {m} \leq n } \Phi ( X _ {j _ {1} } \dots X _ {j _ {m} } ) .$$

The kernel of a $U$-statistic, $\Phi : {X ^ {m} } \rightarrow \mathbf R$, is a symmetric real-valued function of $m$ variables. The random variables $X _ {1} \dots X _ {n}$ (cf. also Random variable) are independent and identically distributed with common distribution ${\mathsf P} ( A )$, $A \in {\mathcal X}$, on a measurable space $( X, {\mathcal X} )$. The number $m \leq n$ is called the degree of the $U$-statistic. The number of terms in the sum is equal to

$$\left ( \begin{array}{c} n \\ m \end{array} \right ) = { \frac{n! }{m! ( n - m ) ! } }$$

in the first sum and to

$$n ^ {[ m ] } = { \frac{n! }{( n - m ) ! } } = n ( n - 1 ) \dots ( n - m + 1 )$$

in the second sum. Also, $n ^ {- [ m ] } = {1 / {n ^ {[ m ] } } }$.
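The two forms above can be compared numerically. The following is a minimal Python sketch, not part of the original article: the function names `u_statistic` and `u_statistic_hoeffding` and the absolute-difference kernel are illustrative choices. It enumerates all index sets by brute force, so it is only practical for small $n$ and $m$.

```python
from itertools import combinations, permutations
from math import comb, perm


def u_statistic(kernel, sample, m):
    """First form: average the kernel over all index sets i_1 < ... < i_m."""
    n = len(sample)
    total = sum(kernel(*subset) for subset in combinations(sample, m))
    return total / comb(n, m)  # comb(n, m) = n! / (m! (n - m)!)


def u_statistic_hoeffding(kernel, sample, m):
    """Hoeffding's form: average over all ordered m-tuples of distinct indices."""
    n = len(sample)
    total = sum(kernel(*ordered) for ordered in permutations(sample, m))
    return total / perm(n, m)  # perm(n, m) = n! / (n - m)! = n^{[m]}


def abs_difference(x1, x2):
    """Illustrative symmetric kernel of degree 2 (an assumption of this sketch)."""
    return abs(x1 - x2)


sample = [1.0, 4.0, 2.0, 7.0]
first_form = u_statistic(abs_difference, sample, 2)
second_form = u_statistic_hoeffding(abs_difference, sample, 2)
print(first_form, second_form)
```

Because the kernel is symmetric, each unordered index set is counted $m!$ times in the second sum, and the two normalizations differ by the same factor $m!$. For the sample shown, both forms equal $10/3$ up to floating-point rounding.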

Various statistics can be represented as $U$-statistics or can be approximated by $U$-statistics with a suitable choice of the kernel $\Phi$. For example, the sample variance

$$S _ {n} = { \frac{1}{n - 1 } } \sum _ {i = 1 } ^ { n } ( X _ {i} - {\overline{x}\; } ) ^ {2} = U _ {n} ^ {2} ( \Phi )$$

can be obtained using the kernel $\Phi ( x _ {1} , x _ {2} ) = { {( x _ {1} - x _ {2} ) ^ {2} } / 2 }$. Here,

$${\overline{x}\; } = { \frac{1}{n} } \sum _ {i = 1 } ^ { n } X _ {i}$$

is the mean value of the sample. The von Mises functional, given by

$$V _ {n} ^ {m} ( \Phi ) = n ^ {- m } \sum _ {i _ {1} , \dots , i _ {m} = 1 } ^ { n } \Phi ( X _ {i _ {1} } \dots X _ {i _ {m} } ) = \int\limits _ {X ^ {m} } \Phi ( x _ {1} \dots x _ {m} ) \prod _ {i = 1 } ^ { m } \Pi _ {n} ( dx _ {i} ) ,$$

where

$$\Pi _ {n} ( dx ) = { \frac{1}{n} } \sum _ {i = 1 } ^ { n } \delta _ {X _ {i} } ( dx )$$

is the empirical distribution, can be represented by a linear combination of $U$-statistics [a2]. For the primitive kernel $\Phi _ {m} ( x _ {1} \dots x _ {m} ) = \prod _ {i = 1 } ^ {m} \varphi ( x _ {i} )$, the $U$-statistic

$$U _ {n} ^ {m} ( \Phi _ {m} ) = n ^ {- [ m ] } \sum _ {1 \leq j _ {1} \neq \dots \neq j _ {m} \leq n } \prod _ {c = 1 } ^ { m } \varphi ( X _ {j _ {c} } )$$

is a symmetric polynomial statistic of the random variables $y _ {k} = \varphi ( X _ {k} )$, $1 \leq k \leq n$.
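The sample-variance identity and the relation between the von Mises functional and $U$-statistics can be checked on a small data set. This is a minimal sketch under stated assumptions: the helper names and the data are illustrative, and only degree $m = 2$ is handled. For the kernel $( x _ {1} - x _ {2} ) ^ {2} / 2$ the diagonal terms vanish, so the linear combination reduces to the single term $V _ {n} ^ {2} = ( ( n - 1 ) / n ) U _ {n} ^ {2}$.

```python
from itertools import combinations, product
from math import comb
from statistics import variance


def half_squared_difference(x1, x2):
    """Kernel whose U-statistic is the sample variance."""
    return (x1 - x2) ** 2 / 2


def u_statistic_2(kernel, sample):
    """Degree-2 U-statistic: average over pairs with i_1 < i_2."""
    n = len(sample)
    return sum(kernel(a, b) for a, b in combinations(sample, 2)) / comb(n, 2)


def v_statistic_2(kernel, sample):
    """Degree-2 von Mises functional: average over all n^2 index pairs."""
    n = len(sample)
    return sum(kernel(a, b) for a, b in product(sample, repeat=2)) / n ** 2


sample = [2.0, 4.0, 4.0, 5.0, 10.0]  # illustrative data
n = len(sample)
u_value = u_statistic_2(half_squared_difference, sample)
v_value = v_statistic_2(half_squared_difference, sample)

print(u_value, variance(sample))      # U-statistic vs. sample variance
print(v_value, (n - 1) / n * u_value)  # V-statistic vs. rescaled U-statistic
```

For a kernel that does not vanish on the diagonal, the von Mises functional would pick up additional lower-degree $U$-statistics built from the diagonal terms $\Phi ( x, x )$.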

The starting point of the analysis of $U$-statistics is the Hoeffding decomposition [a1]:

$$U _ {n} ^ {m} ( \Phi ) = {\mathsf E} \Phi + \sum _ {c = r } ^ { m } \left ( \begin{array}{c} m \\ c \end{array} \right ) U _ {n} ^ {c} ( g _ {c} ) ,$$

where $g _ {c} = g _ {c} ( x _ {1} \dots x _ {c} )$, $r \leq c \leq m$, are completely degenerate kernels, that is, ${\mathsf E} g _ {c} ( x _ {1} \dots x _ {c - 1 } , X _ {c} ) = 0$ for all $x _ {1} \dots x _ {c - 1 }$; in particular, ${\mathsf E} g _ {c} ( X _ {1} \dots X _ {c} ) = 0$. The integer $r \geq 1$ is called the rank of the $U$-statistic. Here, by definition, ${\mathsf E} \Phi$ is the mean value of the kernel and, also, ${\mathsf E} U _ {n} ^ {m} ( \Phi ) = {\mathsf E} \Phi$. Therefore, a $U$-statistic is an unbiased estimator of the functional $\theta = {\mathsf E} \Phi$.
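As an illustration of the decomposition, consider the sample-variance kernel $\Phi ( x _ {1} , x _ {2} ) = ( x _ {1} - x _ {2} ) ^ {2} / 2$ for a distribution with mean $\mu$ and variance $\sigma ^ {2}$. A direct calculation (added here, not taken from the references) gives $\theta = \sigma ^ {2}$, $g _ {1} ( x ) = ( ( x - \mu ) ^ {2} - \sigma ^ {2} ) / 2$ and $g _ {2} ( x _ {1} , x _ {2} ) = - ( x _ {1} - \mu ) ( x _ {2} - \mu )$. The sketch below checks the resulting identity numerically. The values `MU` and `SIGMA2` and the data are assumptions of the example, and the identity itself holds for any data because it is algebraic.

```python
from itertools import combinations
from math import comb

MU, SIGMA2 = 0.5, 2.0  # assumed mean and variance of the underlying distribution


def phi(x1, x2):
    """Sample-variance kernel of degree 2."""
    return (x1 - x2) ** 2 / 2


def g1(x):
    """First Hoeffding component: E Phi(x, X) - theta."""
    return ((x - MU) ** 2 - SIGMA2) / 2


def g2(x1, x2):
    """Second Hoeffding component: Phi - theta - g1(x1) - g1(x2)."""
    return -(x1 - MU) * (x2 - MU)


def u_stat(kernel, sample, m):
    """Brute-force U-statistic of degree m."""
    n = len(sample)
    return sum(kernel(*s) for s in combinations(sample, m)) / comb(n, m)


sample = [0.3, -1.2, 2.5, 0.7, -0.4]  # illustrative data
lhs = u_stat(phi, sample, 2)
# theta + C(2,1) * U_n^1(g1) + C(2,2) * U_n^2(g2)
rhs = SIGMA2 + comb(2, 1) * u_stat(g1, sample, 1) + comb(2, 2) * u_stat(g2, sample, 2)
print(lhs, rhs)
```

Here $g _ {1}$ is not identically zero (unless the distribution is such that $( X - \mu ) ^ {2}$ is constant), so this kernel has rank $r = 1$.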

The theory of $U$-statistics, founded by W. Hoeffding in the seminal work [a1], published in 1948, was developed under the impact of the theory of sums of independent random variables. The law of large numbers, the central limit theorem, the law of the iterated logarithm, etc. were investigated in various works (see the references in [a3]). The asymptotic behaviour of $U$-statistics can be reduced to the analysis of sums of independent identically distributed random variables. For a non-degenerate kernel $\Phi$ with ${\mathsf E} \Phi = 0$ and ${\mathsf E} | \Phi | ^ {4/3 } < \infty$ there is weak convergence (as $n \rightarrow \infty$; cf. also Convergence, types of):

$${ \frac{\sqrt n U _ {n} ^ {m} ( \Phi ) }{m \sigma } } \Rightarrow \tau,$$

where $\tau$ is a random variable with standard Gaussian distribution with ${\mathsf E} \tau = 0$ and ${\mathsf E} \tau ^ {2} = 1$. Here, $\sigma ^ {2} = {\mathsf E} g _ {1} ^ {2}$.
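The normal limit can be illustrated by simulation. The following is a rough Monte Carlo sketch rather than a proof, with several assumptions of its own: standard normal data, the sample-variance kernel of degree $m = 2$ centred by $\theta = 1$ so that the kernel has mean zero, and $\sigma ^ {2} = {\mathsf E} g _ {1} ^ {2} = 1/2$ for this kernel. The seed, sample size and number of replications are arbitrary.

```python
import random
from math import sqrt
from statistics import mean, variance


def normalized_u(sample):
    """sqrt(n) * (U_n - theta) / (m * sigma) for the sample-variance kernel."""
    n = len(sample)
    m = 2
    theta = 1.0        # E Phi for standard normal data
    sigma = sqrt(0.5)  # sigma^2 = E g_1^2 = Var(X^2) / 4 = 1/2
    # statistics.variance is the unbiased sample variance, i.e. U_n^2(Phi)
    return sqrt(n) * (variance(sample) - theta) / (m * sigma)


rng = random.Random(12345)  # fixed seed so the run is reproducible
n, replications = 100, 1000
draws = [
    normalized_u([rng.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(replications)
]

# For a standard Gaussian limit these should be close to 0 and 1.
print(mean(draws), variance(draws))
```

The empirical mean and variance of the normalized statistic should be near $0$ and $1$ respectively, in line with the limit $\tau$; a histogram of `draws` would give a fuller picture.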

For $r \geq 2$ the limit distribution of $U$-statistics depends essentially on the kernel. For a primitive completely degenerate kernel

$$\Phi _ {m} ( x _ {1} \dots x _ {m} ) = \prod _ {c = 1 } ^ { m } \varphi ( x _ {c} )$$

with ${\mathsf E} \varphi ( X _ {1} ) = 0$ and ${\mathsf E} \varphi ^ {2} ( X _ {1} ) = \sigma ^ {2} < \infty$, there is weak convergence (as $n \rightarrow \infty$):

$${ \frac{n ^ {m/2 } U _ {n} ^ {m} ( \Phi _ {m} ) }{\sigma ^ {m} } } \Rightarrow H _ {m} ( \tau ) ,$$

where $H _ {m} ( x )$ is the Hermite polynomial of degree $m$ [a7] (cf. also Hermite polynomials).
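As a worked illustration (an added special case, not taken from the references), take $m = 2$ and $\varphi ( x ) = x$ with ${\mathsf E} X _ {1} = 0$ and ${\mathsf E} X _ {1} ^ {2} = \sigma ^ {2}$. Then

$$n U _ {n} ^ {2} ( \Phi _ {2} ) = \frac{1}{n - 1 } \left [ \left ( \sum _ {i = 1 } ^ { n } X _ {i} \right ) ^ {2} - \sum _ {i = 1 } ^ { n } X _ {i} ^ {2} \right ] = \frac{n}{n - 1 } \left [ \left ( \frac{1}{\sqrt n } \sum _ {i = 1 } ^ { n } X _ {i} \right ) ^ {2} - \frac{1}{n} \sum _ {i = 1 } ^ { n } X _ {i} ^ {2} \right ] .$$

By the central limit theorem, $n ^ {- 1/2 } \sum _ {i} X _ {i} / \sigma \Rightarrow \tau$, and by the law of large numbers, $n ^ {- 1 } \sum _ {i} X _ {i} ^ {2} \rightarrow \sigma ^ {2}$. Hence

$$\frac{n U _ {n} ^ {2} ( \Phi _ {2} ) }{\sigma ^ {2} } \Rightarrow \tau ^ {2} - 1 = H _ {2} ( \tau ) ,$$

where $H _ {2} ( x ) = x ^ {2} - 1$ is the Hermite polynomial of degree $2$ in the normalization orthogonal with respect to the standard Gaussian distribution.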

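$U$-statistics with a completely degenerate kernel $g$, that is, ${\mathsf E} g ( x _ {1} \dots x _ {m - 1 } , X _ {m} ) = 0$, and with ${\mathsf E} g ^ {2} < \infty$, converge weakly (as $n \rightarrow \infty$) to the multiple Itô–Wiener stochastic integral with respect to a Gaussian random measure $W$ on $( X, {\mathcal X} )$ [a3], [a5]: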

$$n ^ {m/2 } U _ {n} ^ {m} ( g ) \Rightarrow \int\limits _ {X ^ {m} } {g ( x _ {1} \dots x _ {m} ) \prod _ {c = 1 } ^ { m } } {W ( dx _ {c} ) } .$$

$U$-statistics can also be represented by a stochastic integral with respect to the permanent random measure, as follows [a3]:

$$U _ {n} ^ {m} ( g ) = n ^ {- [ m ] } \int\limits _ {X ^ {m} } g ( x _ {1} \dots x _ {m} ) \Delta _ {n} ^ {m} ( dx _ {1} \dots dx _ {m} ) ,$$

where

$$\Delta _ {n} ^ {m} ( dx _ {1} \dots dx _ {m} ) = \sum _ {1 \leq j _ {1} \neq \dots \neq j _ {m} \leq n } \prod _ {c = 1 } ^ { m } [ \delta _ {X _ {j _ {c} } } ( dx _ {c} ) - {\mathsf P} ( dx _ {c} ) ] .$$

The asymptotic analysis of $U$-statistics is based on their martingale structure and involves functional limit theorems, rates of convergence, almost sure convergence, asymptotic expansions, and probabilities of large deviations.

The contemporary development of the theory of $U$-statistics contains various generalizations: $U$-statistics with kernel taking values in a Hilbert or Banach space [a8], multi-sample $U$-statistics, bootstrap and truncated $U$-statistics, weighted $U$-statistics, etc. $U$-statistics with kernel depending on $n$ are used in non-parametric density and regression estimation [a2], [a3], [a4], [a5], [a6].

#### References

[a1] W. Hoeffding, "A class of statistics with asymptotically normal distribution", Ann. Math. Stat., 19 (1948) pp. 293–325

[a2] A.J. Lee, "U-statistics. Theory and practice", Statistics textbooks and monographs, 110, M. Dekker (1990)

[a3] V.S. Korolyuk, Yu.V. Borovskikh, "Theory of U-statistics", Kluwer Acad. Publ. (1994) (In Russian)

[a4] R.J. Serfling, "Approximation: theorems of mathematical statistics", Wiley (1980)

[a5] E.B. Dynkin, A. Mandelbaum, "Symmetric statistics, Poisson point process and multiple Wiener integrals", Ann. Stat., 11 (1983) pp. 739–745

[a6] M. Denker, "Asymptotic theory in nonparametric statistics", Vieweg (1985)

[a7] V.S. Korolyuk, Yu.V. Borovskikh, "Random permanents", VSP (1994) (In Russian)

[a8] Yu.V. Borovskikh, "U-statistics in Banach space", VSP (1995) (In Russian)

How to Cite This Entry:
U-statistic. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=U-statistic&oldid=49059
This article was adapted from an original article by V.S. Korolyuk (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.