Difference between revisions of "U-statistic"

From Encyclopedia of Mathematics
Latest revision as of 08:27, 6 June 2020


A sum

$$ U_n^m(\Phi) = \binom{n}{m}^{-1} \sum_{1 \leq i_1 < \dots < i_m \leq n} \Phi(X_{i_1}, \dots, X_{i_m}). $$

Hoeffding's form for $U$-statistics is [a1]:

$$ U_n^m(\Phi) := n^{-[m]} \sum_{1 \leq j_1 \neq \dots \neq j_m \leq n} \Phi(X_{j_1}, \dots, X_{j_m}). $$

The kernel of a $U$-statistic, $\Phi : X^m \rightarrow \mathbf R$, is a symmetric real-valued function of $m$ variables. The random variables $X_1, \dots, X_n$ (cf. also Random variable) are independent and identically distributed with common distribution ${\mathsf P}(A)$, $A \in {\mathcal X}$, on a measurable space $(X, {\mathcal X})$. The number $m \leq n$ is called the degree of the $U$-statistic. The number of terms in the sum is equal to

$$ \binom{n}{m} = \frac{n!}{m! (n - m)!} $$

in the first sum and to

$$ n^{[m]} = \frac{n!}{(n - m)!} = n (n - 1) \cdots (n - m + 1) $$

in the second sum. Also, $n^{-[m]} = 1 / n^{[m]}$.
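The two forms above can be compared numerically. The following is a minimal sketch, not a reference implementation: the function names, the product kernel and the sample values are all illustrative assumptions. It averages a symmetric kernel over all $m$-subsets (first form) and over all ordered $m$-tuples of distinct indices (Hoeffding's form).

```python
from itertools import combinations, permutations
from math import comb, perm


def u_statistic(kernel, sample, m):
    """First form: average of the kernel over all index sets i_1 < ... < i_m."""
    n = len(sample)
    total = sum(kernel(*subset) for subset in combinations(sample, m))
    return total / comb(n, m)  # comb(n, m) = n! / (m! (n - m)!)


def u_statistic_hoeffding(kernel, sample, m):
    """Hoeffding's form: average over all ordered m-tuples of distinct indices."""
    n = len(sample)
    total = sum(kernel(*tup) for tup in permutations(sample, m))
    return total / perm(n, m)  # perm(n, m) = n! / (n - m)! = n^{[m]}


def product_kernel(x1, x2):
    """Illustrative symmetric kernel of degree 2 (an assumption for this sketch)."""
    return x1 * x2


sample = [2.0, 3.5, 1.0, 4.0, 2.5]  # made-up data
u_subsets = u_statistic(product_kernel, sample, 2)
u_tuples = u_statistic_hoeffding(product_kernel, sample, 2)
```

Because the kernel is symmetric, each $m$-subset appears $m!$ times among the ordered tuples, so the two averages coincide. Both loops cost on the order of $n^m$ kernel evaluations, so this direct approach only suits small $n$ and $m$.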

Various statistics can be represented as $U$-statistics or can be approximated by $U$-statistics with a suitable choice of the kernel $\Phi$. For example, the sample variance

$$ S_n = \frac{1}{n - 1} \sum_{i = 1}^{n} (X_i - \overline{x})^2 = U_n^2(\Phi) $$

can be obtained using the kernel $\Phi(x_1, x_2) = (x_1 - x_2)^2 / 2$. Here,

$$ \overline{x} = \frac{1}{n} \sum_{i = 1}^{n} X_i $$

is the mean value of the sample. The von Mises functional, given by

$$ V_n^m(\Phi) = n^{-m} \sum_{i_1, \dots, i_m = 1}^{n} \Phi(X_{i_1}, \dots, X_{i_m}) = \int\limits_{X^m} \Phi(x_1, \dots, x_m) \prod_{i = 1}^{m} \Pi_n(dx_i), $$

where

$$ \Pi_n(dx) = \frac{1}{n} \sum_{i = 1}^{n} \delta_{X_i}(dx) $$

is the empirical distribution, can be represented by a linear combination of $U$-statistics [a2]. For the primitive kernel $\Phi_m(x_1, \dots, x_m) = \prod_{i = 1}^{m} \varphi(x_i)$, the $U$-statistic

$$ U_n^m(\Phi_m) = n^{-[m]} \sum_{1 \leq j_1 \neq \dots \neq j_m \leq n} \prod_{c = 1}^{m} \varphi(X_{j_c}) $$

is a symmetric polynomial statistic of the random variables $y_k = \varphi(X_k)$, $1 \leq k \leq n$.
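As a small numerical illustration of the sample variance example and of the von Mises functional, the sketch below evaluates both for the kernel $(x_1 - x_2)^2 / 2$. The helper names and the data are assumptions made for this example only. Since this kernel vanishes on the diagonal, the von Mises functional reduces to a single multiple of the $U$-statistic, $V_n^2(\Phi) = \frac{n - 1}{n} U_n^2(\Phi)$, which is the simplest instance of a linear combination of $U$-statistics.

```python
from itertools import combinations, product
from math import comb
from statistics import variance


def kernel(x1, x2):
    """Kernel (x1 - x2)^2 / 2 from the sample variance example."""
    return (x1 - x2) ** 2 / 2


def u_stat(sample):
    """U_n^2: average of the kernel over all pairs i < j."""
    n = len(sample)
    return sum(kernel(a, b) for a, b in combinations(sample, 2)) / comb(n, 2)


def v_stat(sample):
    """V_n^2: average over all n^2 index pairs, diagonal terms included."""
    n = len(sample)
    return sum(kernel(a, b) for a, b in product(sample, repeat=2)) / n ** 2


sample = [2.0, 3.5, 1.0, 4.0, 2.5]  # made-up data
n = len(sample)
u_value = u_stat(sample)
v_value = v_stat(sample)
s_n = variance(sample)  # statistics.variance divides by n - 1
```

Here `u_value` agrees with `s_n` up to rounding, and `v_value` equals `(n - 1) / n * u_value`, i.e. the variance estimator with divisor $n$.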

The starting point of the analysis of $U$-statistics is the Hoeffding decomposition [a1]:

$$ U_n^m(\Phi) = {\mathsf E}\Phi + \sum_{c = r}^{m} \binom{m}{c} U_n^c(g_c), $$

where $g_c = g_c(x_1, \dots, x_c)$, $r \leq c \leq m$, are completely degenerate kernels: ${\mathsf E} g_c(X_1, \dots, X_c) = 0$. The integer $r \geq 1$ is called the rank of the $U$-statistic. Here, by definition, ${\mathsf E}\Phi$ is the mean value of the kernel and, also, ${\mathsf E} U_n^m(\Phi) = {\mathsf E}\Phi$. Therefore, a $U$-statistic is an unbiased estimator of the functional $\theta = {\mathsf E}\Phi$.
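As a worked illustration, the decomposition can be written out for the variance kernel of degree $m = 2$. This is a sketch under stated assumptions: the symbol $\mu = {\mathsf E} X_1$ is introduced here, ${\mathsf E} X_1^2 < \infty$ is assumed, and $g_1$, $g_2$ are obtained by the usual projection construction, which the text above does not spell out.

```latex
% Assumed notation (not in the source): \mu = \mathsf{E} X_1, \theta = \mathsf{E}\Phi.
\begin{aligned}
\Phi(x_1, x_2) &= \tfrac{1}{2}(x_1 - x_2)^2, \qquad \theta = \mathsf{E}\Phi = \operatorname{Var} X_1, \\
g_1(x) &= \mathsf{E}\Phi(x, X_2) - \theta = \tfrac{1}{2}\bigl[(x - \mu)^2 - \theta\bigr], \\
g_2(x_1, x_2) &= \Phi(x_1, x_2) - \theta - g_1(x_1) - g_1(x_2) = -(x_1 - \mu)(x_2 - \mu), \\
U_n^2(\Phi) &= \theta + 2\, U_n^1(g_1) + U_n^2(g_2)
  = \theta + \frac{2}{n} \sum_{i = 1}^{n} g_1(X_i) + U_n^2(g_2).
\end{aligned}
```

In this example ${\mathsf E} g_1(X_1) = 0$ and ${\mathsf E} g_2(x_1, X_2) = 0$ for every fixed $x_1$, and the rank is $r = 1$ whenever ${\mathsf E} g_1^2 > 0$.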

The theory of $U$-statistics, founded by W. Hoeffding in the seminal work [a1], published in 1948, was developed under the influence of the theory of sums of independent random variables. The law of large numbers, the central limit theorem, the law of the iterated logarithm, etc. were investigated in various works (see the references in [a3]). The asymptotic behaviour of $U$-statistics can be reduced to the analysis of sums of independent identically distributed random variables. For a non-degenerate kernel $\Phi$ with ${\mathsf E}\Phi = 0$ and ${\mathsf E}|\Phi|^{4/3} < \infty$ there is weak convergence (as $n \rightarrow \infty$; cf. also Convergence, types of):

$$ \frac{\sqrt{n}\, U_n^m(\Phi)}{m \sigma} \Rightarrow \tau, $$

where $\tau$ is a random variable with the standard Gaussian distribution, that is, ${\mathsf E}\tau = 0$ and ${\mathsf E}\tau^2 = 1$. Here, $\sigma^2 = {\mathsf E} g_1^2$.
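The normalization in this limit theorem can be checked by simulation. The sketch below is illustrative only and rests on assumptions not in the text: the data are standard Gaussian, the kernel is the centred variance kernel $\Phi(x_1, x_2) = (x_1 - x_2)^2 / 2 - 1$ (so ${\mathsf E}\Phi = 0$), for which $g_1(x) = (x^2 - 1)/2$ and $\sigma^2 = {\mathsf E} g_1^2 = 1/2$. The sample size, number of replications and seed are arbitrary choices.

```python
import random
from statistics import mean, variance


def sample_variance(xs):
    """Unbiased sample variance, equal to U_n^2 for the kernel (x1 - x2)^2 / 2."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)


def normalized_statistic(n, rng):
    """sqrt(n) * U_n^2(Phi) / (m * sigma) for the centred variance kernel."""
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    u_centered = sample_variance(sample) - 1.0  # subtract E Phi = Var X = 1
    m, sigma = 2, 0.5 ** 0.5                    # degree 2, sigma^2 = E g_1^2 = 1/2
    return n ** 0.5 * u_centered / (m * sigma)


rng = random.Random(0)  # fixed seed so the run is reproducible
values = [normalized_statistic(200, rng) for _ in range(2000)]
sim_mean = mean(values)     # should be close to E tau = 0
sim_var = variance(values)  # should be close to E tau^2 = 1
```

With these settings the simulated mean and variance should be near 0 and 1 respectively, up to Monte Carlo error; the check says nothing about the rate of convergence.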

For $r \geq 2$ the limit distribution of $U$-statistics depends essentially on the kernel. For a primitive completely degenerate kernel

$$ \Phi_m(x_1, \dots, x_m) = \prod_{c = 1}^{m} \varphi(x_c) $$

with ${\mathsf E}\varphi(X_1) = 0$ and ${\mathsf E}\varphi^2(X_1) = \sigma^2 < \infty$, there is weak convergence (as $n \rightarrow \infty$):

$$ \frac{n^{m/2} U_n^m(\Phi_m)}{\sigma^m} \Rightarrow H_m(\tau), $$

where $H_m(x)$ is the Hermite polynomial of degree $m$ [a7] (cf. also Hermite polynomials).
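For $m = 2$ the limit can be derived by hand, which may help to see where the Hermite polynomial comes from. The following is a sketch under two assumptions: the notation $Y_i = \varphi(X_i)$ is introduced here, and $H_2(x) = x^2 - 1$, i.e. the Hermite polynomials are normalized to have leading coefficient 1.

```latex
% Assumed notation (not in the source): Y_i = \varphi(X_i), H_2(x) = x^2 - 1.
\begin{aligned}
\frac{n\, U_n^2(\Phi_2)}{\sigma^2}
  &= \frac{n}{n (n - 1) \sigma^2} \sum_{1 \leq j_1 \neq j_2 \leq n} Y_{j_1} Y_{j_2} \\
  &= \frac{n}{n - 1} \left[ \Bigl( \frac{1}{\sigma \sqrt{n}} \sum_{i = 1}^{n} Y_i \Bigr)^2
     - \frac{1}{n \sigma^2} \sum_{i = 1}^{n} Y_i^2 \right] \\
  &\Rightarrow \tau^2 - 1 = H_2(\tau).
\end{aligned}
```

The first term in brackets converges weakly to $\tau^2$ by the central limit theorem, and the second converges to 1 by the law of large numbers.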

$U$-statistics with completely degenerate kernel $g$, ${\mathsf E} g(X_1, \dots, X_m) = 0$ and ${\mathsf E} g^2 < \infty$, converge weakly (as $n \rightarrow \infty$) to the Itô–Wiener stochastic integral [a3], [a5]:

$$ n^{m/2} U_n^m(g) \Rightarrow \int\limits_{X^m} g(x_1, \dots, x_m) \prod_{c = 1}^{m} W(dx_c). $$

$U$-statistics can also be represented by the stochastic integral with respect to the permanent random measure, as follows [a3]:

$$ U_n^m(g) = n^{-[m]} \int\limits_{X^m} g(x_1, \dots, x_m) \, \Delta_n^m(dx_1 \dots dx_m), $$

where

$$ \Delta_n^m(dx_1 \dots dx_m) = \sum_{1 \leq j_1 \neq \dots \neq j_m \leq n} \prod_{c = 1}^{m} [\delta_{X_{j_c}}(dx_c) - {\mathsf P}(dx_c)]. $$

The asymptotic analysis of $U$-statistics is based on their martingale structure and involves functional limit theorems, rates of convergence, almost sure convergence, asymptotic expansions, and probabilities of large deviations.

The contemporary development of the theory of $U$-statistics contains various generalizations: $U$-statistics with kernel taking values in a Hilbert or Banach space [a8], multi-sample $U$-statistics, bootstrap and truncated $U$-statistics, weighted $U$-statistics, etc. $U$-statistics with kernel depending on $n$ are used in non-parametric density and regression estimation [a2], [a3], [a4], [a5], [a6].

References

[a1] W. Hoeffding, "A class of statistics with asymptotically normal distribution", Ann. Math. Stat., 19 (1948) pp. 293–325
[a2] A.J. Lee, "U-statistics. Theory and practice", Statistics textbooks and monographs, 110, M. Dekker (1990)
[a3] V.S. Korolyuk, Yu.V. Borovskikh, "Theory of U-statistics", Kluwer Acad. Publ. (1994) (In Russian)
[a4] R.J. Serfling, "Approximation: theorems of mathematical statistics", Wiley (1980)
[a5] E.B. Dynkin, A. Mandelbaum, "Symmetric statistics, Poisson point process and multiple Wiener integrals", Ann. Stat., 11 (1983) pp. 739–745
[a6] M. Denker, "Asymptotic theory in nonparametric statistics", Vieweg (1985)
[a7] V.S. Korolyuk, Yu.V. Borovskikh, "Random permanents", VSP (1994) (In Russian)
[a8] Yu.V. Borovskikh, "U-statistics in Banach space", VSP (1995) (In Russian)
How to Cite This Entry:
U-statistic. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=U-statistic&oldid=13765
This article was adapted from an original article by V.S. Korolyuk (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.