Hoeffding decomposition


Let $X_1, \dots, X_N$ be independent identically distributed random functions with values in a measurable space $(E, \mathcal{E})$ (cf. Random variable). For $m < N$, let

$$ h : E^m \rightarrow \mathbf{R} $$

be a measurable symmetric function in $m$ variables and consider the $U$-statistics (cf. $U$-statistic)

$$ U_N(h) = \frac{1}{\binom{N}{m}} \sum_{1 \le i_1 < \dots < i_m \le N} h(X_{i_1}, \dots, X_{i_m}) . $$
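
For example (with $E = \mathbf{R}$), taking $m = 1$ and $h(x) = x$ gives the sample mean, while taking $m = 2$ and $h(x, y) = (x - y)^2/2$ gives the unbiased sample variance:

$$ U_N(h) = \frac{1}{\binom{N}{2}} \sum_{1 \le i < j \le N} \frac{(X_i - X_j)^2}{2} = \frac{1}{N - 1} \sum_{i = 1}^{N} ( X_i - \overline{X}_N )^2 . $$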

The following theorem is called Hoeffding's decomposition theorem, and the representation of the $U$-statistic as in the theorem is called the Hoeffding decomposition of $U_N(h)$ (see [a1]):

$$ U_N(h) = \sum_{c = 0}^{m} \binom{m}{c} U_N(h_c) , $$

where $h_c : E^c \rightarrow \mathbf{R}$ is a symmetric function in $c$ arguments and where the $U$-statistics $U_N(h_c)$ are degenerate, pairwise orthogonal in $L_2$ (uncorrelated) and satisfy

$$ \mathsf{E} ( U_N(h_c) )^2 = \frac{1}{\binom{N}{c}} \, \mathsf{E} ( h_c(X_1, \dots, X_c) )^2 . $$
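
Here degeneracy is meant in the complete sense: each kernel $h_c$ ($1 \le c \le m$) has conditional mean zero in any single argument,

$$ \mathsf{E} \, h_c(x_1, \dots, x_{c - 1}, X_c) = 0 \quad \textrm{for almost all } x_1, \dots, x_{c - 1} \in E , $$

and this property is what makes the summands of the different $U_N(h_c)$ pairwise uncorrelated.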

The symmetric functions $h_c$ are defined as follows:

$$ h_c(x_1, \dots, x_c) = \sum_{k = 0}^{c} ( - 1 )^{c - k} \sum_{1 \le l_1 < \dots < l_k \le c} \mathsf{E} \, h ( x_{l_1}, \dots, x_{l_k}, X_1, \dots, X_{m - k} ) . $$
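
For instance, when $m = 2$ and $\theta = \mathsf{E} \, h(X_1, X_2)$, these formulas give

$$ h_0 = \theta , \qquad h_1(x) = \mathsf{E} \, h(x, X_1) - \theta , \qquad h_2(x, y) = h(x, y) - \mathsf{E} \, h(x, X_1) - \mathsf{E} \, h(y, X_1) + \theta , $$

so that the decomposition reads $U_N(h) = \theta + 2 U_N(h_1) + U_N(h_2)$.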

Extensions of this decomposition are known for the multi-sample case [a4], under various "incomplete" summation procedures in the definition of a $U$-statistic, in some dependent situations and for non-identical distributions [a3]. There are also versions of the theorem for symmetric functions with values in a Banach space.

The decomposition theorem permits one to calculate the variance of $U$-statistics easily. Since $U_N(h_0) = \mathsf{E} \, h(X_1, \dots, X_m)$ is a constant and since $U_N(h_1)$ is an average of centred independent identically distributed random variables, the central limit theorem for non-degenerate $U$-statistics is an immediate consequence of the Hoeffding decomposition (cf. also Central limit theorem).
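
Concretely, assuming $\mathsf{E} \, h^2(X_1, \dots, X_m) < \infty$, orthogonality and the second-moment identity above yield

$$ \mathsf{Var} \, U_N(h) = \sum_{c = 1}^{m} \binom{m}{c}^2 \frac{1}{\binom{N}{c}} \, \mathsf{E} \, h_c^2 ( X_1, \dots, X_c ) = \frac{m^2}{N} \, \mathsf{E} \, h_1^2(X_1) + O ( N^{-2} ) , $$

and in the non-degenerate case ($\mathsf{E} \, h_1^2(X_1) > 0$) the quantity $\sqrt{N} \, ( U_N(h) - \mathsf{E} \, h(X_1, \dots, X_m) )$ is asymptotically normal with variance $m^2 \, \mathsf{E} \, h_1^2(X_1)$ [a2].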

The terminology goes back to [a2].

References

[a1] M. Denker, "Asymptotic distribution theory in nonparametric statistics", Advanced Lectures in Mathematics, F. Vieweg (1985)
[a2] W. Hoeffding, "A class of statistics with asymptotically normal distribution", Ann. Math. Stat., 19 (1948) pp. 293–325
[a3] A.J. Lee, "U-statistics. Theory and practice", Statistics: Textbooks and Monographs, 110, M. Dekker (1990)
[a4] E.L. Lehmann, "Consistency and unbiasedness of certain nonparametric tests", Ann. Math. Stat., 22 (1951) pp. 165–179
This article was adapted from an original article by M. Denker (originator), which appeared in the Encyclopedia of Mathematics, ISBN 1402006098.