The concept of contiguity was formally introduced and developed by L. Le Cam in [[#References|[a7]]]. It refers to sequences of probability measures, and is meant to be a measure of "closeness" or "nearness" of such sequences (cf. also [[Probability measure|Probability measure]]). It may also be viewed as a kind of uniform asymptotic mutual [[Absolute continuity|absolute continuity]] of probability measures. Actually, the need for the introduction of such a concept arose as early as 1955 or 1956, and it was at that time that Le Cam selected the name of "contiguity", with the help of J.D. Esary (see [[#References|[a9]]], p. 29).
 
There are several equivalent characterizations of contiguity, and the following may serve as its definition. Two sequences $\{ P _ { n } \}$ and $\{ P _ { n } ^ { \prime } \}$ are said to be contiguous if for any $A _ { n } \in \mathcal{A} _ { n }$ for which $P _ { n } ( A _ { n } ) \rightarrow 0$, it also happens that $P _ { n } ^ { \prime } ( A _ { n } ) \rightarrow 0$, and vice versa, where $( \mathcal{X} _ { n } , \mathcal{A} _ { n } )$ is a sequence of measurable spaces and $P _ { n }$ and $P _ { n } ^ { \prime }$ are probability measures on $\mathcal{A} _ { n }$. Here and in the sequel, all limits are taken as $n \rightarrow \infty$. It is worth mentioning at this point that contiguity is transitive: if $\{ P _ { n } \}$, $\{ P _ { n } ^ { \prime } \}$ are contiguous and $\{ P _ { n } ^ { \prime } \}$, $\{ P _ { n } ^ { \prime \prime } \}$ are contiguous, then so are $\{ P _ { n } \}$, $\{ P _ { n } ^ { \prime \prime } \}$. Contiguity simplifies many arguments in passing to the limit, and it plays a major role in the asymptotic theory of statistical inference (cf. also [[Statistical hypotheses, verification of|Statistical hypotheses, verification of]]). Thus, contiguity is used in parts of [[#References|[a8]]] as a tool for obtaining asymptotic results in an elegant manner; [[#References|[a9]]] is a more accessible general reference on contiguity and its usages. In a Markovian framework, contiguity, some related results and selected statistical applications are discussed in [[#References|[a11]]]. For illustrative purposes, [[#References|[a11]]] is used below as a standard reference.
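
For a standard illustration (a sketch; the particular normal-shift model is chosen only for concreteness), let $P _ { n }$ and $P _ { n } ^ { \prime }$ be the joint distributions of $n$ independent random variables $X _ { 1 } , \dots , X _ { n }$ which are $N ( 0,1 )$ under $P _ { n }$ and $N ( h / \sqrt { n } , 1 )$ under $P _ { n } ^ { \prime }$, for a fixed $h \in \mathbf{R}$. Then

\begin{equation*} \operatorname { log } \frac { d P _ { n } ^ { \prime } } { d P _ { n } } = \frac { h } { \sqrt { n } } \sum _ { j = 1 } ^ { n } X _ { j } - \frac { h ^ { 2 } } { 2 }, \end{equation*}

which is distributed as $N ( - h ^ { 2 } / 2 , h ^ { 2 } )$ under $P _ { n }$ for every $n$; by the criterion in terms of log-likelihood ratios discussed below, $\{ P _ { n } \}$ and $\{ P _ { n } ^ { \prime } \}$ are contiguous. This example is referred to repeatedly below.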
  
The definition of contiguity calls for its comparison with more familiar modes of "closeness", such as that based on the $\operatorname { sup }$ (or $L _ { 1 }$) norm, defined by
  
\begin{equation*} \| P _ { n } - P _ { n } ^ { \prime } \| = 2 \operatorname { sup } \{ | P _ { n } ( A ) - P _ { n } ^ { \prime } ( A ) | : A \in \mathcal{A} _ { n } \}, \end{equation*}
  
and also the concept of mutual absolute continuity (cf. also [[Absolute continuity|Absolute continuity]]), $P _ { n } \approx P _ { n } ^ { \prime }$. It is always true that convergence in the $L _ { 1 }$-norm implies contiguity, but the converse is not true (see, e.g., [[#References|[a11]]], p. 12; the special case of Example 3.1(i)). So, contiguity is a weaker measure of "closeness" of two sequences of probability measures than that provided by sup-norm convergence. Also, by means of examples, it may be illustrated that it can happen that $P _ { n } \approx P _ { n } ^ { \prime }$ for all $n$ (i.e., $P _ { n } ( A ) = 0$ if and only if $P _ { n } ^ { \prime } ( A ) = 0$ for all $n$ and all $A \in \mathcal{A} _ { n }$) whereas $\{ P _ { n } \}$ and $\{ P _ { n } ^ { \prime } \}$ are not contiguous (see, e.g., [[#References|[a11]]], pp. 9–10; Example 2.2). That contiguity need not imply absolute continuity for any $n$ is again demonstrated by examples (see, e.g., [[#References|[a11]]], p. 9; Example 2.1 and Remark 2.3). This should not come as a surprise, since contiguity is interpreted as asymptotic absolute continuity rather than absolute continuity for any finite $n$. It is to be noted, however, that a pair of contiguous sequences of probability measures can always be replaced by another pair of contiguous sequences whose respective members are mutually absolutely continuous and lie arbitrarily close to the given ones in the sup-norm sense (see, e.g., [[#References|[a11]]], pp. 25–26; Thm. 5.1).
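
The normal-shift example above illustrates the failure of the converse (a sketch; the constant below follows, for $h \neq 0$, from the sufficiency of the sample mean, which reduces the computation to that for a single $N ( 0,1 )$ versus $N ( h , 1 )$ observation): $\{ P _ { n } \}$ and $\{ P _ { n } ^ { \prime } \}$ are contiguous, while

\begin{equation*} \| P _ { n } - P _ { n } ^ { \prime } \| = 2 ( 2 \Phi ( | h | / 2 ) - 1 ) > 0 \end{equation*}

for all $n$, $\Phi$ being the standard normal distribution function; thus the sup-norm distance does not tend to $0$.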
  
The concept exactly opposite to contiguity is that of (asymptotic) entire separation. Thus, two sequences $\{ P _ { n } \}$ and $\{ P _ { n } ^ { \prime } \}$ are said to be (asymptotically) entirely separated if there exist a subsequence $\{ m \} \subseteq \{ n \}$ and sets $A _ { m } \in \mathcal{A} _ { m }$ such that $P _ { m } ( A _ { m } ) \rightarrow 0$ whereas $P _ { m } ^ { \prime } ( A _ { m } ) \rightarrow 1$ as $m \rightarrow \infty$ (see [[#References|[a2]]], p. 24).
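
In the normal-shift example, entire separation occurs once the shift is of larger order than $1 / \sqrt { n }$ (a sketch): if the means under $P _ { n } ^ { \prime }$ are $\mu _ { n } > 0$ with $\sqrt { n } \mu _ { n } \rightarrow \infty$, then for $A _ { n } = \{ \bar { X } _ { n } > \mu _ { n } / 2 \}$ one has $P _ { n } ( A _ { n } ) = \Phi ( - \sqrt { n } \mu _ { n } / 2 ) \rightarrow 0$, while $P _ { n } ^ { \prime } ( A _ { n } ) = \Phi ( \sqrt { n } \mu _ { n } / 2 ) \rightarrow 1$.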
  
Alternative characterizations of contiguity are provided in [[#References|[a11]]], Def. 2.1; Prop. 3.1; Prop. 6.1. In terms of sequences of random variables $\{ T _ { n } \}$, two sequences $\{ P _ { n } \}$ and $\{ P _ { n } ^ { \prime } \}$ are contiguous if $T _ { n } \rightarrow 0$ in $P _ { n }$-probability implies $T _ { n } \rightarrow 0$ in $P _ { n } ^ { \prime }$-probability, and vice versa (cf. also [[Random variable|Random variable]]). Thus, under contiguity, convergence in probability of sequences of random variables under $P _ { n }$ and $P _ { n } ^ { \prime }$ are equivalent and the limits are the same. Actually, contiguity of $\{ P _ { n } \}$ and $\{ P _ { n } ^ { \prime } \}$ is determined by the behaviour of the sequences of probability measures $\{ \mathcal{L} _ { n } \}$ and $\{ \mathcal{L} _ { n } ^ { \prime } \}$, where $\mathcal{L} _ { n } = \mathcal{L} ( \Lambda _ { n } | P _ { n } )$, $\mathcal{L} _ { n } ^ { \prime } = \mathcal{L} ( \Lambda _ { n } | P _ { n } ^ { \prime } )$ and $\Lambda _ { n } = \operatorname { log } ( d P _ { n } ^ { \prime } / d P _ { n } )$. As explained above, there is no loss of generality in supposing that $P _ { n }$ and $P _ { n } ^ { \prime }$ are mutually absolutely continuous for all $n$, and thus the log-likelihood function $\Lambda _ { n }$ is well-defined with $P _ { n }$-probability $1$ for all $n$. Then, e.g., $\{ P _ { n } \}$ and $\{ P _ { n } ^ { \prime } \}$ are contiguous if and only if $\{ \mathcal{L} _ { n } \}$ and $\{ \mathcal{L} _ { n } ^ { \prime } \}$ are relatively compact, or $\{ \mathcal{L} _ { n } \}$ is relatively compact and for every subsequence $\{ \mathcal{L} _ { m } \}$ converging weakly to a probability measure $\mathcal{L}$, one has $\int \operatorname { exp } \lambda \, d \mathcal{L} = 1$, where $\lambda$ is a dummy variable. It should be noted at this point that, under contiguity, the asymptotic distributions, under $P _ { n }$ and $P _ { n } ^ { \prime }$, of the likelihood (or log-likelihood) ratios $d P _ { n } ^ { \prime } / d P _ { n }$ are non-degenerate and distinct. Therefore, the statistical problem of choosing between $P _ { n }$ and $P _ { n } ^ { \prime }$ is non-trivial for all sufficiently large $n$.
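
The condition $\int \operatorname { exp } \lambda \, d \mathcal{L} = 1$ can be checked numerically in the normal-shift example, where $\mathcal{L} _ { n } = N ( - h ^ { 2 } / 2 , h ^ { 2 } )$ for every $n$. The following Monte Carlo sketch (in Python; the values of $n$, $h$ and the replication count are arbitrary illustrative choices) simulates $\Lambda _ { n }$ under $P _ { n }$ and estimates the mean of $\operatorname { exp } \Lambda _ { n }$:

<pre>
import numpy as np

rng = np.random.default_rng(0)
n, h, reps = 100, 1.5, 50_000   # arbitrary illustrative values

# X_1, ..., X_n are i.i.d. N(0,1) under P_n; the log-likelihood ratio of
# N(h/sqrt(n), 1)^n against N(0,1)^n is (h/sqrt(n)) * sum(X) - h^2/2.
X = rng.standard_normal((reps, n))
Lam = (h / np.sqrt(n)) * X.sum(axis=1) - h**2 / 2

print(Lam.mean(), Lam.var())    # close to -h^2/2 = -1.125 and h^2 = 2.25
print(np.exp(Lam).mean())       # close to 1, as the criterion requires
</pre>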
  
An important consequence of contiguity is the following. With $\Lambda _ { n }$ as above, let $T _ { n }$ be a $k$-dimensional random vector such that $\mathcal{L} [ ( \Lambda _ { n } , T _ { n } ) | P _ { n } ] \Rightarrow \tilde{\mathcal{L}}$, a probability measure (where "$\Rightarrow$" stands for [[Weak convergence of probability measures|weak convergence of probability measures]]). Then $\mathcal{L} [ ( \Lambda _ { n } , T _ { n } ) | P _ { n } ^ { \prime } ] \Rightarrow \tilde{\mathcal{L}} ^ { \prime }$ and $\tilde{\mathcal{L}} ^ { \prime }$ is determined by $d \tilde{\mathcal{L}} ^ { \prime } / d \tilde{\mathcal{L}} = \operatorname { exp } \lambda$. In particular, one may determine the asymptotic distribution of $\Lambda _ { n }$ under (the alternative hypothesis) $P _ { n } ^ { \prime }$ in terms of the asymptotic distribution of $\Lambda _ { n }$ under (the null hypothesis) $P _ { n }$. Typically, $\mathcal{L} ( \Lambda _ { n } | P _ { n } ) \Rightarrow N ( - \sigma ^ { 2 } / 2 , \sigma ^ { 2 } )$ and then $\mathcal{L} ( \Lambda _ { n } | P _ { n } ^ { \prime } ) \Rightarrow N ( \sigma ^ { 2 } / 2 , \sigma ^ { 2 } )$ for some $\sigma > 0$. Also, if it so happens that $\mathcal{L} ( T _ { n } | P _ { n } ) \Rightarrow N ( 0 , \Gamma )$ and $\Lambda _ { n } - h ^ { \prime } T _ { n } \rightarrow - h ^ { \prime } \Gamma h / 2$ in $P _ { n }$-probability for every $h$ in $\mathbf{R} ^ { k }$ (where $\square '$ denotes transpose and $\Gamma$ is a $k \times k$ positive-definite covariance matrix), then, under contiguity again, $\mathcal{L} ( T _ { n } | P _ { n } ^ { \prime } ) \Rightarrow N ( \Gamma h , \Gamma )$.
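
In the normal-shift example this transfer of limit laws can be checked by simulation (a minimal sketch; the values of $n$, $h$ and the replication count are arbitrary illustrative choices). There $T _ { n } = \sqrt { n } \bar { X } _ { n }$, $\Gamma = 1$ and $\Lambda _ { n } - h T _ { n } = - h ^ { 2 } / 2$ exactly, so the result above predicts $\mathcal{L} ( T _ { n } | P _ { n } ^ { \prime } ) \Rightarrow N ( h , 1 )$:

<pre>
import numpy as np

rng = np.random.default_rng(1)
n, h, reps = 100, 1.5, 50_000   # arbitrary illustrative values

# Under P'_n the observations are i.i.d. N(h/sqrt(n), 1); with Gamma = 1 the
# contiguity result predicts T_n = sqrt(n) * mean(X) => N(Gamma*h, Gamma).
X = h / np.sqrt(n) + rng.standard_normal((reps, n))
T = np.sqrt(n) * X.mean(axis=1)

print(T.mean(), T.var())        # close to Gamma*h = 1.5 and Gamma = 1
</pre>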
  
In the context of parametric models in statistics, contiguity results lend themselves to expanding (in the probability sense) a certain log-likelihood function, to obtaining its asymptotic distribution, to approximating the given family of probability measures by exponential probability measures in the neighbourhood of a parameter point, and to obtaining a convolution representation of the limiting probability measure of the distributions of certain estimates. All these results may then be exploited in deriving asymptotically optimal tests for certain statistical hypothesis-testing problems (cf. [[Statistical hypotheses, verification of|Statistical hypotheses, verification of]]), and in studying the asymptotic efficiency (cf. also [[Efficiency, asymptotic|Efficiency, asymptotic]]) of estimates. In such a framework, random variables $X _ { 0 } , \dots , X _ { n }$ are defined on $( \mathcal{X} , \mathcal{A} )$, $P _ { \theta }$ is a probability measure defined on $\mathcal{A}$ and depending on the parameter $\theta \in \Theta$, an open subset of $\mathbf{R} ^ { k }$, $P _ { n , \theta }$ is the restriction of $P _ { \theta }$ to $\mathcal{A} _ { n } = \sigma ( X _ { 0 } , \dots , X _ { n } )$, and the probability measures of interest are usually $P _ { n , \theta }$ and $P _ { n , \theta _ { n } }$, $\theta _ { n } = \theta + h / \sqrt { n }$. Under certain regularity conditions, $\{ P _ { n , \theta } \}$ and $\{ P _ { n , \theta _ { n } } \}$ are contiguous. The log-likelihood function $\Lambda _ { n } ( \theta ) = \operatorname { log } ( d P _ { n , \theta _ { n } } / d P _ { n , \theta } )$ expands in $P _ { n , \theta }$- (and $P _ { n , \theta _ { n } }$-) probability; thus:
  
\begin{equation*} \Lambda _ { n } ( \theta ) - h ^ { \prime } \Delta _ { n } ( \theta ) \rightarrow - \frac { 1 } { 2 } h ^ { \prime } \Gamma ( \theta ) h, \end{equation*}
  
where $\Delta _ { n } ( \theta )$ is a $k$-dimensional random vector defined in terms of the derivative of an underlying probability density function, and $\Gamma ( \theta )$ is a covariance matrix. Furthermore,
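
For instance (a sketch under standard regularity conditions for i.i.d. models), when the $X _ { j }$ are i.i.d. with density $f ( \cdot ; \theta )$ one may take

\begin{equation*} \Delta _ { n } ( \theta ) = \frac { 1 } { \sqrt { n } } \sum _ { j = 1 } ^ { n } \frac { \partial } { \partial \theta } \operatorname { log } f ( X _ { j } ; \theta ), \end{equation*}

with $\Gamma ( \theta )$ the corresponding Fisher information matrix. In the $N ( \theta , 1 )$ case this gives $\Delta _ { n } ( \theta ) = n ^ { - 1 / 2 } \sum _ { j = 1 } ^ { n } ( X _ { j } - \theta )$ and $\Gamma ( \theta ) = 1$, and the expansion above then holds exactly, with zero remainder.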
  
\begin{equation*} \mathcal{L} [ \Delta _ { n } ( \theta ) | P _ { n , \theta } ] \Rightarrow N ( 0 , \Gamma ( \theta ) ), \end{equation*}
  
\begin{equation*} \mathcal{L} [ \Lambda _ { n } ( \theta ) | P _ { n , \theta } ] \Rightarrow N \left( - \frac { 1 } { 2 } h ^ { \prime } \Gamma ( \theta ) h , h ^ { \prime } \Gamma ( \theta ) h \right) , \mathcal{L} [ \Lambda _ { n } ( \theta ) | P _ { n , \theta _ { n } } ] \Rightarrow N \left( \frac { 1 } { 2 } h ^ { \prime } \Gamma ( \theta ) h , h ^ { \prime } \Gamma ( \theta ) h \right), \end{equation*}
  
\begin{equation*} \mathcal{L} [ \Delta _ { n } ( \theta ) | P _ { n , \theta _ { n } } ] \Rightarrow N ( \Gamma ( \theta ) h , \Gamma ( \theta ) ). \end{equation*}
  
In addition, $\| P _ { n , \theta _ { n } } - R _ { n , h } \| \rightarrow 0$ uniformly over bounded sets of $h$, where $R _ { n , h } ( A )$ is the normalized version of $\int _ { A } \operatorname { exp } ( h ^ { \prime } \Delta _ { n } ^ { * } ( \theta ) ) d P _ { n , \theta }$, $\Delta _ { n } ^ { * } ( \theta )$ being a suitably truncated version of $\Delta _ { n } ( \theta )$. Finally, for estimates $T _ { n }$ (of $\theta$) for which $\mathcal{L} [ \sqrt { n } ( T _ { n } - \theta _ { n } ) | P _ { n , \theta _ { n } } ] \Rightarrow \mathcal{L} ( \theta )$, a probability measure, one has the convolution representation $\mathcal{L} ( \theta ) = N ( 0 , \Gamma ^ { - 1 } ( \theta ) ) * \mathcal{L} _ { 2 } ( \theta )$, for a specified probability measure $\mathcal{L} _ { 2 } ( \theta )$; for asymptotically efficient estimates, $\mathcal{L} _ { 2 } ( \theta )$ is degenerate at the origin. This last result is due to J. Hájek [[#References|[a3]]] (see also [[#References|[a6]]]).
  
Contiguity of two sequences of probability measures $\{ P _ { n , \theta } \}$ and $\{ P _ { n , \theta _ { n } } \}$, as defined above, may be generalized as follows: Replace $n$ by $\alpha _ { n }$, where $\{ \alpha _ { n } \} \subseteq \{ n \}$ converges to $\infty$ non-decreasingly, and replace $\theta _ { n }$ by $\theta _ { \tau _ { n } } = \theta + h \tau _ { n } ^ { - 1 / 2 }$, where $0 < \tau _ { n }$ are real numbers tending to $\infty$ non-decreasingly. Then, under suitable regularity conditions, $\{ P _ { \alpha _ { n } , \theta } \}$ and $\{ P _ { \alpha _ { n } , \theta _ { \tau _ { n } } } \}$ are contiguous if and only if $\alpha _ { n } / \tau _ { n } = O ( 1 )$ (see [[#References|[a1]]], Thm. 2.1).
  
 
Some additional references to contiguity and its statistical applications are [[#References|[a4]]], [[#References|[a5]]], [[#References|[a2]]], [[#References|[a12]]], [[#References|[a10]]].
 
  
 
====References====
 
<table><tr><td valign="top">[a1]</td> <td valign="top">  M.G. Akritas,  M.L. Puri,  G.G. Roussas,  "Sample size, parameter rates and contiguity: the i.i.d. case"  ''Commun. Statist. Theor. Meth.'' , '''A8''' :  1  (1979)  pp. 71–83</td></tr><tr><td valign="top">[a2]</td> <td valign="top">  P.E. Greenwood,  A.N. Shiryayev,  "Contiguity and the statistical invariance principle" , Gordon &amp; Breach  (1985)</td></tr><tr><td valign="top">[a3]</td> <td valign="top">  J. Hájek,  "A characterization of limiting distributions of regular estimates"  ''Z. Wahrscheinlichkeitsth. verw. Gebiete'' , '''14'''  (1970)  pp. 323–330</td></tr><tr><td valign="top">[a4]</td> <td valign="top">  J. Hájek,  Z. Šidák,  "Theory of rank tests" , Acad. Press  (1967)</td></tr><tr><td valign="top">[a5]</td> <td valign="top">  I.A. Ibragimov,  R.Z. Has'minskii,  "Statistical estimation" , Springer  (1981)</td></tr><tr><td valign="top">[a6]</td> <td valign="top">  N. Inagaki,  "On the limiting distribution of a sequence of estimators with uniformity property"  ''Ann. Inst. Statist. Math.'' , '''22'''  (1970)  pp. 1–13</td></tr><tr><td valign="top">[a7]</td> <td valign="top">  L. Le Cam,  "Locally asymptotically normal families of distributions"  ''Univ. Calif. Publ. in Statist.'' , '''3'''  (1960)  pp. 37–98</td></tr><tr><td valign="top">[a8]</td> <td valign="top">  L. Le Cam,  "Asymptotic methods in statistical decision theory" , Springer  (1986)</td></tr><tr><td valign="top">[a9]</td> <td valign="top">  L. Le Cam,  G.L. Yang,  "Asymptotics in statistics: some basic concepts" , Springer  (1990)</td></tr><tr><td valign="top">[a10]</td> <td valign="top">  J. Pfanzagl,  "Parametric statistical inference" , W. de Gruyter  (1994)</td></tr><tr><td valign="top">[a11]</td> <td valign="top">  G.G. Roussas,  "Contiguity of probability measures: some applications in statistics" , Cambridge Univ. Press  (1972)</td></tr><tr><td valign="top">[a12]</td> <td valign="top">  H. Strasser,  "Mathematical theory of statistics" , W. de Gruyter  (1985)</td></tr></table>
