Weak convergence of probability measures
Latest revision as of 10:38, 23 November 2013
2020 Mathematics Subject Classification: Primary: 60B10 [MSN][ZBL] See also Convergence of measures.
The general setting for weak convergence of probability measures is that of a complete separable metric space $(X,\rho)$ (cf. also Complete space; Separable space), $\rho$ being the metric, with probability measures $\mu_i$, $i=0,1,\dots$ defined on the Borel sets of $X$.
Definition 1. It is said that $\mu_n$ converges weakly to $\mu_0$ in $(X,\rho)$ if for every bounded continuous function $f$ on $X$ one has $\int f\,{\rm d}\mu_n\,\rightarrow\,\int f\,{\rm d}\mu_0$ as $n\rightarrow\infty$.
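Definition 1 can be checked numerically in a simple case. The sketch below is an added illustration, not part of the original entry. It assumes $X=[0,1]$, takes $\mu_n$ to be the uniform measure on the points $1/n,2/n,\dots,n/n$ and $\mu_0$ to be Lebesgue measure on $[0,1]$, and uses the test function $f(x)=x^2$. The names `integral_against_mu_n` and `f` are hypothetical helpers chosen for this example.

```python
def integral_against_mu_n(f, n):
    """Integral of f against mu_n, the uniform measure on {1/n, 2/n, ..., n/n}."""
    return sum(f(k / n) for k in range(1, n + 1)) / n


def f(x):
    # Bounded and continuous on [0, 1]; its Lebesgue integral over [0, 1] is 1/3.
    return x * x


# Integral of f against the assumed limit mu_0 (Lebesgue measure on [0, 1]).
limit_integral = 1.0 / 3.0

# Gap between the two integrals for increasing n; it should shrink towards 0.
errors = {
    n: abs(integral_against_mu_n(f, n) - limit_integral)
    for n in (10, 100, 1000)
}

for n, err in errors.items():
    print(f"n={n:5d}  |int f dmu_n - int f dmu_0| = {err:.6f}")
```

In this example each $\mu_n$ gives mass $1$ to the rationals in $[0,1]$ while $\mu_0$ gives them mass $0$, so weak convergence does not require $\mu_n(A)\rightarrow\mu_0(A)$ for every Borel set $A$.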
If random elements $\xi_n$, $n=0,1,\dots$, taking values in $X$ are such that the distribution of $\xi_n$ is $\mu_n$ for each $n$, one says that $\xi_n$ converges in distribution to $\xi_0$, and writes $\xi_n\rightarrow^{d} \xi_0$, if $\mu_n$ converges weakly to $\mu_0$ (cf. also Convergence in distribution).
The metric spaces in most common use in probability are $\mathbb{R}^k$, $k$-dimensional Euclidean space, $C[0,1]$, the space of continuous functions on $[0,1]$, and $D[0,1]$, the space of functions on $[0,1]$ which are right continuous with left-hand limits.
Weak convergence in a suitably rich metric space is of considerably greater use than that in Euclidean space. This is because a wide variety of results on convergence in distribution on $\mathbb R$ can be derived from it with the aid of the continuous mapping theorem, which states that if $\xi_n\rightarrow^{d}\xi_0$ in $(X,\rho)$ and the mapping $h:X\rightarrow\mathbb R$ is continuous (or at least is measurable and $\mathsf P\{\xi_0\in D_h\}=0$, where $D_h$ is the set of discontinuities of $h$), then $h(\xi_n)\rightarrow^{d}h(\xi_0)$. In many applications the limit random element is Brownian motion, which has continuous paths with probability one.
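A small worked illustration (added here, not taken from the original entry) shows why the condition $\mathsf P\{\xi_0\in D_h\}=0$ cannot simply be dropped. Take $X=\mathbb R$, the deterministic elements $\xi_n=1/n$ for $n\geq 1$, $\xi_0=0$, and $h=I_{(0,\infty)}$, the indicator function of $(0,\infty)$, so that $D_h=\{0\}$. Then

$$\xi_n\rightarrow^{d}\xi_0,\qquad \mathsf P\{\xi_0\in D_h\}=1,\qquad h(\xi_n)=1\ \text{ for all } n\geq 1,\qquad h(\xi_0)=0,$$

and hence $h(\xi_n)$ does not converge in distribution to $h(\xi_0)$.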
One of the most fundamental weak convergence results is Donsker's theorem for sums $S_n=\sum_{i=1}^n X_i$, $n\ge 1$, of independent and identically-distributed random variables $X_i$ with $\mathsf EX_i=0$, $\mathsf EX_i^2=1$. This can be framed in $C[0,1]$ by setting $S_0=0$ and $S_n(t)=n^{-1/2}\{S_{[nt]}+(nt-[nt])X_{[nt]+1}\}$, $0\leq t\leq 1$, where $[x]$ denotes the integer part of $x$. Then Donsker's theorem asserts that $S_n(t)\rightarrow^{d} W(t)$, where $W(t)$ is standard Brownian motion. Application of the continuous mapping theorem then readily provides convergence-in-distribution results for functionals such as $\max_{1\leq k\leq n} S_k$, $\max_{1\leq k\leq n} k^{-1/2}|S_k|$, $\sum_{k=1}^n I(S_k\geq\alpha)$, and $\sum_{k=1}^n \gamma(S_k,S_{k+1})$, where $I$ is the indicator function and $\gamma(a,b)=1$ if $ab<0$ and $0$ otherwise.
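The combination of Donsker's theorem with the continuous mapping theorem can be explored by simulation. The following is a minimal sketch under stated assumptions, not a definitive implementation. It assumes steps $X_i=\pm 1$ with probability $1/2$ each (so that $\mathsf EX_i=0$, $\mathsf EX_i^2=1$), applies the continuous functional $x\mapsto\max_{0\leq t\leq 1}x(t)$ to the interpolated path (which gives $n^{-1/2}\max_{0\leq k\leq n}S_k$, with $k=0$ included), and compares the empirical distribution function with the classical reflection-principle formula $\mathsf P\{\max_{0\leq t\leq 1}W(t)\leq x\}=2\Phi(x)-1$, $x\geq 0$, where $\Phi$ is the standard normal distribution function. The function names, sample sizes and seed are hypothetical choices made for this example.

```python
import math
import random


def scaled_running_max(n, rng):
    """Return n**-0.5 * max_{0<=k<=n} S_k for one walk with +/-1 steps."""
    position = 0
    running_max = 0  # S_0 = 0 is included in the maximum
    for _ in range(n):
        position += 1 if rng.random() < 0.5 else -1
        running_max = max(running_max, position)
    return running_max / math.sqrt(n)


def empirical_cdf_of_max(x, n=400, paths=2000, seed=0):
    """Fraction of simulated walks whose scaled maximum is at most x."""
    rng = random.Random(seed)
    hits = sum(scaled_running_max(n, rng) <= x for _ in range(paths))
    return hits / paths


def limit_cdf_of_max(x):
    """P(max_{0<=t<=1} W(t) <= x) = 2*Phi(x) - 1 = erf(x / sqrt(2)), x >= 0."""
    return math.erf(x / math.sqrt(2.0))


empirical = empirical_cdf_of_max(1.0)
limit = limit_cdf_of_max(1.0)
print(f"empirical: {empirical:.3f}   limit: {limit:.3f}")
```

For moderate $n$ the two values agree only approximately: besides sampling error, the walk lives on a lattice, which introduces a small bias that vanishes as $n\rightarrow\infty$.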
Sequential compactness and relations to other types of convergence
Let $(X, \rho)$ be a complete metric space. The space $\mathcal{P} (X)$ of probability measures on the $\sigma$-algebra of Borel sets is a closed subspace of the space $\mathcal{M}^b (X)$ of signed Radon measures, i.e. those signed measures on the Borel $\sigma$-algebra whose total variation is a Radon measure (compare with Convergence of measures). The notion of convergence of Definition 1 can then be extended to sequences of general signed Radon measures, and the corresponding topology is called the narrow topology by some authors. Several other notions of convergence can be introduced on $\mathcal{M}^b (X)$ (and hence on $\mathcal{P} (X)$); see Convergence of measures for a more detailed account and a comparison between the different notions.
If the metric space $X$ is compact, the Riesz representation theorem implies that $\mathcal{M}^b (X)$ is the dual of the space $C (X)$ of continuous functions, and hence weak convergence of a sequence of probability measures $\{\mu_n\}\subset \mathcal{P} (X)$ coincides with weak$^*$ convergence. Under this assumption a very useful fact (which is a consequence of a more general theorem on duals of separable Banach spaces) is that norm-bounded, weak$^*$ closed subsets of $\mathcal{M}^b (X)$ are sequentially weak$^*$ compact. Thus, if the metric space $(X,\rho)$ is compact, given any sequence $\{\mu_k\}\subset \mathcal{P} (X)$, there is a subsequence $\mu_{k_j}$ which converges to some $\mu \in \mathcal{P} (X)$ in the sense of Definition 1.
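Compactness of $X$ cannot simply be omitted from the last statement. A standard illustration (added here as a sketch, not part of the original entry) takes $X=\mathbb R$ and $\mu_k=\delta_k$, the unit point mass at the integer $k$. For every continuous function $f$ with compact support,

$$\int f\,{\rm d}\mu_k = f(k)\,\rightarrow\,0 \quad\text{as } k\rightarrow\infty,$$

so a weak limit $\mu$ of any subsequence would satisfy $\int f\,{\rm d}\mu=0$ for all such $f$. This forces $\mu=0$, which is not a probability measure. Thus no subsequence of $\{\delta_k\}$ converges in the sense of Definition 1.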
References
[B] P. Billingsley, "Convergence of probability measures", Wiley (1968) pp. 9ff MR0233396 Zbl 0172.21201
Weak convergence of probability measures. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Weak_convergence_of_probability_measures&oldid=30724