Difference between revisions of "Strong law of large numbers"
(MSC|60F15 Category:Limit theorems) |
(refs format) |
||
Line 21: | Line 21: | ||
for any <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058010.png" /> if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058011.png" />. The converse need not be true. For example, if the random variables (1) are independent and, for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058012.png" />, assume the two values <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058013.png" /> with probability 1/2 each, they satisfy the law of large numbers (4) for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058014.png" />, but the strong law of large numbers (2) is not satisfied for any value of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058015.png" />. The existence of such examples is not at all obvious at first sight. The reason is that even though, in general, convergence in probability is weaker than convergence with probability one, nevertheless the two types of convergence are equivalent, for example, in the case of series of independent random variables. | for any <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058010.png" /> if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058011.png" />. The converse need not be true. For example, if the random variables (1) are independent and, for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058012.png" />, assume the two values <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058013.png" /> with probability 1/2 each, they satisfy the law of large numbers (4) for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058014.png" />, but the strong law of large numbers (2) is not satisfied for any value of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058015.png" />. The existence of such examples is not at all obvious at first sight. The reason is that even though, in general, convergence in probability is weaker than convergence with probability one, nevertheless the two types of convergence are equivalent, for example, in the case of series of independent random variables. | ||
− | The strong law of large numbers was first formulated and demonstrated by E. Borel | + | The strong law of large numbers was first formulated and demonstrated by E. Borel {{Cite|B}} for the Bernoulli scheme in the number-theoretic interpretation; cf. [[Borel strong law of large numbers|Borel strong law of large numbers]]. Special cases of the Bernoulli scheme result from the expansion of a real number <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058016.png" />, taken at random (with uniform distribution) in the interval <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058017.png" />, into an infinite fraction to any basis (see [[Bernoulli trials|Bernoulli trials]]). Thus, in the binary expansion |
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058018.png" /></td> </tr></table> | <table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058018.png" /></td> </tr></table> | ||
Line 27: | Line 27: | ||
the successive signs of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058019.png" /> assume two values, 0 and 1, with probability 1/2 each, and are independent random variables. The sum <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058020.png" /> is equal to the number of ones among the first <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058021.png" /> signs in the binary expansion, while <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058022.png" /> is their proportion. At the same time <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058023.png" /> may be considered as the number of "successful" trials in the Bernoulli scheme with probability of "success" (appearance of "1" ) equal to 1/2. Borel showed that the proportion of "ones" , <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058024.png" />, tends to 1/2 for almost-all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058025.png" /> in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058026.png" />. In a similar manner, in the expansion of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058027.png" /> to the base 10, "success" may be considered as the appearance of any one of the digits <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058028.png" /> (e.g. the number 3). One then obtains a Bernoulli scheme with probability of success 1/10 and the frequency of appearance of the selected digit among the first <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058029.png" /> signs in the decimal expansion tends to 1/10 for almost-all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058030.png" /> in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058031.png" />. It was also noted by Borel that the frequency of appearance of any given group of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058032.png" /> digits tends to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058033.png" /> for almost-all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058034.png" /> (cf. [[Normal number|Normal number]]). | the successive signs of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058019.png" /> assume two values, 0 and 1, with probability 1/2 each, and are independent random variables. The sum <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058020.png" /> is equal to the number of ones among the first <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058021.png" /> signs in the binary expansion, while <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058022.png" /> is their proportion. At the same time <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058023.png" /> may be considered as the number of "successful" trials in the Bernoulli scheme with probability of "success" (appearance of "1" ) equal to 1/2. Borel showed that the proportion of "ones" , <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058024.png" />, tends to 1/2 for almost-all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058025.png" /> in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058026.png" />. In a similar manner, in the expansion of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058027.png" /> to the base 10, "success" may be considered as the appearance of any one of the digits <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058028.png" /> (e.g. the number 3). One then obtains a Bernoulli scheme with probability of success 1/10 and the frequency of appearance of the selected digit among the first <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058029.png" /> signs in the decimal expansion tends to 1/10 for almost-all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058030.png" /> in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058031.png" />. It was also noted by Borel that the frequency of appearance of any given group of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058032.png" /> digits tends to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058033.png" /> for almost-all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058034.png" /> (cf. [[Normal number|Normal number]]). | ||
− | F. Cantelli | + | F. Cantelli {{Cite|C}} stated sufficient conditions for the strong law of large numbers for independent random variables <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058035.png" /> in terms of the second and fourth central moments of the summands (these conditions are fulfilled for the Bernoulli scheme). Putting |
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058036.png" /></td> </tr></table> | <table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058036.png" /></td> </tr></table> | ||
Line 45: | Line 45: | ||
with probability one, i.e. (3) is valid. Borel estimated the terms of the series (5) by the de Moivre–Laplace theorem, while Cantelli did so basing himself on Chebyshev's inequality with fourth moments. | with probability one, i.e. (3) is valid. Borel estimated the terms of the series (5) by the de Moivre–Laplace theorem, while Cantelli did so basing himself on Chebyshev's inequality with fourth moments. | ||
− | A further extension of the conditions for the applicability of the strong law of large numbers was realized by A.Ya. Khinchin and A.N. Kolmogorov. Khinchin | + | A further extension of the conditions for the applicability of the strong law of large numbers was realized by A.Ya. Khinchin and A.N. Kolmogorov. Khinchin {{Cite|Kh}}, {{Cite|Kh2}} introduced the very name of "strong law of large numbers" and proved a sufficient condition (also applicable to dependent variables) for it with <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058043.png" />. Denoting by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058044.png" /> the correlation coefficient between <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058045.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058046.png" /> and putting |
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058047.png" /></td> </tr></table> | <table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058047.png" /></td> </tr></table> | ||
Line 75: | Line 75: | ||
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058069.png" /></td> <td valign="top" style="width:5%;text-align:right;">(7)</td></tr></table> | <table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058069.png" /></td> <td valign="top" style="width:5%;text-align:right;">(7)</td></tr></table> | ||
− | where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058070.png" /> is the median of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058071.png" /> | + | where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058070.png" /> is the median of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058071.png" /> {{Cite|Pr}}. If additional restrictions are imposed, (7) will yield conditions expressed in terms of the characteristics of the individual terms. For instance, if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058072.png" /> or if all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058073.png" /> are normally distributed, condition (7) is equivalent to the following one: For any <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058074.png" />, |
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058075.png" /></td> <td valign="top" style="width:5%;text-align:right;">(8)</td></tr></table> | <table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058075.png" /></td> <td valign="top" style="width:5%;text-align:right;">(8)</td></tr></table> | ||
Line 83: | Line 83: | ||
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058077.png" /></td> </tr></table> | <table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058077.png" /></td> </tr></table> | ||
− | Conditions of applicability of the strong law of large numbers to Markov chains and processes, and to stationary processes, are known | + | Conditions of applicability of the strong law of large numbers to Markov chains and processes, and to stationary processes, are known {{Cite|D}}. Thus, Khinchin's method, which is applicable to a sequence <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058078.png" /> that is stationary in the wide sense, with correlation function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058079.png" />, leads to the following theorem: If the series <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058080.png" /> is convergent, then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058081.png" /> with probability one. In applications to stationary (in the narrow sense) random processes the name "strong law of large numbers" is sometimes given to the following statement: The limit |
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058082.png" /></td> </tr></table> | <table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058082.png" /></td> </tr></table> | ||
Line 89: | Line 89: | ||
exists with probability one. The random variable <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058083.png" /> is equal to the conditional mathematical expectation of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058084.png" /> with respect to the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058085.png" />-algebra of sets that are shift-invariant; the variable <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058086.png" /> is constant and equals <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058087.png" /> with probability one only for metrically-transitive processes. The strong law of large numbers in this form is identical with the [[Birkhoff ergodic theorem|Birkhoff ergodic theorem]]. | exists with probability one. The random variable <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058083.png" /> is equal to the conditional mathematical expectation of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058084.png" /> with respect to the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058085.png" />-algebra of sets that are shift-invariant; the variable <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058086.png" /> is constant and equals <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058087.png" /> with probability one only for metrically-transitive processes. The strong law of large numbers in this form is identical with the [[Birkhoff ergodic theorem|Birkhoff ergodic theorem]]. | ||
− | There exist variations of the strong law of large numbers for random vectors in normed linear spaces | + | There exist variations of the strong law of large numbers for random vectors in normed linear spaces {{Cite|G}}. The chronologically earliest example of such a variation is the Glivenko–Cantelli theorem on the convergence of the empirical distribution function to the theoretical one. |
The deviations of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058088.png" /> from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058089.png" /> are described by the [[Law of the iterated logarithm|law of the iterated logarithm]]. | The deviations of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058088.png" /> from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/s/s090/s090580/s09058089.png" /> are described by the [[Law of the iterated logarithm|law of the iterated logarithm]]. | ||
====References==== | ====References==== | ||
− | + | {| | |
+ | |valign="top"|{{Ref|B}}|| E. Borel, "Les probabilités dénombrables et leurs applications arithmétique" ''Rend. Circ. Mat. Palermo (2)'' , '''27''' (1909) pp. 247–271 | ||
+ | |- | ||
+ | |valign="top"|{{Ref|C}}|| F.P. Cantelli, "Sulla probabilità come limite della frequenza" ''Atti Accad. Naz. Lincei'' , '''26''' : 1 (1917) pp. 39–45 {{MR|}} {{ZBL|46.0779.02}} | ||
+ | |- | ||
+ | |valign="top"|{{Ref|Kh}}|| A.Ya. Khintchine, "Fundamental laws of probability theory" , Moscow (1927) (In Russian) | ||
+ | |- | ||
+ | |valign="top"|{{Ref|Kh2}}|| A.Ya. Khintchine, "Sur la loi forte des grands nombres" ''C.R. Acad. Sci. Paris Ser. I Math.'' , '''186''' (1928) pp. 285–287 {{MR|}} {{ZBL|54.0544.03}} | ||
+ | |- | ||
+ | |valign="top"|{{Ref|Ko}}|| A.N. Kolmogorov, "Sur la loi forte des grands nombres" ''C.R. Acad. Sci. Paris Sér. I Math.'' , '''191''' (1930) pp. 910–912 | ||
+ | |- | ||
+ | |valign="top"|{{Ref|Pr}}|| Yu.V. Prokhorov, "On the strong law of large numbers" ''Izv. Akad. Nauk SSSR Ser. Mat.'' , '''14''' (1950) pp. 523–536 (In Russian) {{MR|0121858}} {{ZBL|0089.13903}} | ||
+ | |- | ||
+ | |valign="top"|{{Ref|D}}|| J.L. Doob, "Stochastic processes" , Chapman & Hall (1953) {{MR|1570654}} {{MR|0058896}} {{ZBL|0053.26802}} | ||
+ | |- | ||
+ | |valign="top"|{{Ref|Pe}}|| V.V. Petrov, "Sums of independent random variables" , Springer (1975) (Translated from Russian) {{MR|0388499}} {{ZBL|0322.60043}} {{ZBL|0322.60042}} | ||
+ | |- | ||
+ | |valign="top"|{{Ref|G}}|| U. Grenander, "Probabilities on algebraic structures" , Wiley (1963) {{MR|0206994}} {{ZBL|0131.34804}} | ||
+ | |} |
Revision as of 18:05, 30 May 2012
2010 Mathematics Subject Classification: Primary: 60F15 [MSN][ZBL]
A form of the law of large numbers (in its general form) which states that, under certain conditions, the arithmetical averages of a sequence of random variables tend to certain constant values with probability one. More exactly, let
(1) |
be a sequence of random variables and let . One says that the sequence (1) satisfies the strong law of large numbers if there exists a sequence of constants such that the probability of the relationship
(2) |
is one. Another formulation, which is equivalent to the former one, is as follows: The sequence (1) satisfies the strong law of large numbers if, for any , the probability of all the inequalities
(3) |
to be true at the same time tends to one as . Thus, one considers the behaviour of the sequence of the sums as a whole, whereas in the ordinary law of large numbers individual sums only are considered. If the sequence (1) satisfies the strong law of large numbers, it also satisfies the ordinary law of large numbers for the same , i.e.
(4) |
for any if . The converse need not be true. For example, if the random variables (1) are independent and, for , assume the two values with probability 1/2 each, they satisfy the law of large numbers (4) for , but the strong law of large numbers (2) is not satisfied for any value of . The existence of such examples is not at all obvious at first sight. The reason is that even though, in general, convergence in probability is weaker than convergence with probability one, nevertheless the two types of convergence are equivalent, for example, in the case of series of independent random variables.
The strong law of large numbers was first formulated and demonstrated by E. Borel [B] for the Bernoulli scheme in the number-theoretic interpretation; cf. Borel strong law of large numbers. Special cases of the Bernoulli scheme result from the expansion of a real number , taken at random (with uniform distribution) in the interval , into an infinite fraction to any basis (see Bernoulli trials). Thus, in the binary expansion
the successive signs of assume two values, 0 and 1, with probability 1/2 each, and are independent random variables. The sum is equal to the number of ones among the first signs in the binary expansion, while is their proportion. At the same time may be considered as the number of "successful" trials in the Bernoulli scheme with probability of "success" (appearance of "1" ) equal to 1/2. Borel showed that the proportion of "ones" , , tends to 1/2 for almost-all in . In a similar manner, in the expansion of to the base 10, "success" may be considered as the appearance of any one of the digits (e.g. the number 3). One then obtains a Bernoulli scheme with probability of success 1/10 and the frequency of appearance of the selected digit among the first signs in the decimal expansion tends to 1/10 for almost-all in . It was also noted by Borel that the frequency of appearance of any given group of digits tends to for almost-all (cf. Normal number).
F. Cantelli [C] stated sufficient conditions for the strong law of large numbers for independent random variables in terms of the second and fourth central moments of the summands (these conditions are fulfilled for the Bernoulli scheme). Putting
the Cantelli condition can be written in the form
The proofs of Cantelli and Borel are based on the following reasoning. Let, for some sequence of positive numbers (when ),
(5) |
Then, according to the Borel–Cantelli lemma, only a finite number of events under the probability sign in (5) are realized with probability one. Accordingly, for all sufficiently large ,
with probability one, i.e. (3) is valid. Borel estimated the terms of the series (5) by the de Moivre–Laplace theorem, while Cantelli did so basing himself on Chebyshev's inequality with fourth moments.
A further extension of the conditions for the applicability of the strong law of large numbers was realized by A.Ya. Khinchin and A.N. Kolmogorov. Khinchin [Kh], [Kh2] introduced the very name of "strong law of large numbers" and proved a sufficient condition (also applicable to dependent variables) for it with . Denoting by the correlation coefficient between and and putting
one can write Khinchin's condition as follows: for some . In fact, the statement which follows from Khinchin's proof is much stronger.
In the case of independent summands the best known conditions for applicability of the strong law of large numbers are those established by Kolmogorov: Sufficient (1930) for variables with a finite variance and necessary and sufficient conditions (1933) for identically-distributed variables consist of the existence of the mathematical expectation of the variables . Kolmogorov's theorem for random variables (1) with finite variances states that the condition
(6) |
implies the applicability of the strong law of large numbers with for the sequence (1). In terms of variances condition (6) is best in the sense that for any sequence of positive numbers such that the series diverges it is possible to construct a sequence of independent random variables with which does not satisfy the strong law of large numbers. The scope of application of condition (6) (and also of other conditions of the strong law of large numbers for independent variables) may be extended as follows. Let be the median of . The convergence of the series
is necessary in the strong law of large numbers. It follows from the Borel–Cantelli lemma that with probability one, starting from a certain number . Accordingly, in a study on the conditions of applicability of the strong law of large numbers, one may immediately restrict to random variables which satisfy the last-named condition.
In the proof given by Khinchin and Kolmogorov the convergence of the series (5) is replaced by the convergence of the series
where . In this proof, Khinchin actually employed a number of ideas from the theory of orthogonal series of functions, while Kolmogorov employed his inequality for the maxima of sums of random variables.
It is possible to state a necessary and sufficient condition for the strong law of large numbers for independent random variables. Putting
where the sum applies to the values of for which , this condition may be written as follows: For any ,
(7) |
where is the median of [Pr]. If additional restrictions are imposed, (7) will yield conditions expressed in terms of the characteristics of the individual terms. For instance, if or if all are normally distributed, condition (7) is equivalent to the following one: For any ,
(8) |
Here, since are independent,
Conditions of applicability of the strong law of large numbers to Markov chains and processes, and to stationary processes, are known [D]. Thus, Khinchin's method, which is applicable to a sequence that is stationary in the wide sense, with correlation function , leads to the following theorem: If the series is convergent, then with probability one. In applications to stationary (in the narrow sense) random processes the name "strong law of large numbers" is sometimes given to the following statement: The limit
exists with probability one. The random variable is equal to the conditional mathematical expectation of with respect to the -algebra of sets that are shift-invariant; the variable is constant and equals with probability one only for metrically-transitive processes. The strong law of large numbers in this form is identical with the Birkhoff ergodic theorem.
There exist variations of the strong law of large numbers for random vectors in normed linear spaces [G]. The chronologically earliest example of such a variation is the Glivenko–Cantelli theorem on the convergence of the empirical distribution function to the theoretical one.
The deviations of from are described by the law of the iterated logarithm.
References
[B] | E. Borel, "Les probabilités dénombrables et leurs applications arithmétique" Rend. Circ. Mat. Palermo (2) , 27 (1909) pp. 247–271 |
[C] | F.P. Cantelli, "Sulla probabilità come limite della frequenza" Atti Accad. Naz. Lincei , 26 : 1 (1917) pp. 39–45 Zbl 46.0779.02 |
[Kh] | A.Ya. Khintchine, "Fundamental laws of probability theory" , Moscow (1927) (In Russian) |
[Kh2] | A.Ya. Khintchine, "Sur la loi forte des grands nombres" C.R. Acad. Sci. Paris Ser. I Math. , 186 (1928) pp. 285–287 Zbl 54.0544.03 |
[Ko] | A.N. Kolmogorov, "Sur la loi forte des grands nombres" C.R. Acad. Sci. Paris Sér. I Math. , 191 (1930) pp. 910–912 |
[Pr] | Yu.V. Prokhorov, "On the strong law of large numbers" Izv. Akad. Nauk SSSR Ser. Mat. , 14 (1950) pp. 523–536 (In Russian) MR0121858 Zbl 0089.13903 |
[D] | J.L. Doob, "Stochastic processes" , Chapman & Hall (1953) MR1570654 MR0058896 Zbl 0053.26802 |
[Pe] | V.V. Petrov, "Sums of independent random variables" , Springer (1975) (Translated from Russian) MR0388499 Zbl 0322.60043 Zbl 0322.60042 |
[G] | U. Grenander, "Probabilities on algebraic structures" , Wiley (1963) MR0206994 Zbl 0131.34804 |
Strong law of large numbers. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Strong_law_of_large_numbers&oldid=24625