ASN
A term occurring in the area of statistics called sequential analysis. In sequential procedures for statistical estimation, hypothesis testing, or decision theory, the number of observations taken (the sample size) is not predetermined but depends on the observations themselves. The expected value of the sample size of such a procedure is called the average sample number. It depends on the underlying distribution of the observations, which is often unknown. If the distribution is determined by the value of some parameter, then the average sample number becomes a function of that parameter.
If $N$ denotes the sample size and $\theta$ is the unknown parameter, then the average sample number is given by
\begin{equation} \tag{a1} \mathsf{E} _ { \theta } ( N ) = \sum _ { n = 1 } ^ { \infty } n \mathsf{P} _ { \theta } ( N = n ) = \sum _ { n = 0 } ^ { \infty } \mathsf{P} _ { \theta } ( N > n ). \end{equation}
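As a quick check of the two equivalent sums in (a1), the following sketch (using an arbitrary three-point distribution for $N$; the numbers are purely illustrative) computes both and confirms that they agree.

```python
# Check the tail-sum identity E(N) = sum_n n*P(N=n) = sum_n P(N>n)
# for an arbitrary illustrative distribution of the sample size N.
pmf = {1: 0.2, 2: 0.5, 3: 0.3}  # hypothetical P(N = n)

mean_direct = sum(n * p for n, p in pmf.items())
mean_tail = sum(sum(p for m, p in pmf.items() if m > n) for n in range(0, max(pmf)))

print(mean_direct, mean_tail)  # both equal 2.1
```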
Some aspects of the average sample number in the case where the observations are independent Bernoulli random variables $X_i$ with $\mathsf{E} _ { \theta } ( X _ { i } ) = \mathsf{P} _ { \theta } ( X _ { i } = 1 ) = \theta = 1 - \mathsf{P} _ { \theta } ( X _ { i } = 0 )$ are given below.
Consider a test of the hypotheses $H _ { 0 } : \theta = 0$ versus $H _ { 1 } : \theta > 0$ that takes $n$ observations and decides $H _ { 1 }$ if $X _ { 1 } + \ldots + X _ { n } > 0$ (cf. also Statistical hypotheses, verification of). The curtailed version of this test is sequential and stops at the first time $k$ for which $X _ { k } = 1$, or at time $n$, whichever comes first. Then the average sample number is given by
\begin{equation} \tag{a2} \mathsf{E} _ { \theta } ( N ) = \sum _ { k = 0 } ^ { n - 1 } \mathsf{P} _ { \theta } ( N > k ) = \sum _ { k = 0 } ^ { n - 1 } ( 1 - \theta ) ^ { k } = \frac { 1 - ( 1 - \theta ) ^ { n } } { \theta } \quad \text { for } \theta > 0. \end{equation}
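The closed form (a2) can be checked against a direct Monte Carlo simulation of the curtailed test; in the sketch below the values of $\theta$ and $n$ are chosen only for illustration.

```python
import random

def curtailed_sample_size(theta, n):
    """Observe Bernoulli(theta) variables; stop at the first 1 or at time n."""
    for k in range(1, n + 1):
        if random.random() < theta:  # X_k = 1
            return k
    return n

theta, n, reps = 0.2, 10, 200_000
simulated = sum(curtailed_sample_size(theta, n) for _ in range(reps)) / reps
exact = (1 - (1 - theta) ** n) / theta  # formula (a2)
print(simulated, exact)  # both close to 4.46
```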
This formula is a special case of Wald's lemma (see [a3] and Wald identity), which is very useful in finding or approximating average sample numbers. Wald's lemma states that if $Y , Y _ { 1 } , Y _ { 2 } , \dots$ are independent random variables with common expected value (cf. also Random variable) and $N$ is a stopping time (i.e., $N = k$ is determined by the observations $Y _ { 1 } , \dots , Y _ { k }$) with finite expected value, then letting $S _ { n } = Y _ { 1 } + \ldots + Y _ { n }$,
\begin{equation} \tag{a3} \mathsf{E} ( Y ) \mathsf{E} ( N ) = \mathsf{E} ( S _ { N } ). \end{equation}
Thus, for $\mathsf{E} ( Y ) \neq 0$, one has $\mathsf{E} ( N ) = \mathsf{E} ( S _ { N } ) ( \mathsf{E} ( Y ) ) ^ { - 1 }$.
In this example, $Y _ { i } = X _ { i }$, $\mathsf{E} ( Y ) = \theta$, and $\mathsf E _ { \theta } ( S _ { N } ) = \mathsf P _ { \theta } ( S _ { N } = 1 ) = 1 - \mathsf P _ { \theta } ( S _ { n } = 0 ) = 1 - ( 1 - \theta ) ^ { n }$. The average sample number then follows from (a3) and agrees with (a2). See [a1] for asymptotic properties of the average sample number for curtailed tests in general.
In testing $H _ { 0 } : \theta = p$ versus $H _ { 1 } : \theta = q = 1 - p$, the logarithm of the likelihood ratio (cf. also Likelihood-ratio test) after $n$ observations is easily seen to be of the form $S_n \operatorname { log } ( q / p )$, where $Y _ { i } = 2 X _ { i } - 1$. Thus, if $p < .5$, the sequential probability ratio test stops the first time that $S _ { n } = K$ and decides $H _ { 1 }$ or the first time that $S _ { n } = - J$ and decides $H _ { 0 }$ for positive integers $J$ and $K$. In this case $S _ { 1 } , S _ { 2 } , \ldots$ is a random walk taking steps to the right with probability $\theta$ and $\mathsf{E} ( Y ) = 2 \theta - 1$ in formula (a3). Thus, if $\theta \neq 1 / 2$, the average sample number is
\begin{equation} \tag{a4} \mathsf{E} _ { \theta } ( N ) = \frac { \mathsf{P} _ { \theta } ( S _ { N } = K ) K - \mathsf{P} _ { \theta } ( S _ { N } = - J ) J } { 2 \theta - 1 }. \end{equation}
Well-known formulas from the theory of random walks show that $\mathsf{P} _ { \theta } ( S _ { N } = K ) = ( 1 - r ^ { J } ) ( 1 - r ^ { K + J } ) ^ { - 1 }$, where $r = ( 1 - \theta ) / \theta$.
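As an illustration of (a4) together with this hitting probability, the following sketch (with illustrative values $\theta = 0.6$ and $K = J = 3$) evaluates the closed form and compares it with a direct simulation of the random walk.

```python
import random

def sprt_walk(theta, K, J):
    """Run the +/-1 random walk S_n until it hits K or -J; return the number of steps."""
    s, steps = 0, 0
    while -J < s < K:
        s += 1 if random.random() < theta else -1
        steps += 1
    return steps

theta, K, J, reps = 0.6, 3, 3, 100_000
r = (1 - theta) / theta
p_hit_K = (1 - r ** J) / (1 - r ** (K + J))                        # P_theta(S_N = K)
asn_formula = (p_hit_K * K - (1 - p_hit_K) * J) / (2 * theta - 1)  # formula (a4)

asn_simulated = sum(sprt_walk(theta, K, J) for _ in range(reps)) / reps
print(asn_formula, asn_simulated)  # both close to 8.14
```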
If $\theta = .5$, this method fails. One must then use another result of A. Wald, stating that if $ \mathsf{E} ( Y ) = 0$, but $\mathsf{E} ( Y _ { i } ^ { 2 } ) = \sigma ^ { 2 } < \infty$, then
\begin{equation} \tag{a5} \sigma ^ { 2 } \mathsf{E} ( N ) = \mathsf{E} ( S _ { N } ^ { 2 } ). \end{equation}
In this example, for $\theta = .5$, one has $\sigma ^ { 2 } = .25$ and $\mathsf{P} ( S _ { N } = K ) = J ( J + K ) ^ { - 1 }$. Then (a5) yields $\mathsf E ( N ) = 4 JK$.
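The intermediate step is the expected value of $S _ { N } ^ { 2 }$: since $S _ { N }$ equals $K$ with probability $J ( J + K ) ^ { - 1 }$ and $- J$ with probability $K ( J + K ) ^ { - 1 }$,

\begin{equation*} \mathsf{E} ( S _ { N } ^ { 2 } ) = K ^ { 2 } \frac { J } { J + K } + J ^ { 2 } \frac { K } { J + K } = J K , \end{equation*}

so (a5) gives $\mathsf{E} ( N ) = J K / .25 = 4 J K$.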
In order to decide which of two sequential tests is better, it is important to be able to express the average sample number in terms of the error probabilities of the test. Then the average sample numbers of different tests with the same error probabilities can be compared. In this example, the probability of a type-I error is $\alpha = \mathsf{P} _ { p } ( S _ { N } = K )$ and the probability of a type-II error is $\beta = \mathsf{P} _ { q } ( S _ { N } = - J )$ (cf. also Error). From this one sees that
\begin{equation*} K = \operatorname { log } \left( \frac { 1 - \beta } { \alpha } \right) \left( \operatorname { log } \frac { q } { p } \right) ^ { - 1 } \end{equation*}
and
\begin{equation*} J = \operatorname { log } \left( \frac { 1 - \alpha } { \beta } \right) \left( \operatorname { log } \frac { q } { p } \right) ^ { - 1 }. \end{equation*}
In particular, it follows that if $\theta = p$, then
\begin{equation} \tag{a6} \mathsf{E} _ { p } ( N ) = \frac { \alpha \operatorname { log } ( \frac { 1 - \beta } { \alpha } ) + ( 1 - \alpha ) \operatorname { log } ( \frac { \beta } { 1 - \alpha } ) } { ( p - q ) \operatorname { log } ( q / p ) }. \end{equation}
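As a numerical cross-check of (a6) (the choices $p = 0.3$ and $K = J = 3$ are purely illustrative), the sketch below computes $\alpha$ and $\beta$ from the random-walk hitting probabilities, substitutes them into (a6), and compares the result with the value given directly by (a4) at $\theta = p$.

```python
import math

p, K, J = 0.3, 3, 3    # illustrative choices
q = 1 - p

def prob_hit_K(theta):
    """Gambler's-ruin probability P_theta(S_N = K), with r = (1 - theta)/theta."""
    r = (1 - theta) / theta
    return (1 - r ** J) / (1 - r ** (K + J))

alpha = prob_hit_K(p)        # P_p(S_N = K), type-I error
beta = 1 - prob_hit_K(q)     # P_q(S_N = -J), type-II error

asn_a4 = (alpha * K - (1 - alpha) * J) / (2 * p - 1)   # formula (a4) at theta = p
asn_a6 = (alpha * math.log((1 - beta) / alpha)
          + (1 - alpha) * math.log(beta / (1 - alpha))) / ((p - q) * math.log(q / p))
print(asn_a4, asn_a6)        # the two values agree (about 6.41)
```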
This formula and the analogous one with $\theta = q$ are often used to measure the efficiency of the sequential probability ratio test against other sequential tests with the same error probabilities.
Similar formulas can be found for sequential probability ratio tests for sequences of independent random variables with other distributions, but in general the formulas are approximations since the tests will stop when the logarithm of the likelihood ratio first crosses the boundary rather than when it first hits the boundary. Many studies (see [a2]) consider the adequacy of approximations for the average sample number of sequential probability ratio tests and some other sequential tests whose stopping times are more complicated. In areas such as signal detection, average sample numbers are also studied for sequential tests where the observations are not independent, identically distributed random variables.
References
[a1] | B. Eisenberg, B.K. Ghosh, "Curtailed and uniformly most powerful sequential tests" Ann. Statist. , 8 (1980) pp. 1123–1131 |
[a2] | D. Siegmund, "Sequential analysis: Tests and confidence intervals" , Springer (1985) |
[a3] | A. Wald, "Sequential analysis" , Wiley (1947) |