Sequential probability ratio test
$$\newcommand{\P}{\mathsf{P}}\newcommand{\E}{\mathsf{E}}$$ SPRT
Let $X_1, X_2, \ldots$ be a sequence of independent random variables with common discrete or continuous probability density function $f$ (cf. also Random variable; Distribution function). In testing the hypotheses $H_0 : f = f_0$ versus $H_1 : f = f_1$ (cf. also Statistical hypotheses, verification of), the probability ratio or likelihood ratio after $n$ observations is given by (cf. also Likelihood-ratio test)
$$ \lambda_n = \prod_{i=1}^n \frac{f_1(X_i)}{f_0(X_i)}. \tag{a1} $$
In the Neyman–Pearson probability ratio test, one fixes a sample size $n$, chooses a value $k$, and decides to accept $H_1$ if $\lambda_n \ge k$ and to accept $H_0$ if $\lambda_n < k$. In the sequential probability ratio test introduced by A. Wald [a5], the sample size is not chosen ahead of time. Instead, one chooses two positive constants $A < B$ and sequentially computes the likelihood ratio after each observation. If after $n$ observations $\lambda_n \le A$, one stops taking observations and decides to accept $H_0$. If $\lambda_n \ge B$, one stops and decides to accept $H_1$. Otherwise, another observation is taken. The number of observations taken is thus a random variable $N$, which can be shown to be finite with probability one.
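As an illustration of this stopping rule, the following Python sketch runs the test on a stream of observations. The function names and the Bernoulli example at the end are illustrative assumptions, not part of Wald's original formulation.

```python
import math
import random

def sprt(sample, log_lr_increment, A, B):
    """Wald's sequential probability ratio test (a minimal sketch).

    sample           -- callable returning the next observation X_i
    log_lr_increment -- callable giving Z_i = log f1(x)/f0(x) for one observation
    A, B             -- stopping boundaries on the likelihood ratio, 0 < A < B
    Returns (decision, n), where decision is 'H0' or 'H1' and n is the sample size N.
    """
    log_A, log_B = math.log(A), math.log(B)
    log_lambda = 0.0            # log likelihood ratio after n observations
    n = 0
    while True:
        x = sample()
        n += 1
        log_lambda += log_lr_increment(x)
        if log_lambda <= log_A:
            return 'H0', n      # lower boundary reached: accept H0
        if log_lambda >= log_B:
            return 'H1', n      # upper boundary reached: accept H1
        # otherwise take another observation

# Hypothetical example: H0: p = 0.4 versus H1: p = 0.6 for Bernoulli data
# generated under H1; parameter values are chosen only for illustration.
p0, p1, p_true = 0.4, 0.6, 0.6
def draw():
    return 1 if random.random() < p_true else 0
def z(x):
    return math.log(p1 / p0) if x == 1 else math.log((1 - p1) / (1 - p0))

decision, n_obs = sprt(draw, z, A=0.05 / 0.95, B=0.95 / 0.05)
print(decision, n_obs)
```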
Denote the error probabilities of this procedure by $\alpha = \P_0(\text{accept } H_1) = \P_0(\lambda_N \ge B)$ and $\beta = \P_1(\text{accept } H_0) = \P_1(\lambda_N \le A)$. It then follows that
$$ \begin{gathered} \beta = \int_{\lambda_N \le A} \lambda_N \, d\P_0 \le A(1-\alpha), \\ 1-\beta = \int_{\lambda_N \ge B} \lambda_N \, d\P_0 \ge B \alpha . \end{gathered} \tag{a2} $$
If the likelihood ratio always hits the boundary when the test stops, so that $\lambda_N = A$ or $\lambda_N = B$, then these inequalities become equalities. Otherwise, they remain close approximations to equalities in the standard cases.
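Treating the inequalities in (a2) as equalities yields Wald's usual prescription for choosing the boundaries from prescribed error probabilities $\alpha$ and $\beta$ (an approximation, not an exact identity):
$$ A \approx \frac{\beta}{1-\alpha}, \qquad B \approx \frac{1-\beta}{\alpha}; $$
for instance, $\alpha = \beta = 0.05$ gives $A \approx 0.053$ and $B = 19$.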
The logarithm of the likelihood ratio as given in (a1) is a sum of independent, identically distributed random variables $Z_i = \log f_1(X_i) / f_0(X_i)$. It then follows from Wald's lemma that $\E(Z_i) \E(N) = \E(\log \lambda_N)$. Using the same type of approximations as above, this gives the following formulas for the average sample number of the test:
$$ \begin{gathered} \E_0(N) \approx \frac{\alpha \log\left(\frac{1-\beta}{\alpha}\right) + (1-\alpha) \log\left(\frac{\beta}{1-\alpha}\right)}{\E_0(Z_i)} ,\\ \E_1(N) \approx \frac{ (1-\beta) \log\left(\frac{1-\beta}{\alpha}\right) + \beta \log\left(\frac{\beta}{1-\alpha}\right)}{\E_1(Z_i)} . \end{gathered} \tag{a3} $$
If the likelihood ratio always hits the boundary when the test stops, these approximations become equalities.
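As a numerical illustration of (a3), the following Python sketch evaluates the approximate average sample numbers for the hypothetical Bernoulli test of $H_0: p = 0.4$ versus $H_1: p = 0.6$ with $\alpha = \beta = 0.05$; the boundaries are those given by the approximations above, and all names and parameter values are assumptions made for this example.

```python
import math

def wald_asn(alpha, beta, ez0, ez1):
    """Approximate average sample numbers from (a3).

    alpha, beta -- error probabilities of the SPRT
    ez0, ez1    -- E_0(Z_i) and E_1(Z_i), the mean log likelihood ratio increments
    Returns (E_0(N), E_1(N)) under Wald's approximations.
    """
    log_B = math.log((1 - beta) / alpha)    # log of the upper boundary B
    log_A = math.log(beta / (1 - alpha))    # log of the lower boundary A
    en0 = (alpha * log_B + (1 - alpha) * log_A) / ez0
    en1 = ((1 - beta) * log_B + beta * log_A) / ez1
    return en0, en1

# Illustrative Bernoulli case: H0: p = 0.4 versus H1: p = 0.6.
p0, p1 = 0.4, 0.6
lr1 = math.log(p1 / p0)                # increment Z_i when X_i = 1
lr0 = math.log((1 - p1) / (1 - p0))    # increment Z_i when X_i = 0
ez0 = p0 * lr1 + (1 - p0) * lr0        # E_0(Z_i) < 0
ez1 = p1 * lr1 + (1 - p1) * lr0        # E_1(Z_i) > 0
print(wald_asn(0.05, 0.05, ez0, ez1))  # roughly 33 observations under either hypothesis
```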
Wald and J. Wolfowitz [a6] proved a strong optimality property for the sequential probability ratio test. It states that among all sequential tests with error probabilities no bigger than those of a given sequential probability ratio test, the sequential probability ratio test has the smallest average sample number under both $H_0$ and $H_1$. Indeed, the average savings in sampling relative to the Neyman–Pearson test with the same error probabilities are 50% or more in many cases (see [a2] for details).
In most realistic situations the hypotheses to be tested are composite, the probability distributions are parametrized by some parameter $\theta$, and one is interested in testing $H_0 : \theta \le \theta_0$ versus $H_1 : \theta \ge \theta_1$. In such a case one can perform the sequential probability ratio test for the simple hypotheses $H_0 : \theta = \theta_0$ versus $H_1 : \theta = \theta_1$. In the most important cases one can apply the fundamental identity of sequential analysis (see [a2] for details) to find the approximate power functions and average sample number functions for such procedures. However, even when such tests achieve specified error probabilities for all values of the parameter, the Wald–Wolfowitz optimality will not carry over to values of $\theta$ other than $\theta_0$ and $\theta_1$. Indeed, the expected sample size may even exceed that of the corresponding fixed sample size test at some values of $\theta$ between $\theta_0$ and $\theta_1$.
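As a concrete instance (standard, though not spelled out in the article), for normally distributed observations with known variance $\sigma^2$ the log-likelihood-ratio increment is linear in the observation:
$$ Z_i = \log\frac{f_{\theta_1}(X_i)}{f_{\theta_0}(X_i)} = \frac{\theta_1-\theta_0}{\sigma^2}\left(X_i - \frac{\theta_0+\theta_1}{2}\right), $$
so that $\E_\theta(Z_i) = \frac{\theta_1-\theta_0}{\sigma^2}\bigl(\theta - \frac{\theta_0+\theta_1}{2}\bigr)$. This drift is negative below the midpoint of $\theta_0$ and $\theta_1$ and positive above it; near the midpoint it is close to zero, which is why the expected sample size can become large for values of $\theta$ between $\theta_0$ and $\theta_1$.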
It is because of this phenomenon that J. Kiefer and L. Weiss [a4] raised the question of finding the sequential test with given error probabilities at $\theta_0$ and $\theta_1$ that minimizes the maximum expected sample size over all $\theta$. To solve this problem, G. Lorden [a3] introduced a test based on performing two one-sided sequential probability ratio tests simultaneously. First, a third value $\theta^*$ is chosen. Test one is then a sequential probability ratio test of $H_0 : \theta = \theta_0$ versus $H_1 : \theta = \theta^*$ with lower boundary $A = 0$, and test two is a sequential probability ratio test of $H_0 : \theta = \theta_1$ versus $H_1 : \theta = \theta^*$, again with $A = 0$. The $2$-sequential probability ratio test stops and makes its decision as soon as either of the individual tests stops: it accepts $H_1$ if test one stops first and accepts $H_0$ if test two stops first. It can be shown that for a proper choice of $\theta^*$ this test asymptotically solves the Kiefer–Weiss problem (see [a1] for a simple way to select $\theta^*$).
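A minimal Python sketch of this $2$-sequential probability ratio test follows. It assumes the user supplies the two log-likelihood-ratio increment functions and the two upper boundaries `B0` and `B1`; these names are hypothetical, and choosing the boundaries to meet given error probabilities is a separate calculation not shown here.

```python
import math

def two_sprt(sample, log_lr0, log_lr1, B0, B1, max_n=10**6):
    """Lorden's 2-SPRT: two one-sided SPRTs run simultaneously, each with A = 0.

    sample  -- callable returning the next observation
    log_lr0 -- callable: log f_{theta*}(x) / f_{theta0}(x)  (test one increment)
    log_lr1 -- callable: log f_{theta*}(x) / f_{theta1}(x)  (test two increment)
    B0, B1  -- upper stopping boundaries of the two one-sided tests
    Returns ('H1' or 'H0', sample size).
    """
    s0 = s1 = 0.0
    for n in range(1, max_n + 1):
        x = sample()
        s0 += log_lr0(x)        # accumulated evidence against theta0
        s1 += log_lr1(x)        # accumulated evidence against theta1
        if s0 >= math.log(B0):
            return 'H1', n      # test one stopped first: reject H0
        if s1 >= math.log(B1):
            return 'H0', n      # test two stopped first: reject H1
    raise RuntimeError('no decision within max_n observations')
```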
The sequential probability ratio test can also be applied in situations where the observations are not independent and identically distributed. In such a situation the likelihood ratio $\lambda_n$ can be more difficult to compute at each stage. The inequalities (a2) continue to hold for such a sequential probability ratio test, but the formulas (a3) for the average sample number are no longer valid. The sequential probability ratio test has also been studied in the case where the observations form a continuous-time stochastic process. In fact, parts of the theory simplify in this setting, since the likelihood ratio is often a process with continuous sample paths and therefore always hits the boundary when the test stops.
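As a standard illustration of the continuous-time case (an addition beyond the article's text), suppose one observes $X_t = \mu t + \sigma W_t$, where $W$ is a standard Wiener process, and tests drift $\mu_0$ against $\mu_1$ with $\sigma^2$ known. The log-likelihood-ratio process is
$$ \log\lambda_t = \frac{\mu_1-\mu_0}{\sigma^2}\,X_t - \frac{\mu_1^2-\mu_0^2}{2\sigma^2}\,t, $$
which has continuous sample paths, so $\lambda_N$ equals $A$ or $B$ exactly at the stopping time and the relations in (a2) hold with equality.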
References
[a1] B. Eisenberg, "The asymptotic solution of the Kiefer–Weiss problem", Sequential Anal., 1 (1982), pp. 81–88
[a2] B.K. Ghosh, "Sequential tests of statistical hypotheses", Addison-Wesley (1970)
[a3] G. Lorden, "$2$-SPRT's and the modified Kiefer–Weiss problem of minimizing an expected sample size", Ann. Statist., 4 (1976), pp. 281–291
[a4] L. Weiss, "On sequential tests which minimize the maximum expected sample size", J. Amer. Statist. Assoc., 57 (1962), pp. 551–566
[a5] A. Wald, "Sequential analysis", Wiley (1947)
[a6] A. Wald, J. Wolfowitz, "Optimum character of the sequential probability ratio test", Ann. Math. Stat., 19 (1948), pp. 326–339
Sequential probability ratio test. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Sequential_probability_ratio_test&oldid=12721