Average sample number

From Encyclopedia of Mathematics
Revision as of 17:17, 7 February 2011 (Importing text file)


A term occurring in the area of statistics called sequential analysis. In sequential procedures for statistical estimation, hypothesis testing or decision theory, the number of observations taken (or sample size) is not pre-determined, but depends on the observations themselves. The expected value of the sample size of such a procedure is called the average sample number. This depends on the underlying distribution of the observations, which is often unknown. If the distribution is determined by the value of some parameter, then the average sample number becomes a function of that parameter.

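If $N$ denotes the sample size and $\theta$ is the unknown parameter, then the average sample number is given by

$$E_\theta(N) = \sum_{n=1}^{\infty} n\,P_\theta(N = n) = \sum_{n=0}^{\infty} P_\theta(N > n). \tag{a1}$$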

Some aspects of the average sample number in the case where the observations are independent Bernoulli random variables $X_i$ with $E_\theta(X_i) = P_\theta(X_i = 1) = \theta = 1 - P_\theta(X_i = 0)$ are given below.

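Consider the case of the curtailed version of a test of the hypotheses $H_0 : \theta = 0$ versus $H_1 : \theta > 0$ that takes $n$ observations and decides $H_1$ if $X_1 + \dots + X_n > 0$ (cf. also Statistical hypotheses, verification of). The curtailed version of this test is sequential and stops the first time that $X_k = 1$ or at time $n$, whichever comes first. Then the average sample number is given by

$$E_\theta(N) = \sum_{k=0}^{n-1} P_\theta(N > k) = \sum_{k=0}^{n-1} (1-\theta)^k = \frac{1 - (1-\theta)^n}{\theta} \quad \text{for } \theta > 0. \tag{a2}$$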

This formula is a special case of Wald's lemma (see [a3] and Wald identity), which is very useful in finding or approximating average sample numbers. Wald's lemma states that if $Y_1, Y_2, \dots$ are independent random variables with common expected value $E(Y_1)$ (cf. also Random variable) and $N$ is a stopping time (i.e., the event $N = k$ is determined by the observations $Y_1, \dots, Y_k$) with finite expected value, then letting $S_n = Y_1 + \dots + Y_n$,

$$E(S_N) = E(Y_1)\,E(N). \tag{a3}$$

Thus, for $E(Y_1) \neq 0$, one has $E(N) = E(S_N)/E(Y_1)$.

In this example, $E_\theta(X_1) = \theta$, $S_N = 0$ or $1$, and $E_\theta(S_N) = P_\theta(S_N = 1) = 1 - (1-\theta)^n$. The average sample number then follows from (a3) and agrees with (a2). See [a1] for asymptotic properties of the average sample number for curtailed tests in general.
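As a numerical check on the curtailed-test example, the following Python sketch compares a closed-form expected sample size with its tail-sum form, with a Monte Carlo estimate, and with the Wald's lemma ratio $E(S_N)/E(Y_1)$. It assumes a test that stops at the first observation equal to 1 or at observation $n$, whichever comes first, so that the closed form is $(1-(1-\theta)^n)/\theta$. The function names, the values $\theta = 0.2$ and $n = 10$, the number of trials and the seed are illustrative choices, not part of the original article.

```python
import random


def curtailed_asn_exact(theta, n):
    """Closed form (1 - (1 - theta)**n) / theta; equals n when theta == 0."""
    if theta == 0:
        return float(n)
    return (1.0 - (1.0 - theta) ** n) / theta


def curtailed_asn_sum(theta, n):
    """Tail-sum form: sum over k = 0..n-1 of P(N > k) = (1 - theta)**k."""
    return sum((1.0 - theta) ** k for k in range(n))


def simulate_curtailed(theta, n, trials, seed=0):
    """Monte Carlo estimates of E(N) and E(S_N) for the assumed curtailed test."""
    rng = random.Random(seed)
    total_n = 0
    total_s = 0
    for _ in range(trials):
        for k in range(1, n + 1):
            x = 1 if rng.random() < theta else 0
            # Stop at the first success, or at time n if no success occurs.
            if x == 1 or k == n:
                total_n += k
                total_s += x  # S_N is 1 if a success was seen, else 0
                break
    return total_n / trials, total_s / trials


theta, n = 0.2, 10
exact = curtailed_asn_exact(theta, n)
tail_sum = curtailed_asn_sum(theta, n)
mean_n, mean_s = simulate_curtailed(theta, n, trials=100_000)

print("closed form     :", exact)
print("tail sum        :", tail_sum)
print("simulated E(N)  :", mean_n)
print("Wald E(S_N)/E(Y):", mean_s / theta)
```

The four printed values should agree up to simulation error; the last line uses only the simulated stopping value $S_N$ and the known mean $\theta$, which is how Wald's lemma is typically applied.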

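In testing $H_0 : \theta = p$ versus $H_1 : \theta = q = 1 - p$, the logarithm of the likelihood ratio (cf. also Likelihood-ratio test) after $n$ observations is easily seen to be of the form $S_n \log(q/p)$, where $S_n = \sum_{i=1}^{n} (2X_i - 1)$. Thus, if $p < 1/2$, the sequential probability ratio test stops the first time that $S_n = K$ and decides $H_1$ or the first time that $S_n = -J$ and decides $H_0$ for positive integers $J$ and $K$. In this case $S_n$ is a random walk taking steps to the right with probability $\theta$ and $Y_i = 2X_i - 1$ in formula (a3), so that $E_\theta(Y_i) = 2\theta - 1$. Thus, if $\theta \neq 1/2$, the average sample number is

$$E_\theta(N) = \frac{K\,P_\theta(S_N = K) - J\,P_\theta(S_N = -J)}{2\theta - 1}. \tag{a4}$$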

Well-known formulas from the theory of random walks show that $P_\theta(S_N = K) = (1 - r^J)/(1 - r^{J+K})$, where $r = (1-\theta)/\theta$.

If $\theta = 1/2$, this method fails. One must then use another result of A. Wald, stating that if $E(Y_1) = 0$, but $E(Y_1^2) = \sigma^2 < \infty$, then

$$E(S_N^2) = \sigma^2\,E(N). \tag{a5}$$

In this example, for $\theta = 1/2$, one has $\sigma^2 = 1$ and $P_\theta(S_N = K) = J/(J+K)$. Then (a5) yields $E_\theta(N) = JK$.
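The random-walk formulas can be sketched in Python as follows. The sketch assumes a walk that starts at 0, moves $+1$ with probability $\theta$ and $-1$ otherwise, and stops on first reaching $K$ or $-J$; it evaluates the gambler's-ruin hitting probability, the expected stopping time from Wald's lemma (with $JK$ used when $\theta = 1/2$), and a Monte Carlo estimate for comparison. The function names and the values $\theta = 0.4$, $J = 3$, $K = 4$ are illustrative assumptions.

```python
import random


def prob_upper(theta, J, K):
    """P_theta(S_N = K): probability the walk reaches K before -J."""
    if theta == 0.5:
        return J / (J + K)
    r = (1.0 - theta) / theta
    return (1.0 - r ** J) / (1.0 - r ** (J + K))


def sprt_asn(theta, J, K):
    """E_theta(N) via E(S_N) / E(Y_1), or J*K in the driftless case."""
    if theta == 0.5:
        return float(J * K)
    up = prob_upper(theta, J, K)
    expected_stop_value = K * up - J * (1.0 - up)  # E(S_N)
    return expected_stop_value / (2.0 * theta - 1.0)


def simulate_sprt(theta, J, K, trials, seed=0):
    """Monte Carlo estimate of E_theta(N) for the stopped random walk."""
    rng = random.Random(seed)
    total_steps = 0
    for _ in range(trials):
        s = 0
        while -J < s < K:
            s += 1 if rng.random() < theta else -1
            total_steps += 1
    return total_steps / trials


theta, J, K = 0.4, 3, 4
asn_formula = sprt_asn(theta, J, K)
asn_sim = simulate_sprt(theta, J, K, trials=50_000)

print("formula   :", asn_formula)
print("simulation:", asn_sim)
print("theta=0.5 :", sprt_asn(0.5, J, K))
```

Evaluating `sprt_asn` at values of `theta` close to 0.5 gives results close to $JK$, which is a quick way to see that the two Wald identities are consistent with each other.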

In order to decide which of two sequential tests is better, it is important to be able to express the average sample number in terms of the error probabilities of the test. Then the average sample numbers of different tests with the same error probabilities can be compared. In this example, the probability of a type-I error is $\alpha = P_p(S_N = K)$ and the probability of a type-II error is $\beta = P_q(S_N = -J)$ (cf. also Error). From this one sees that

$$K = \frac{\log\bigl((1-\beta)/\alpha\bigr)}{\log(q/p)}, \qquad J = \frac{\log\bigl((1-\alpha)/\beta\bigr)}{\log(q/p)}.$$

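In particular, it follows that if $\theta = p$, then

$$E_p(N) = \frac{\alpha \log\dfrac{1-\beta}{\alpha} + (1-\alpha) \log\dfrac{\beta}{1-\alpha}}{(p - q)\,\log(q/p)}.$$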

This formula and the analogous one with $\theta = q$ are often used to measure the efficiency of the sequential probability ratio test against other sequential tests with the same error probabilities.
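The relation between the boundaries, the error probabilities and the average sample number can be checked with a short Python sketch. It assumes the Bernoulli setting above with $q = 1 - p$, $p < 1/2$, and integer boundaries $-J$ and $K$ that are hit exactly. It computes $\alpha$ and $\beta$ from the gambler's-ruin probabilities, recovers $J$ and $K$ from them through the logarithmic relations, and evaluates $E_p(N)$ from the error probabilities alone. The function names and the values $p = 0.4$, $J = 3$, $K = 4$ are illustrative assumptions.

```python
import math


def error_probabilities(p, J, K):
    """alpha = P_p(S_N = K) and beta = P_q(S_N = -J), with q = 1 - p, p < 1/2."""
    rho = (1.0 - p) / p  # q / p, greater than 1 when p < 1/2
    denom = rho ** (J + K) - 1.0
    alpha = (rho ** J - 1.0) / denom
    beta = (rho ** K - 1.0) / denom
    return alpha, beta


def boundaries(p, alpha, beta):
    """Recover (J, K) from the error probabilities."""
    c = math.log((1.0 - p) / p)  # log(q / p)
    K = math.log((1.0 - beta) / alpha) / c
    J = math.log((1.0 - alpha) / beta) / c
    return J, K


def asn_under_h0(p, alpha, beta):
    """E_p(N) expressed through alpha and beta only."""
    q = 1.0 - p
    numerator = (alpha * math.log((1.0 - beta) / alpha)
                 + (1.0 - alpha) * math.log(beta / (1.0 - alpha)))
    return numerator / ((p - q) * math.log(q / p))


p, J, K = 0.4, 3, 4
alpha, beta = error_probabilities(p, J, K)
J_back, K_back = boundaries(p, alpha, beta)

print("alpha, beta      :", alpha, beta)
print("recovered J, K   :", J_back, K_back)
print("E_p(N) from a, b :", asn_under_h0(p, alpha, beta))
print("E_p(N) from walk :", (K * alpha - J * (1.0 - alpha)) / (2.0 * p - 1.0))
```

The last two lines should print the same value up to rounding, since the second is the Wald's-lemma expression with $P_p(S_N = K) = \alpha$.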

Similar formulas can be found for sequential probability ratio tests for sequences of independent random variables with other distributions, but in general the formulas are approximations since the tests will stop when the logarithm of the likelihood ratio first crosses the boundary rather than when it first hits the boundary. Many studies (see [a2]) consider the adequacy of approximations for the average sample number of sequential probability ratio tests and some other sequential tests whose stopping times are more complicated. In areas such as signal detection, average sample numbers are also studied for sequential tests where the observations are not independent, identically distributed random variables.


References

[a1] B. Eisenberg, B.K. Ghosh, "Curtailed and uniformly most powerful sequential tests", Ann. Statist., 8 (1980), pp. 1123–1131
[a2] D. Siegmund, "Sequential analysis: Tests and confidence intervals", Springer (1985)
[a3] A. Wald, "Sequential analysis", Wiley (1947)

How to Cite This Entry:
Average sample number. Encyclopedia of Mathematics. URL:
This article was adapted from an original article by Bennett Eisenberg (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.