
Tolerance intervals

From Encyclopedia of Mathematics


Random intervals, constructed from independent identically-distributed random variables with unknown distribution function $ F ( x) $, that contain with given probability $ \gamma $ at least a proportion $ p $ ($ 0 < p < 1 $) of the probability measure $ dF $.

Let $ X _ {1} \dots X _ {n} $ be independent and identically-distributed random variables with unknown distribution function $ F ( x) $, and let $ T _ {1} = T _ {1} ( X _ {1} \dots X _ {n} ) $, $ T _ {2} = T _ {2} ( X _ {1} \dots X _ {n} ) $ be statistics such that, for a number $ p $ ($ 0 < p < 1 $) fixed in advance, the event $ \{ F ( T _ {2} ) - F ( T _ {1} ) \geq p \} $ has a given probability $ \gamma $, that is,

$$ \tag{1 } {\mathsf P} \left \{ \int\limits _ {T _ {1} } ^ { T _ {2} } dF ( x) \geq p \right \} = \gamma . $$

In this case the random interval $ ( T _ {1} , T _ {2} ) $ is called a $ \gamma $- tolerance interval for the distribution function $ F ( x) $, its end points $ T _ {1} $ and $ T _ {2} $ are called tolerance bounds, and the probability $ \gamma $ is called a confidence coefficient. It follows from (1) that the one-sided tolerance bounds $ T _ {1} $ and $ T _ {2} $ (i.e. with $ T _ {2} = + \infty $, respectively $ T _ {1} = - \infty $) are the usual one-sided confidence bounds with confidence coefficient $ \gamma $ for the quantiles $ x _ {1 - p } = F ^ { - 1 } ( 1 - p) $ and $ x _ {p} = F ^ { - 1 } ( p) $, respectively, that is,

$$ {\mathsf P} \{ x _ {1 - p } \in [ T _ {1} , + \infty ) \} = \gamma , $$

$$ {\mathsf P} \{ x _ {p} \in (- \infty , T _ {2} ] \} = \gamma . $$
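This connection between one-sided tolerance bounds and quantile confidence bounds is distribution-free, so it can be illustrated by simulation with $ F $ the uniform law on $ ( 0, 1) $, for which $ x _ {p} = p $. The sketch below (not part of the original article; the sample size, seed and replication count are arbitrary illustration choices) takes the sample maximum as the upper bound $ T _ {2} $ and checks that it covers $ x _ {p} $ with probability $ 1 - p ^ {n} $:

```python
# Monte Carlo illustration (a sketch): the one-sided bound
# T2 = max(X_1, ..., X_n) covers the quantile x_p with probability
#   P{x_p <= T2} = P{F(T2) >= p} = 1 - p^n,
# whatever F is.  We check this with F = U(0, 1), where x_p = p.
import random

random.seed(0)
n, p, n_rep = 20, 0.9, 20000          # sample size, proportion, replications
hits = 0
for _ in range(n_rep):
    t2 = max(random.random() for _ in range(n))   # T2 = X_(nn)
    hits += (t2 >= p)                             # did T2 cover x_p = p?
freq = hits / n_rep
print(freq, 1 - p ** n)               # empirical vs. exact 1 - 0.9**20 ≈ 0.878
```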

Example. Let $ X _ {1} \dots X _ {n} $ be independent random variables having a normal distribution $ N ( a, \sigma ^ {2} ) $ with unknown parameters $ a $ and $ \sigma ^ {2} $. In this case it is natural to take the tolerance bounds $ T _ {1} $ and $ T _ {2} $ to be functions of the sufficient statistic $ ( \overline{X} , S ^ {2} ) $, where

$$ \overline{X} = \ { \frac{X _ {1} + \dots + X _ {n} }{n} } ,\ \ S ^ {2} = \ { \frac{1}{n - 1 } } \sum _ {i = 1 } ^ { n } ( X _ {i} - \overline{X} ) ^ {2} . $$

Specifically, one takes $ T _ {1} = \overline{X} - kS $ and $ T _ {2} = \overline{X} + kS $, where the constant $ k $, called the tolerance multiplier, is obtained as the solution to the equation

$$ {\mathsf P} \left \{ \Phi \left ( \frac{\overline{X} + kS - a }{\sigma } \right ) - \Phi \left ( \frac{\overline{X} - kS - a }{\sigma } \right ) \geq p \right \} = \gamma , $$

where $ \Phi ( x) $ is the distribution function of the standard normal law; moreover, $ k = k ( n, \gamma , p) $ does not depend on the unknown parameters $ a $ and $ \sigma ^ {2} $. The tolerance interval constructed in this way has the following property: with confidence probability $ \gamma $ the interval $ ( \overline{X} - kS , \overline{X} + kS ) $ contains at least a proportion $ p $ of the probability mass of the normal distribution of the variables $ X _ {1} \dots X _ {n} $.
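The multiplier $ k ( n, \gamma , p) $ has no closed form and is usually taken from tables or computed numerically. As a sketch (not part of the original article), the widely used closed-form approximation due to Howe, $ k \approx z _ {( 1 + p) / 2 } \sqrt {( n - 1) ( 1 + 1 / n) / \chi _ {1 - \gamma ; n - 1 } ^ {2} } $, can be evaluated as follows:

```python
# Approximate tolerance multiplier k(n, gamma, p) for the normal model
# via Howe's approximation (an assumption of this sketch; exact values
# require tables or numerical root-finding):
#   k ~ z_{(1+p)/2} * sqrt( (n - 1) * (1 + 1/n) / chi2_{1-gamma; n-1} )
from math import sqrt
from scipy.stats import norm, chi2

def tolerance_factor(n, gamma, p):
    z = norm.ppf((1 + p) / 2)          # standard normal quantile
    c = chi2.ppf(1 - gamma, n - 1)     # lower chi-square quantile
    return z * sqrt((n - 1) * (1 + 1 / n) / c)

k = tolerance_factor(20, 0.95, 0.95)
print(round(k, 3))    # about 2.75; the tabulated exact value is 2.752
```

For $ n = 20 $, $ \gamma = p = 0.95 $ the approximation agrees with the tabulated value to three figures; the approximation deteriorates somewhat for very small $ n $.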

Assuming the existence of a probability density function $ f ( x) = F ^ { \prime } ( x) $, the probability of the event $ \{ F ( T _ {2} ) - F ( T _ {1} ) \geq p \} $ is independent of $ F ( x) $ if and only if $ T _ {1} $ and $ T _ {2} $ are order statistics (cf. Order statistic). Precisely this fact is the basis of a general method for constructing non-parametric, or distribution-free, tolerance intervals. Let $ X ^ {(*)} = ( X _ {( n1)} \dots X _ {( nn)} ) $ be the vector of order statistics constructed from the sample $ X _ {1} \dots X _ {n} $ and let

$$ T _ {1} = X _ {( nr)} ,\ \ T _ {2} = X _ {( ns)} ,\ \ 1 \leq r < s \leq n. $$

Since the random variable $ F ( X _ {( ns)} ) - F ( X _ {( nr)} ) $ has the beta-distribution with parameters $ s - r $ and $ n - s + r + 1 $, the probability of the event $ \{ F ( X _ {( ns)} ) - F ( X _ {( nr)} ) \geq p \} $ can be calculated as the integral $ I _ {1 - p } ( n - s + r + 1, s - r) $, where $ I _ {x} ( a, b) $ is the incomplete beta-function, and hence in this case instead of (1) one obtains the relation
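Because this beta law does not depend on $ F $, it can be checked by simulating from the uniform distribution, where $ F ( X _ {( ns)} ) - F ( X _ {( nr)} ) $ is simply $ U _ {( s)} - U _ {( r)} $. A minimal Monte Carlo sketch (parameters chosen only for illustration) compares the empirical mean of the coverage with the beta mean $ ( s - r) / ( n + 1) $:

```python
# Monte Carlo check (a sketch): for F = U(0, 1) the coverage
# W = F(X_(ns)) - F(X_(nr)) = U_(s) - U_(r) follows the
# beta-distribution with parameters s - r and n - s + r + 1,
# whose mean is (s - r) / (n + 1).
import random

random.seed(1)
n, r, s, n_rep = 10, 2, 9, 20000
total = 0.0
for _ in range(n_rep):
    u = sorted(random.random() for _ in range(n))
    total += u[s - 1] - u[r - 1]      # order statistics U_(s), U_(r)
mean_w = total / n_rep
print(mean_w, (s - r) / (n + 1))      # both close to 7/11 ≈ 0.636
```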

$$ \tag{2 } I _ {1 - p } ( n - s + r + 1, s - r) = \gamma , $$

which allows one, for given $ \gamma $, $ p $ and $ n $, to determine indices $ r $ and $ s $ such that the order statistics $ X _ {( nr)} $ and $ X _ {( ns)} $ are the tolerance bounds of the desired tolerance interval. Conversely, for given $ \gamma $, $ p $, $ r $ and $ s $, relation (2) allows one to determine the sample size $ n $ necessary for (2) to hold. There are statistical tables available for solving such problems.
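In place of tables, relation (2) can be solved numerically. The sketch below (an illustration, not part of the original article) uses the beta upper-tail identity $ I _ {1 - p } ( n - s + r + 1, s - r) = 1 - I _ {p} ( s - r, n - s + r + 1) $ and finds the smallest $ n $ for which the interval between the sample minimum and maximum ($ r = 1 $, $ s = n $) is a $ ( 0.95, 0.95) $-tolerance interval:

```python
# Numerical counterpart of the tables mentioned above (a sketch):
# P{F(X_(ns)) - F(X_(nr)) >= p} = 1 - I_p(s - r, n - s + r + 1)
# is the upper tail of a beta-distribution.  For r = 1, s = n
# (sample minimum and maximum) we find the smallest n with tail >= gamma.
from scipy.stats import beta

def coverage_prob(n, r, s, p):
    # P{F(X_(ns)) - F(X_(nr)) >= p} for the interval (X_(nr), X_(ns))
    return beta.sf(p, s - r, n - s + r + 1)

p, gamma = 0.95, 0.95
n = 2
while coverage_prob(n, 1, n, p) < gamma:
    n += 1
print(n)    # 93: the classical sample size for a (0.95, 0.95) interval
```

Since $ n $ is an integer, (2) holds with equality only approximately; the loop returns the smallest $ n $ for which the probability is at least $ \gamma $.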

References

[1] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics" , Libr. math. tables , 46 , Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[2] S.S. Wilks, "Mathematical statistics" , Wiley (1962)
[3] H.A. David, "Order statistics" , Wiley (1981)
[4] R.B. Murphy, "Non-parametric tolerance limits" Ann. Math. Stat. , 19 (1948) pp. 581–589
[5] P.N. Somerville, "Tables for obtaining non-parametric tolerance limits" Ann. Math. Stat. , 29 (1958) pp. 599–601
[6] H. Scheffé, J.W. Tukey, "Non-parametric estimation I. Validation of order statistics" Ann. Math. Stat. , 16 (1945) pp. 187–192
[7] D.A.S. Fraser, "Nonparametric methods in statistics" , Wiley (1957)
[8] A. Wald, J. Wolfowitz, "Tolerance limits for a normal distribution" Ann. Math. Stat. , 17 (1946) pp. 208–215
[9] H. Robbins, "On distribution-free tolerance limits in random sampling" Ann. Math. Stat. , 15 (1944) pp. 214–216
How to Cite This Entry:
Tolerance intervals. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Tolerance_intervals&oldid=51752
This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article