# Tolerance intervals

Random intervals, constructed for independent identically-distributed random variables with unknown distribution function $F ( x)$, containing with given probability $\gamma$ at least a proportion $p$ ($0 < p < 1$) of the probability measure $dF$.

Let $X _ {1} \dots X _ {n}$ be independent and identically-distributed random variables with unknown distribution function $F ( x)$, and let $T _ {1} = T _ {1} ( X _ {1} \dots X _ {n} )$, $T _ {2} = T _ {2} ( X _ {1} \dots X _ {n} )$ be statistics such that, for a number $p$ ($0 < p < 1$) fixed in advance, the event $\{ F ( T _ {2} ) - F ( T _ {1} ) \geq p \}$ has a given probability $\gamma$, that is,

$$\tag{1 } {\mathsf P} \left \{ \int\limits _ { T _ {1} } ^ { {T _ 2 } } dF ( x) \geq p \right \} = \gamma .$$

In this case the random interval $( T _ {1} , T _ {2} )$ is called a $\gamma$-tolerance interval for the distribution function $F ( x)$, its end points $T _ {1}$ and $T _ {2}$ are called tolerance bounds, and the probability $\gamma$ is called a confidence coefficient. It follows from (1) that the one-sided tolerance bounds $T _ {1}$ and $T _ {2}$ (i.e. with $T _ {2} = + \infty$, respectively $T _ {1} = - \infty$) are the usual one-sided confidence bounds with confidence coefficient $\gamma$ for the quantiles $x _ {1 - p } = F ^ { - 1 } ( 1 - p)$ and $x _ {p} = F ^ { - 1 } ( p)$, respectively, that is,

$${\mathsf P} \{ x _ {1 - p } \in [ T _ {1} , + \infty ) \} = \gamma ,$$

$${\mathsf P} \{ x _ {p} \in (- \infty , T _ {2} ] \} = \gamma .$$
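
The step from (1) to these two relations can be spelled out as follows. This is only a sketch, and it assumes for simplicity that $F$ is continuous and strictly increasing (an assumption not made in the text above), so that the quantiles are uniquely defined. With $T _ {2} = + \infty$,

$$\int\limits _ { T _ {1} } ^ { + \infty } dF ( x) = 1 - F ( T _ {1} ) \geq p \iff F ( T _ {1} ) \leq 1 - p \iff T _ {1} \leq x _ {1 - p } ,$$

and with $T _ {1} = - \infty$,

$$\int\limits _ { - \infty } ^ { T _ {2} } dF ( x) = F ( T _ {2} ) \geq p \iff T _ {2} \geq x _ {p} .$$

Under this assumption, each event in (1) coincides with the event that the corresponding quantile lies in the one-sided interval.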

Example. Let $X _ {1} \dots X _ {n}$ be independent random variables having a normal distribution $N ( a, \sigma ^ {2} )$ with unknown parameters $a$ and $\sigma ^ {2}$. In this case it is natural to take the tolerance bounds $T _ {1}$ and $T _ {2}$ to be functions of the sufficient statistic $( \overline{X}\; , S ^ {2} )$, where

$$\overline{X}\; = \ { \frac{X _ {1} + \dots + X _ {n} }{n} } ,\ \ S ^ {2} = \ { \frac{1}{n - 1 } } \sum _ {i = 1 } ^ { n } ( X _ {i} - \overline{X}\; ) ^ {2} .$$

Specifically, one takes $T _ {1} = \overline{X}\; - kS$ and $T _ {2} = \overline{X}\; + kS$, where $S = \sqrt {S ^ {2} }$ is the sample standard deviation and the constant $k$, called the tolerance multiplier, is obtained as the solution to the equation

$${\mathsf P} \left \{ \Phi \left ( { \frac{\overline{X}\; + kS - a } \sigma } \right ) - \Phi \left ( { \frac{\overline{X}\; - kS - a } \sigma } \right ) \geq p \right \} = \gamma ,$$

where $\Phi ( x)$ is the distribution function of the standard normal law; moreover, $k = k ( n, \gamma , p)$ does not depend on the unknown parameters $a$ and $\sigma ^ {2}$. The tolerance interval constructed in this way has the following property: with probability $\gamma$ the interval $( \overline{X}\; - kS , \overline{X}\; + kS )$ contains at least a proportion $p$ of the probability mass of the normal distribution of the variables $X _ {1} \dots X _ {n}$.
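
The defining equation for $k$ has no closed-form solution, but because $k$ does not depend on $a$ and $\sigma ^ {2}$ it can be approximated by simulating from $N ( 0, 1)$ alone. The following Python sketch is one possible Monte Carlo approach and is not the method behind the published tables. For each simulated sample it finds the smallest multiplier for which the interval covers a proportion $p$, and then takes the empirical $\gamma$-quantile of those values. The function names, the number of replications `n_sim`, the bisection bound `hi` and the seed are all illustrative assumptions.

```python
import random
from statistics import NormalDist, fmean, stdev

PHI = NormalDist()  # standard normal law, used for its distribution function


def minimal_multiplier(xbar, s, p, hi=50.0, iters=40):
    """Smallest k with Phi(xbar + k*s) - Phi(xbar - k*s) >= p, by bisection.

    The coverage is increasing in k, so bisection on [0, hi] applies;
    hi = 50 is an assumed upper bound that is ample for moderate n.
    """
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if PHI.cdf(xbar + mid * s) - PHI.cdf(xbar - mid * s) >= p:
            hi = mid
        else:
            lo = mid
    return hi


def tolerance_multiplier(n, gamma, p, n_sim=4000, seed=0):
    """Monte Carlo approximation of k(n, gamma, p) for a normal sample.

    Samples are drawn from N(0, 1) only, which suffices because k does
    not depend on the unknown mean and variance.
    """
    rng = random.Random(seed)
    ks = []
    for _ in range(n_sim):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ks.append(minimal_multiplier(fmean(sample), stdev(sample), p))
    ks.sort()
    # The interval covers at least p exactly when k >= the sample's minimal
    # multiplier, so the required k is the gamma-quantile of those values.
    return ks[min(n_sim - 1, int(gamma * n_sim))]


if __name__ == "__main__":
    print(tolerance_multiplier(n=10, gamma=0.95, p=0.90))
```

For $n = 10$, $\gamma = 0.95$, $p = 0.90$ the result should land close to the tabulated value of about $2.84$, up to simulation error; increasing `n_sim` tightens the approximation at the cost of running time.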

Assuming the existence of a probability density function $f ( x) = F ^ { \prime } ( x)$, the probability of the event $\{ F ( T _ {2} ) - F ( T _ {1} ) \geq p \}$ is independent of $F ( x)$ if and only if $T _ {1}$ and $T _ {2}$ are order statistics (cf. Order statistic). Precisely this fact is the basis of a general method for constructing non-parametric, or distribution-free, tolerance intervals. Let $X ^ {(*)} = ( X _ {( n1)} \dots X _ {( nn)} )$ be the vector of order statistics constructed from the sample $X _ {1} \dots X _ {n}$ and let

$$T _ {1} = X _ {( nr)} ,\ \ T _ {2} = X _ {( ns)} ,\ \ 1 \leq r < s \leq n.$$

Since the random variable $F ( X _ {( ns)} ) - F ( X _ {( nr)} )$ has the beta-distribution with parameters $s - r$ and $n - s + r + 1$, the probability of the event $\{ F ( X _ {( ns)} ) - F ( X _ {( nr)} ) \geq p \}$ can be calculated as the integral $I _ {1 - p } ( n - s + r + 1, s - r)$, where $I _ {x} ( a, b)$ is the incomplete beta-function, and hence in this case instead of (1) one obtains the relation

$$\tag{2 } I _ {1 - p } ( n - s + r + 1, s - r) = \gamma ,$$

which allows one, for given $\gamma$, $p$ and $n$, to determine numbers $r$ and $s$ so that the order statistics $X _ {( nr)}$ and $X _ {( ns)}$ are the tolerance bounds of the desired tolerance interval. Moreover, for given $\gamma$, $p$, $r$, relation (2) allows one to determine the sample size $n$ necessary for (2) to hold. Since $r$, $s$ and $n$ are integers, equality in (2) can in general not be attained exactly, and in practice one requires the left-hand side to be at least $\gamma$. There are statistical tables available for solving such problems.
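
Relation (2) is also easy to evaluate numerically. The Python sketch below is illustrative and is not taken from the cited tables. It uses the standard identity between the incomplete beta-function with integer parameters and a binomial tail, $I _ {1 - p } ( n - s + r + 1, s - r) = {\mathsf P} \{ B \leq s - r - 1 \}$ with $B$ binomial $( n, p)$, so that only the standard library is needed. The function names, the choice $r = 1$, $s = n$ in `minimal_sample_size`, and the simulation settings are assumptions made for the example. The last function checks by simulation that the confidence coefficient does not depend on $F$.

```python
import random
from math import comb, exp


def confidence(n, r, s, p):
    """Left-hand side of (2), computed as P{Bin(n, p) <= s - r - 1}."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(s - r))


def minimal_sample_size(p, gamma):
    """Smallest n for which (X_(n1), X_(nn)) has confidence at least gamma."""
    n = 2
    while confidence(n, 1, n, p) < gamma:
        n += 1
    return n


def simulated_confidence(draw, cdf, n, r, s, p, n_sim=4000, seed=1):
    """Monte Carlo estimate of P{F(X_(ns)) - F(X_(nr)) >= p} for a known F.

    `draw(rng)` returns one observation and `cdf` is its distribution
    function; both are supplied by the caller.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        xs = sorted(draw(rng) for _ in range(n))
        hits += cdf(xs[s - 1]) - cdf(xs[r - 1]) >= p
    return hits / n_sim


if __name__ == "__main__":
    print(minimal_sample_size(p=0.95, gamma=0.95))
    print(confidence(20, 1, 20, 0.8))
    # Two different distributions should give roughly the same estimate.
    print(simulated_confidence(lambda g: g.random(), lambda x: x, 20, 1, 20, 0.8))
    print(simulated_confidence(lambda g: g.expovariate(1.0),
                               lambda x: 1 - exp(-x), 20, 1, 20, 0.8))
```

With $r = 1$ and $s = n$ the formula reduces to $1 - p ^ {n} - n p ^ {n - 1 } ( 1 - p)$, and the search returns the smallest $n$ for which this expression reaches $\gamma$.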

#### References

[1] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics", Libr. math. tables, 46, Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)

[2] S.S. Wilks, "Mathematical statistics", Wiley (1962)

[3] H.A. David, "Order statistics", Wiley (1981)

[4] R.B. Murphy, "Non-parametric tolerance limits", Ann. Math. Stat., 19 (1948) pp. 581–589

[5] P.N. Somerville, "Tables for obtaining non-parametric tolerance limits", Ann. Math. Stat., 29 (1958) pp. 599–601

[6] H. Scheffé, J.W. Tukey, "Non-parametric estimation I. Validation of order statistics", Ann. Math. Stat., 16 (1945) pp. 187–192

[7] D.A.S. Fraser, "Nonparametric methods in statistics", Wiley (1957)

[8] A. Wald, J. Wolfowitz, "Tolerance limits for a normal distribution", Ann. Math. Stat., 17 (1946) pp. 208–215

[9] H. Robbins, "On distribution-free tolerance limits in random sampling", Ann. Math. Stat., 15 (1944) pp. 214–216

How to Cite This Entry:
Tolerance intervals. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Tolerance_intervals&oldid=51752
This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.