
Tolerance intervals

From Encyclopedia of Mathematics

Latest revision as of 10:30, 16 July 2021


Random intervals, constructed for independent identically-distributed random variables with unknown distribution function $ F ( x) $, containing with given probability $ \gamma $ at least a proportion $ p $ ($ 0 < p < 1 $) of the probability measure $ dF $.

Let $ X _ {1} \dots X _ {n} $ be independent and identically-distributed random variables with unknown distribution function $ F ( x) $, and let $ T _ {1} = T _ {1} ( X _ {1} \dots X _ {n} ) $, $ T _ {2} = T _ {2} ( X _ {1} \dots X _ {n} ) $ be statistics such that, for a number $ p $ ($ 0 < p < 1 $) fixed in advance, the event $ \{ F ( T _ {2} ) - F ( T _ {1} ) \geq p \} $ has a given probability $ \gamma $, that is,

$$ \tag{1 } {\mathsf P} \left \{ \int\limits _ { T _ {1} } ^ { T _ {2} } dF ( x) \geq p \right \} = \gamma . $$

In this case the random interval $ ( T _ {1} , T _ {2} ) $ is called a $ \gamma $-tolerance interval for the distribution function $ F ( x) $, its end points $ T _ {1} $ and $ T _ {2} $ are called tolerance bounds, and the probability $ \gamma $ is called a confidence coefficient. It follows from (1) that the one-sided tolerance bounds $ T _ {1} $ and $ T _ {2} $ (i.e. with $ T _ {2} = + \infty $, respectively $ T _ {1} = - \infty $) are the usual one-sided confidence bounds with confidence coefficient $ \gamma $ for the quantiles $ x _ {1 - p } = F ^ { - 1 } ( 1 - p) $ and $ x _ {p} = F ^ { - 1 } ( p) $, respectively, that is,

$$ {\mathsf P} \{ x _ {1 - p } \in [ T _ {1} , + \infty ) \} = \gamma , $$

$$ {\mathsf P} \{ x _ {p} \in (- \infty , T _ {2} ] \} = \gamma . $$

Example. Let $ X _ {1} \dots X _ {n} $ be independent random variables having a normal distribution $ N ( a, \sigma ^ {2} ) $ with unknown parameters $ a $ and $ \sigma ^ {2} $. In this case it is natural to take the tolerance bounds $ T _ {1} $ and $ T _ {2} $ to be functions of the sufficient statistic $ ( \overline{X} , S ^ {2} ) $, where

$$ \overline{X} = \frac{X _ {1} + \dots + X _ {n} }{n} ,\ \ S ^ {2} = \frac{1}{n - 1 } \sum _ {i = 1 } ^ { n } ( X _ {i} - \overline{X} ) ^ {2} . $$

Specifically, one takes $ T _ {1} = \overline{X} - kS $ and $ T _ {2} = \overline{X} + kS $, where the constant $ k $, called the tolerance multiplier, is obtained as the solution to the equation

$$ {\mathsf P} \left \{ \Phi \left ( \frac{\overline{X} + kS - a }{\sigma} \right ) - \Phi \left ( \frac{\overline{X} - kS - a }{\sigma} \right ) \geq p \right \} = \gamma , $$

where $ \Phi ( x) $ is the distribution function of the standard normal law; moreover, $ k = k ( n, \gamma , p) $ does not depend on the unknown parameters $ a $ and $ \sigma ^ {2} $. The tolerance interval constructed in this way satisfies the following property: With confidence probability $ \gamma $ the interval $ ( \overline{X} - kS , \overline{X} + kS ) $ contains at least a proportion $ p $ of the probability mass of the normal distribution of the variables $ X _ {1} \dots X _ {n} $.
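Since $ k ( n, \gamma , p) $ does not depend on $ a $ and $ \sigma ^ {2} $, it can be estimated by simulating from $ N ( 0, 1) $. The following sketch (the function names are illustrative, not from the original article) finds, for each simulated sample, the smallest $ k $ whose interval captures proportion $ p $ of the normal mass, and takes the empirical $ \gamma $-quantile over many samples:

```python
import math
import random

def normal_cdf(x):
    # Phi(x) computed via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def content(xbar, s, k):
    # proportion of N(0,1) probability mass in (xbar - k*s, xbar + k*s)
    return normal_cdf(xbar + k * s) - normal_cdf(xbar - k * s)

def minimal_k(xbar, s, p, lo=0.0, hi=50.0):
    # smallest k with content >= p, found by bisection
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if content(xbar, s, mid) >= p:
            hi = mid
        else:
            lo = mid
    return hi

def tolerance_factor(n, p, gamma, reps=20000, seed=1):
    # Monte Carlo estimate of k(n, gamma, p): simulate samples from
    # N(0, 1) and take the gamma-quantile of the per-sample minimal k
    rng = random.Random(seed)
    ks = []
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(xs) / n
        s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
        ks.append(minimal_k(xbar, math.sqrt(s2), p))
    ks.sort()
    return ks[int(gamma * reps)]

k = tolerance_factor(n=20, p=0.90, gamma=0.95)
```

For $ n = 20 $, $ p = 0.90 $, $ \gamma = 0.95 $ the estimate should land near the tabulated tolerance multiplier of roughly $ 2.3 $; in practice one would use published tables or a dedicated statistics library rather than this simulation.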

Assuming the existence of a probability density function $ f ( x) = F ^ { \prime } ( x) $, the probability of the event $ \{ F ( T _ {2} ) - F ( T _ {1} ) \geq p \} $ is independent of $ F ( x) $ if and only if $ T _ {1} $ and $ T _ {2} $ are order statistics (cf. Order statistic). Precisely this fact is the basis of a general method for constructing non-parametric, or distribution-free, tolerance intervals. Let $ X ^ {(*)} = ( X _ {( n1)} \dots X _ {( nn)} ) $ be the vector of order statistics constructed from the sample $ X _ {1} \dots X _ {n} $ and let

$$ T _ {1} = X _ {( nr)} ,\ \ T _ {2} = X _ {( ns)} ,\ \ 1 \leq r < s \leq n. $$

Since the random variable $ F ( X _ {( ns)} ) - F ( X _ {( nr)} ) $ has the beta-distribution with parameters $ s - r $ and $ n - s + r + 1 $, the probability of the event $ \{ F ( X _ {( ns)} ) - F ( X _ {( nr)} ) \geq p \} $ can be calculated as the integral $ I _ {1 - p } ( n - s + r + 1, s - r) $, where $ I _ {x} ( a, b) $ is the incomplete beta-function, and hence in this case instead of (1) one obtains the relation

$$ \tag{2 } I _ {1 - p } ( n - s + r + 1, s - r) = \gamma , $$

which allows one, for given $ \gamma $, $ p $ and $ n $, to define numbers $ r $ and $ s $ so that the order statistics $ X _ {( nr)} $ and $ X _ {( ns)} $ are the tolerance bounds of the desired tolerance interval. Moreover, for given $ \gamma $, $ p $, $ r $, relation (2) allows one to determine the size $ n $ of the collection $ X _ {1} \dots X _ {n} $ necessary for the relation (2) to hold. There are statistical tables available for solving such problems.
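For the extreme order statistics $ r = 1 $, $ s = n $, relation (2) reduces to $ I _ {1 - p} ( 2, n - 1) = \gamma $, which for integer parameters has the closed form used below. This sketch (function names illustrative) finds the smallest sample size $ n $ for which $ ( X _ {( n1)} , X _ {( nn)} ) $ is a $ \gamma $-tolerance interval:

```python
def confidence(n, p):
    # gamma in relation (2) for r = 1, s = n: the coverage
    # F(X_(nn)) - F(X_(n1)) has the Beta(n - 1, 2) distribution, and
    # P{coverage >= p} = I_{1-p}(2, n - 1) = 1 - n*p^(n-1) + (n-1)*p^n
    return 1.0 - n * p ** (n - 1) + (n - 1) * p ** n

def smallest_n(p, gamma):
    # smallest n for which the sample range is a gamma-tolerance
    # interval covering at least proportion p
    n = 2
    while confidence(n, p) < gamma:
        n += 1
    return n

n = smallest_n(p=0.95, gamma=0.95)
```

For $ p = \gamma = 0.95 $ this yields the classical distribution-free sample size $ n = 93 $.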

References

[1] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics" , Libr. math. tables , 46 , Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[2] S.S. Wilks, "Mathematical statistics" , Wiley (1962)
[3] H.A. David, "Order statistics" , Wiley (1981)
[4] R.B. Murphy, "Non-parametric tolerance limits" Ann. Math. Stat. , 19 (1948) pp. 581–589
[5] P.N. Somerville, "Tables for obtaining non-parametric tolerance limits" Ann. Math. Stat. , 29 (1958) pp. 599–601
[6] H. Scheffé, J.W. Tukey, "Non-parametric estimation I. Validation of order statistics" Ann. Math. Stat. , 16 (1945) pp. 187–192
[7] D.A.S. Fraser, "Nonparametric methods in statistics" , Wiley (1957)
[8] A. Wald, J. Wolfowitz, "Tolerance limits for a normal distribution" Ann. Math. Stat. , 17 (1946) pp. 208–215
[9] H. Robbins, "On distribution-free tolerance limits in random sampling" Ann. Math. Stat. , 15 (1944) pp. 214–216
How to Cite This Entry:
Tolerance intervals. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Tolerance_intervals&oldid=18366
This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article