
Tolerance intervals

From Encyclopedia of Mathematics
Latest revision as of 10:30, 16 July 2021


Random intervals, constructed for independent identically-distributed random variables with unknown distribution function $ F ( x) $, containing with given probability $ \gamma $ at least a proportion $ p $ ($ 0 < p < 1 $) of the probability measure $ dF $.

Let $ X _ {1} \dots X _ {n} $ be independent and identically-distributed random variables with unknown distribution function $ F ( x) $, and let $ T _ {1} = T _ {1} ( X _ {1} \dots X _ {n} ) $, $ T _ {2} = T _ {2} ( X _ {1} \dots X _ {n} ) $ be statistics such that, for a number $ p $ ($ 0 < p < 1 $) fixed in advance, the event $ \{ F ( T _ {2} ) - F ( T _ {1} ) \geq p \} $ has a given probability $ \gamma $, that is,

$$ \tag{1 } {\mathsf P} \left \{ \int\limits _ { T _ {1} } ^ { {T _ 2 } } dF ( x) \geq p \right \} = \gamma . $$

In this case the random interval $ ( T _ {1} , T _ {2} ) $ is called a $ \gamma $-tolerance interval for the distribution function $ F ( x) $, its end points $ T _ {1} $ and $ T _ {2} $ are called tolerance bounds, and the probability $ \gamma $ is called a confidence coefficient. It follows from (1) that the one-sided tolerance bounds $ T _ {1} $ and $ T _ {2} $ (i.e. with $ T _ {2} = + \infty $, respectively $ T _ {1} = - \infty $) are the usual one-sided confidence bounds with confidence coefficient $ \gamma $ for the quantiles $ x _ {1 - p } = F ^ { - 1 } ( 1 - p) $ and $ x _ {p} = F ^ { - 1 } ( p) $, respectively, that is,

$$ {\mathsf P} \{ x _ {1 - p } \in [ T _ {1} , + \infty ) \} = \gamma , $$

$$ {\mathsf P} \{ x _ {p} \in (- \infty , T _ {2} ] \} = \gamma . $$
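The step from (1) to these two relations can be written out. The following is a short sketch under the added assumption (not stated above) that $ F $ is continuous and strictly increasing, so that the quantiles are uniquely defined. Setting $ T _ {2} = + \infty $ gives $ F ( T _ {2} ) = 1 $, and the event in (1) becomes

$$ \{ 1 - F ( T _ {1} ) \geq p \} = \{ F ( T _ {1} ) \leq 1 - p \} = \{ T _ {1} \leq x _ {1 - p } \} , $$

while setting $ T _ {1} = - \infty $ gives $ F ( T _ {1} ) = 0 $, and the event becomes

$$ \{ F ( T _ {2} ) \geq p \} = \{ T _ {2} \geq x _ {p} \} . $$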

Example. Let $ X _ {1} \dots X _ {n} $ be independent random variables having a normal distribution $ N ( a, \sigma ^ {2} ) $ with unknown parameters $ a $ and $ \sigma ^ {2} $. In this case it is natural to take the tolerance bounds $ T _ {1} $ and $ T _ {2} $ to be functions of the sufficient statistic $ ( \overline{X}\; , S ^ {2} ) $, where

$$ \overline{X}\; = \ { \frac{X _ {1} + \dots + X _ {n} }{n} } ,\ \ S ^ {2} = \ { \frac{1}{n - 1 } } \sum _ {i = 1 } ^ { n } ( X _ {i} - \overline{X}\; ) ^ {2} . $$

Specifically, one takes $ T _ {1} = \overline{X}\; - kS $ and $ T _ {2} = \overline{X}\; + kS $, where $ S = \sqrt {S ^ {2} } $ and the constant $ k $, called the tolerance multiplier, is obtained as the solution to the equation

$$ {\mathsf P} \left \{ \Phi \left ( { \frac{\overline{X}\; + kS - a } \sigma } \right ) - \Phi \left ( { \frac{\overline{X}\; - kS - a } \sigma } \right ) \geq p \right \} = \gamma , $$

where $ \Phi ( x) $ is the distribution function of the standard normal law; moreover, $ k = k ( n, \gamma , p) $ does not depend on the unknown parameters $ a $ and $ \sigma ^ {2} $. The tolerance interval constructed in this way has the following property: with confidence probability $ \gamma $, the interval $ ( \overline{X}\; - kS , \overline{X}\; + kS ) $ contains at least a proportion $ p $ of the probability mass of the normal distribution of the variables $ X _ {1} \dots X _ {n} $.
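The equation for the tolerance multiplier can be explored numerically. The sketch below is a plain Monte Carlo estimate, not the method used to produce published tables; the function names `half_width_for_content` and `tolerance_multiplier`, the simulation size and the seed are all illustrative assumptions. It relies on the fact that $ ( \overline{X}\; - a ) / \sigma $ and $ S / \sigma $ have distributions free of $ a $ and $ \sigma $, so one may simulate with $ a = 0 $, $ \sigma = 1 $. For each simulated sample it finds the smallest $ k $ for which the interval has content at least $ p $, and then takes the $ \gamma $-quantile of these values.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

STD_NORMAL = NormalDist()  # standard normal law; its cdf plays the role of Phi


def half_width_for_content(z, p):
    """Return r >= 0 with Phi(z + r) - Phi(z - r) = p, found by bisection."""
    lo, hi = 0.0, abs(z) + 10.0  # the content is numerically 1 at the upper bracket
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        content = STD_NORMAL.cdf(z + mid) - STD_NORMAL.cdf(z - mid)
        if content < p:
            lo = mid
        else:
            hi = mid
    return hi


def tolerance_multiplier(n, gamma, p, n_sim=20000, seed=0):
    """Monte Carlo estimate of k(n, gamma, p) for the interval mean +/- k*S."""
    rng = random.Random(seed)
    smallest_k = []
    for _ in range(n_sim):
        # Standardized sample: a = 0 and sigma = 1 lose no generality here.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z, w = fmean(sample), stdev(sample)  # stdev uses the n - 1 divisor
        # Smallest k for which this sample's interval has content >= p.
        smallest_k.append(half_width_for_content(z, p) / w)
    smallest_k.sort()
    # The interval succeeds exactly when k >= smallest_k, so take the gamma-quantile.
    return smallest_k[math.ceil(gamma * n_sim) - 1]
```

For example, `tolerance_multiplier(10, 0.95, 0.90)` should land close to 2.84, the value usually tabulated for $ n = 10 $, $ \gamma = 0.95 $, $ p = 0.90 $, up to simulation error.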

Assuming the existence of a probability density function $ f ( x) = F ^ { \prime } ( x) $, the probability of the event $ \{ F ( T _ {2} ) - F ( T _ {1} ) \geq p \} $ is independent of $ F ( x) $ if and only if $ T _ {1} $ and $ T _ {2} $ are order statistics (cf. Order statistic). Precisely this fact is the basis of a general method for constructing non-parametric, or distribution-free, tolerance intervals. Let $ X ^ {(*)} = ( X _ {( n1)} \dots X _ {( nn)} ) $ be the vector of order statistics constructed from the sample $ X _ {1} \dots X _ {n} $ and let

$$ T _ {1} = X _ {( nr)} ,\ \ T _ {2} = X _ {( ns)} ,\ \ 1 \leq r < s \leq n. $$

Since the random variable $ F ( X _ {( ns)} ) - F ( X _ {( nr)} ) $ has the beta-distribution with parameters $ s - r $ and $ n - s + r + 1 $, the probability of the event $ \{ F ( X _ {( ns)} ) - F ( X _ {( nr)} ) \geq p \} $ can be calculated as the integral $ I _ {1 - p } ( n - s + r + 1, s - r) $, where $ I _ {x} ( a, b) $ is the incomplete beta-function, and hence in this case instead of (1) one obtains the relation

$$ \tag{2 } I _ {1 - p } ( n - s + r + 1, s - r) = \gamma , $$

which allows one, for given $ \gamma $, $ p $ and $ n $, to determine numbers $ r $ and $ s $ such that the order statistics $ X _ {( nr)} $ and $ X _ {( ns)} $ are the tolerance bounds of the desired tolerance interval. Moreover, for given $ \gamma $, $ p $, $ r $, relation (2) allows one to determine the sample size $ n $ needed for (2) to hold. There are statistical tables available for solving such problems.
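Relation (2) can also be evaluated directly instead of being read from tables. The sketch below uses the identity, valid for integer parameters, $ I _ {1 - p } ( n - s + r + 1, s - r) = {\mathsf P} \{ B \leq s - r - 1 \} $, where $ B $ has the binomial distribution with parameters $ n $ and $ p $. The function names `coverage_probability` and `smallest_sample_size` are hypothetical. Because $ r $, $ s $ and $ n $ are integers, equality in (2) generally cannot be attained exactly, so the search assumes the usual convention of requiring a confidence of at least $ \gamma $, and it fixes $ r = 1 $, $ s = n $ (the sample extremes) as an illustrative choice.

```python
from math import exp, lgamma, log


def coverage_probability(n, r, s, p):
    """I_{1-p}(n - s + r + 1, s - r), i.e. P{F(X_(ns)) - F(X_(nr)) >= p}."""
    # For integer parameters this incomplete beta-function is a binomial sum:
    # P{Bin(n, p) <= s - r - 1}. Terms are formed in log scale to avoid overflow.
    log_p, log_q = log(p), log(1.0 - p)
    total = 0.0
    for j in range(s - r):
        log_term = (lgamma(n + 1) - lgamma(j + 1) - lgamma(n - j + 1)
                    + j * log_p + (n - j) * log_q)
        total += exp(log_term)
    return total


def smallest_sample_size(gamma, p, n_max=10000):
    """Smallest n for which (X_(n1), X_(nn)) has confidence at least gamma."""
    for n in range(2, n_max + 1):
        if coverage_probability(n, 1, n, p) >= gamma:
            return n
    raise ValueError("no n <= n_max reaches the requested confidence")
```

With $ \gamma = p = 0.95 $, `smallest_sample_size(0.95, 0.95)` returns 93: the confidence is about 0.9479 for $ n = 92 $ and about 0.9500 for $ n = 93 $.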

References

[1] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics" , Libr. math. tables , 46 , Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[2] S.S. Wilks, "Mathematical statistics" , Wiley (1962)
[3] H.A. David, "Order statistics" , Wiley (1981)
[4] R.B. Murphy, "Non-parametric tolerance limits" Ann. Math. Stat. , 19 (1948) pp. 581–589
[5] P.N. Somerville, "Tables for obtaining non-parametric tolerance limits" Ann. Math. Stat. , 29 (1958) pp. 599–601
[6] H. Scheffé, J.W. Tukey, "Non-parametric estimation I. Validation of order statistics" Ann. Math. Stat. , 16 (1945) pp. 187–192
[7] D.A.S. Fraser, "Nonparametric methods in statistics" , Wiley (1957)
[8] A. Wald, J. Wolfowitz, "Tolerance limits for a normal distribution" Ann. Math. Stat. , 17 (1946) pp. 208–215
[9] H. Robbins, "On distribution-free tolerance limits in random sampling" Ann. Math. Stat. , 15 (1944) pp. 214–216
How to Cite This Entry:
Tolerance intervals. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Tolerance_intervals&oldid=51752
This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article