A probability distribution determined from a sample for the estimation of a true distribution. Suppose that the results $X_1,\dots,X_n$ of observations are independent identically-distributed random variables with distribution function $F(x)$, and let $X_{(1)}\le\dots\le X_{(n)}$ be the corresponding order statistics. The empirical distribution corresponding to $X_1,\dots,X_n$ is defined as the discrete distribution that assigns to every value $X_k$ the probability $1/n$. The empirical distribution function $F_n(x)$ is the step-function with steps of multiples of $1/n$ at the points defined by $X_{(1)},\dots,X_{(n)}$:

$$F_n(x)=\begin{cases}0, & x\le X_{(1)},\\ k/n, & X_{(k)}<x\le X_{(k+1)},\quad 1\le k\le n-1,\\ 1, & x>X_{(n)}.\end{cases}$$
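For fixed values of $X_1,\dots,X_n$ the function $F_n(x)$ has all the properties of an ordinary distribution function. For every fixed real $x$ the function $F_n(x)$ is a random variable as a function of $X_1,\dots,X_n$. Thus, the empirical distribution corresponding to a sample $X_1,\dots,X_n$ is given by the family of random variables $F_n(x)$ depending on the real parameter $x$. Here, for a fixed $x$, the expectation and variance are

$$\mathsf{E}F_n(x)=F(x),\qquad \mathsf{D}F_n(x)=\frac{1}{n}F(x)[1-F(x)],$$

and

$$\mathsf{P}\left\{F_n(x)=\frac{k}{n}\right\}=\binom{n}{k}[F(x)]^k[1-F(x)]^{n-k},\qquad k=0,1,\dots,n.$$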
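For a fixed $x$, $nF_n(x)$ counts the observations lying below $x$, so it is binomial with parameters $n$ and $F(x)$; this gives mean $F(x)$ and variance $F(x)[1-F(x)]/n$ for $F_n(x)$. The following is a minimal simulation sketch of that fact, not part of the original article. The uniform distribution, the seed, and the values of `n`, `reps` and `x` are arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
n, reps, x = 50, 20000, 0.3      # illustrative values, not from the article

# Uniform(0, 1) observations, so the true F(x) equals x on [0, 1].
samples = rng.random((reps, n))

# One realisation of F_n(x) per row: the fraction of observations below x.
values = (samples < x).mean(axis=1)

print("mean of F_n(x):    ", values.mean(), "vs F(x) =", x)
print("variance of F_n(x):", values.var(), "vs F(1-F)/n =", x * (1 - x) / n)
```

The two printed pairs should agree up to simulation error, which shrinks as `reps` grows.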
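In accordance with the law of large numbers, $F_n(x)\to F(x)$ with probability one as $n\to\infty$ for every $x$. Since also $\mathsf{E}F_n(x)=F(x)$, this means that $F_n(x)$ is an unbiased and consistent estimator of the distribution function $F(x)$. The empirical distribution function converges, uniformly in $x$, with probability 1 to $F(x)$ as $n\to\infty$, i.e., if

$$D_n=\sup_x|F_n(x)-F(x)|,$$

then

$$\mathsf{P}\left\{\lim_{n\to\infty}D_n=0\right\}=1$$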
(the Glivenko–Cantelli theorem).
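The uniform convergence can be observed numerically. The sketch below is an illustration under stated assumptions rather than anything from the original article. It evaluates the empirical distribution function with the convention that $F_n(x)$ is the fraction of observations strictly less than $x$ (many texts use "less than or equal" instead). It then computes $\sup_x|F_n(x)-F(x)|$ for a continuous $F$ by checking both sides of each jump. The helper names `ecdf` and `sup_distance`, the uniform test distribution and the seed are hypothetical choices.

```python
import numpy as np

def ecdf(sample):
    """Return a function x -> F_n(x), the fraction of observations < x."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = xs.size

    def F_n(x):
        # side="left" counts the observations strictly less than x.
        return np.searchsorted(xs, x, side="left") / n

    return F_n

def sup_distance(sample, F):
    """sup_x |F_n(x) - F(x)| for a continuous, vectorised F."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = xs.size
    Fx = F(xs)
    k = np.arange(1, n + 1)
    # F_n only changes at the order statistics, so the supremum is reached
    # at a jump: compare F with the step value just after and just before.
    return max(np.max(k / n - Fx), np.max(Fx - (k - 1) / n))

# Illustration with Uniform(0, 1) data, where F(t) = t on [0, 1].
rng = np.random.default_rng(1)
uniform_cdf = lambda t: np.clip(t, 0.0, 1.0)
for n in (10, 100, 1000, 10000):
    print(n, sup_distance(rng.random(n), uniform_cdf))
```

The printed distances should tend to decrease as `n` grows, roughly at the rate $1/\sqrt{n}$.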
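The quantity $D_n$ is a measure of the proximity of $F_n(x)$ to $F(x)$. A.N. Kolmogorov found (in 1933) its limit distribution: For a continuous function $F(x)$,

$$\lim_{n\to\infty}\mathsf{P}\{\sqrt{n}\,D_n<z\}=K(z)=\sum_{k=-\infty}^{\infty}(-1)^k e^{-2k^2z^2},\qquad z>0.$$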
If $F(x)$ is not known, then to verify the hypothesis that it is a given continuous function $F_0(x)$ one uses tests based on statistics of type $D_n$ (see Kolmogorov test; Kolmogorov–Smirnov test; Non-parametric methods in statistics).
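As a rough sketch of such a test, the code below evaluates Kolmogorov's limit function through the equivalent form $K(z)=1-2\sum_{k\ge 1}(-1)^{k-1}e^{-2k^2z^2}$, truncated after a fixed number of terms. It then uses $1-K(\sqrt{n}\,D_n)$ as an approximate p-value. This is an asymptotic approximation intended for large $n$, not an exact small-sample procedure. The function names, the truncation length `terms` and the sample values are assumptions made for illustration.

```python
import math

def kolmogorov_cdf(z, terms=100):
    """Truncated series for K(z); adequate unless z is very close to 0."""
    if z <= 0:
        return 0.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
            for k in range(1, terms + 1))
    # Clamp to [0, 1] to absorb truncation and rounding error.
    return min(1.0, max(0.0, 1.0 - 2.0 * s))

def kolmogorov_statistic(sample, F0):
    """sup_x |F_n(x) - F0(x)| for a continuous hypothesised F0."""
    xs = sorted(sample)
    n = len(xs)
    # k is 0-based here: the step values around xs[k] are k/n and (k+1)/n.
    return max(max((k + 1) / n - F0(x), F0(x) - k / n)
               for k, x in enumerate(xs))

def kolmogorov_pvalue(sample, F0):
    """Asymptotic p-value 1 - K(sqrt(n) * D_n)."""
    d = kolmogorov_statistic(sample, F0)
    return 1.0 - kolmogorov_cdf(math.sqrt(len(sample)) * d)

# Hypothetical data, tested against the Uniform(0, 1) distribution function.
sample = [0.10, 0.35, 0.40, 0.80, 0.95]
F0 = lambda t: min(1.0, max(0.0, t))
print("D_n =", kolmogorov_statistic(sample, F0))
print("approximate p-value =", kolmogorov_pvalue(sample, F0))
```

For practical work a library routine such as `scipy.stats.kstest` is usually preferable, since it also handles small samples.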
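Moments and any other characteristics of an empirical distribution are called sample or empirical; for example, $\bar{X}=\frac{1}{n}\sum_{k=1}^{n}X_k$ is the sample mean, $s^2=\frac{1}{n}\sum_{k=1}^{n}(X_k-\bar{X})^2$ is the sample variance, and $\hat{\alpha}_r=\frac{1}{n}\sum_{k=1}^{n}X_k^r$ is the sample moment of order $r$.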
Sample characteristics serve as statistical estimators of the corresponding characteristics of the original distribution.
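The sample characteristics can be computed directly as moments of the distribution that puts mass $1/n$ on each observation. The sketch below is illustrative only; the function name `sample_moments` and the data are hypothetical. It uses the divisor $n$ for the variance, which is the variance of the empirical distribution itself. The divisor $n-1$ gives the usual unbiased variant instead.

```python
import numpy as np

def sample_moments(x, r):
    """Mean, variance (divisor n) and r-th raw moment of the empirical distribution."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    var = np.mean((x - mean) ** 2)   # same as np.var(x) with the default ddof=0
    moment_r = np.mean(x ** r)       # raw moment of order r
    return mean, var, moment_r

mean, var, m2 = sample_moments([1, 2, 3, 4], r=2)
print(mean, var, m2)
# The variance equals the second raw moment minus the squared mean.
print(np.isclose(var, m2 - mean ** 2))
```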
[1] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics", Libr. math. tables, 46, Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[2] B.L. van der Waerden, "Mathematische Statistik", Springer (1957)
[3] A.A. Borovkov, "Mathematical statistics", Moscow (1984) (In Russian)
The use of the empirical distribution in statistics and the associated theory has been greatly developed in recent years. This has been surveyed in [a2]. For the developments in strong convergence theory associated with the empirical distribution see [a1].
[a1] M. Csörgö, P. Révész, "Strong approximation in probability and statistics", Acad. Press (1981)
[a2] G.R. Shorack, J.A. Wellner, "Empirical processes with applications to statistics", Wiley (1986)
[a3] M. Loève, "Probability theory", Princeton Univ. Press (1963) Sect. 16.3
[a4] P. Gaenssler, W. Stute, "Empirical processes: a survey of results for independent and identically distributed random variables", Ann. Prob., 7 (1979) pp. 193–243
Empirical distribution. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Empirical_distribution&oldid=11280