Variance
in probability theory
The measure $\mathsf{D}X$ of the deviation of a random variable $X$ from its mathematical expectation $\mathsf{E}X$, defined by the equation:
$$\mathsf{D}X = \mathsf{E}(X - \mathsf{E}X)^2. \tag{1}$$
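The properties of the variance are:
$$\mathsf{D}X = \mathsf{E}X^2 - (\mathsf{E}X)^2;$$
if $c$ is a real number, then
$$\mathsf{D}(cX) = c^2\,\mathsf{D}X;$$
in particular, $\mathsf{D}(-X) = \mathsf{D}X$.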
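The two properties above can be checked exactly on a small example. The following is a minimal sketch, assuming a fair six-sided die as the random variable (an illustrative choice, not taken from the article); the helper names `expectation`, `variance` and `scale` are hypothetical.

```python
from fractions import Fraction


def expectation(dist, f=lambda x: x):
    """E f(X) for a finite distribution given as {value: probability}."""
    return sum(f(x) * p for x, p in dist.items())


def variance(dist):
    """Variance as in definition (1): the expectation of (X - E X)^2."""
    m = expectation(dist)
    return expectation(dist, lambda x: (x - m) ** 2)


def scale(dist, c):
    """Distribution of cX; c must be non-zero so the values stay distinct."""
    return {c * x: p for x, p in dist.items()}


# Assumed example: a fair six-sided die, with exact rational arithmetic.
die = {Fraction(k): Fraction(1, 6) for k in range(1, 7)}

var_def = variance(die)                                            # definition (1)
var_alt = expectation(die, lambda x: x**2) - expectation(die) ** 2  # E X^2 - (E X)^2

print(var_def, var_alt)          # both routes give the same value
print(variance(scale(die, -3)))  # equals 9 times the variance of the die
```

Exact fractions are used instead of floating-point numbers so that the identities hold with equality rather than up to rounding.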
In speaking of the variance of a random variable $X$, it is always assumed that its expectation $\mathsf{E}X$ exists; the variance $\mathsf{D}X$ may exist (i.e. be finite) or may not (i.e. be infinite). In modern probability theory the expectation of a random variable is defined in terms of the Lebesgue integral over the sample space. However, formulas expressing the expectation of various functions of a random variable $X$ in terms of the distribution of this variable on the set of real numbers are of importance (cf. Mathematical expectation). For the variance $\mathsf{D}X$ these formulas are
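a)
$$\mathsf{D}X = \sum_i (a_i - \mathsf{E}X)^2\, p_i$$
for a discrete random variable $X$ which assumes at most a countable number of different values $a_i$ with probabilities $p_i = \mathsf{P}\{X = a_i\}$;
b)
$$\mathsf{D}X = \int_{-\infty}^{\infty} (x - \mathsf{E}X)^2\, p(x)\,dx$$
for a random variable $X$ with a density $p$ of the probability distribution;
c)
$$\mathsf{D}X = \int_{-\infty}^{\infty} (x - \mathsf{E}X)^2\, dF(x)$$
in the general case; here $F$ is the distribution function of the random variable $X$, and the integral is understood in the sense of Lebesgue–Stieltjes or Riemann–Stieltjes.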
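Formulas a) and b) translate directly into a computation. The sketch below is illustrative only: the function names are hypothetical, the density integral is approximated by a midpoint Riemann sum over a finite interval assumed to contain the support, and the two test distributions (a two-point law and the uniform density on $[0, 2]$) are assumptions chosen because their variances are known in closed form.

```python
def variance_discrete(values, probs):
    """Formula a): sum of (a_i - mean)^2 * p_i over the values a_i."""
    mean = sum(a * p for a, p in zip(values, probs))
    return sum((a - mean) ** 2 * p for a, p in zip(values, probs))


def variance_density(p, lo, hi, n=10_000):
    """Formula b), approximated by a midpoint Riemann sum on [lo, hi].

    Assumes the density p vanishes outside [lo, hi].
    """
    h = (hi - lo) / n
    mids = [lo + (k + 0.5) * h for k in range(n)]
    mean = sum(x * p(x) for x in mids) * h
    return sum((x - mean) ** 2 * p(x) for x in mids) * h


# Two-point law: value 1 with probability 0.3, value 0 with probability 0.7.
var_a = variance_discrete([0, 1], [0.7, 0.3])

# Uniform density on [0, 2]; its variance is (2 - 0)^2 / 12 = 1/3.
var_b = variance_density(lambda x: 0.5, 0.0, 2.0)

print(var_a, var_b)
```

The general case c) reduces to a) or b) whenever the distribution function is a pure step function or has a density, which is why only those two cases are sketched here.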
The variance is not the only conceivable measure of the deviation of a random variable from its expectation. Other measures of the deviation, constructed on the same principle, e.g. $\mathsf{E}|X - \mathsf{E}X|$, $\mathsf{E}(X - \mathsf{E}X)^4$, etc., are also possible, as are measures of deviation based on quantiles (cf. Quantile). The importance of the variance is mainly due to the role played by this concept in limit theorems. Roughly speaking, one may say that if the expectation and variance of the sum of a large number of random variables are known, it is possible to describe completely the distribution law of this sum: it is (approximately) normal, with corresponding parameters (cf. Normal distribution). Thus, the most important properties of the variance are connected with the expression for the variance $\mathsf{D}(X_1 + \dots + X_n)$ of the sum of random variables $X_1, \dots, X_n$:
$$\mathsf{D}(X_1 + \dots + X_n) = \sum_{i=1}^{n} \mathsf{D}X_i + 2 \sum_{i<j} \operatorname{cov}(X_i, X_j),$$
where
$$\operatorname{cov}(X_i, X_j) = \mathsf{E}\{(X_i - \mathsf{E}X_i)(X_j - \mathsf{E}X_j)\}$$
denotes the covariance of the random variables $X_i$ and $X_j$. If the random variables $X_1, \dots, X_n$ are pairwise independent, then $\operatorname{cov}(X_i, X_j) = 0$ for $i \neq j$. Accordingly, the equation
$$\mathsf{D}(X_1 + \dots + X_n) = \mathsf{D}X_1 + \dots + \mathsf{D}X_n \tag{2}$$
is valid for pairwise independent random variables. The converse proposition is not valid: (2) does not entail independence. Nevertheless, (2) is usually applied in situations where the random variables are independent. Strictly speaking, a sufficient condition for the validity of (2) is that $\operatorname{cov}(X_i, X_j) = 0$ for all $i \neq j$, i.e. the random variables $X_1, \dots, X_n$ need only be pairwise uncorrelated.
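That (2) can hold without independence is easy to see on a small example. The following sketch assumes a random variable $X$ taking the values $-1, 0, 1$ with probability $1/3$ each and sets $Y = X^2$; this pair is a standard illustration, not one given in the article, and the helper names are hypothetical.

```python
from fractions import Fraction

# Joint distribution of (X, Y) with Y = X^2, stored as {(x, y): probability}.
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}


def E(f):
    """Expectation of f(X, Y) under the joint distribution."""
    return sum(f(x, y) * p for (x, y), p in joint.items())


def var(f):
    """Variance of f(X, Y), by definition (1)."""
    m = E(f)
    return E(lambda x, y: (f(x, y) - m) ** 2)


def cov(f, g):
    """Covariance of f(X, Y) and g(X, Y)."""
    mf, mg = E(f), E(g)
    return E(lambda x, y: (f(x, y) - mf) * (g(x, y) - mg))


X = lambda x, y: x
Y = lambda x, y: y
S = lambda x, y: x + y

var_sum = var(S)                             # variance of X + Y
general = var(X) + var(Y) + 2 * cov(X, Y)    # general formula with covariance
additive = var(X) + var(Y)                   # right-hand side of (2)

# Dependence check: P(X = 0, Y = 0) versus P(X = 0) * P(Y = 0).
p_joint = joint.get((0, 0), 0)
p_product = E(lambda x, y: x == 0) * E(lambda x, y: y == 0)

print(cov(X, Y), var_sum, additive)  # zero covariance, so (2) holds
print(p_joint, p_product)            # the two differ, so X and Y are dependent
```

Here $Y$ is a function of $X$, so the two variables are as dependent as possible, yet their covariance vanishes and the variances add.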
The applications of the concept of the variance have developed in two directions. The first is in the limit theorems of probability theory. If, for a sequence of random variables $X_n$, one has $\mathsf{D}X_n \to 0$ as $n \to \infty$, then for any $\epsilon > 0$,
$$\mathsf{P}\{|X_n - \mathsf{E}X_n| > \epsilon\} \to 0$$
as $n \to \infty$ (cf. Chebyshev inequality in probability theory), i.e. if $n$ is large the random variable $X_n$ becomes practically identical with the non-random variable $\mathsf{E}X_n$. The development of these concepts yields a proof of the law of large numbers, of the consistency of estimators (cf. Consistent estimator) in mathematical statistics, and also leads to other applications in which convergence in probability is established for random variables. Another application to limit theorems is connected with the concept of normalization. Normalization of a random variable $X$ is effected by subtracting the expectation and dividing by the square root of the variance $\mathsf{D}X$; in other words, the variable $(X - \mathsf{E}X)/\sqrt{\mathsf{D}X}$ is considered. Normalization of a sequence of random variables is usually necessary in order to obtain a convergent sequence of distribution laws, in particular, convergence to the normal law with parameters zero and one.
The second direction consists in the application of the concept of the variance in mathematical statistics to sample processing. If a random variable is considered as the realization of a random experiment, an arbitrary change in the numerical scale converts the random variable $X$ to $Y = \sigma X + a$, where $a$ is an arbitrary real number and $\sigma$ is a positive number. It is accordingly meaningful, in many cases, to consider not the one theoretical distribution law $F(x)$ of the random variable $X$ alone, but rather the type of the law, i.e. the family of distribution laws of the type $F((x-a)/\sigma)$, which is a function of at least two parameters $a$ and $\sigma$. If $\mathsf{E}X = 0$, $\mathsf{D}X = 1$, then $\mathsf{E}Y = a$ and $\mathsf{D}Y = \sigma^2$. Accordingly, the meaning of the parameters in the theoretical law is $a = \mathsf{E}Y$ and $\sigma = \sqrt{\mathsf{D}Y}$. This makes it possible to determine these parameters by sampling.
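Determining the shift and scale parameters by sampling, and then normalizing, can be sketched as follows. Everything specific here is an assumption made for illustration: the standard normal law as the base distribution, the true values 3 and 2 for the shift and scale, the sample size, and the use of the sample mean and the population-style standard deviation as estimators.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the run is reproducible

a_true, sigma_true = 3.0, 2.0  # assumed shift and scale parameters

# Sample from a law with expectation 0 and variance 1, then change the scale.
xs = [random.gauss(0.0, 1.0) for _ in range(20_000)]
ys = [sigma_true * x + a_true for x in xs]

# Estimate the shift by the sample mean and the scale by the
# square root of the sample variance.
a_hat = statistics.fmean(ys)
sigma_hat = statistics.pstdev(ys, mu=a_hat)

# Normalization: subtract the (estimated) expectation and divide by the
# square root of the (estimated) variance.
zs = [(y - a_hat) / sigma_hat for y in ys]
z_mean = statistics.fmean(zs)
z_std = statistics.pstdev(zs)

print(a_hat, sigma_hat)  # close to the assumed 3.0 and 2.0
print(z_mean, z_std)     # close to 0 and 1
```

The normalized sample has mean zero and unit standard deviation up to rounding, whatever the original shift and scale were, which is what makes the normalized variable a convenient common reference within a family of laws of one type.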
Comments
Dispersion is usually termed variance in English, and one accordingly uses $\operatorname{Var}(X)$ instead of $\mathsf{D}X$.
Variance. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Variance&oldid=20988