# Normal distribution

2010 Mathematics Subject Classification: Primary: 60E99 [MSN][ZBL]

One of the most important probability distributions. The term "normal distribution" is due to K. Pearson (earlier names are the Gauss law and the Gauss–Laplace distribution). It is used both in relation to probability distributions of random variables (cf. Random variable) and in relation to the joint probability distribution (cf. Joint distribution) of several random variables (that is, to distributions of finite-dimensional random vectors), as well as to distributions of random elements and stochastic processes (cf. Random element; Stochastic process). The general definition of a normal distribution reduces to the one-dimensional case.

The probability distribution of a random variable $X$ is called normal if it has probability density

$$\tag{* } p ( x; a, \sigma ) = \ \frac{1}{\sigma \sqrt {2 \pi } } e ^ {- ( x - a) ^ {2} /2 \sigma ^ {2} } .$$

The family of normal distributions (*) depends, as a rule, on the two parameters $a$ and $\sigma > 0$. Here $a$ is the mathematical expectation of $X$, $\sigma ^ {2}$ is the variance of $X$ and the characteristic function has the form

$$f ( t) = \ {\mathsf E} e ^ {itX} = \ e ^ {iat - \sigma ^ {2} t ^ {2} /2 } .$$
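As a minimal numerical sketch of the density (*) (the parameter values $a = 2$, $\sigma = 1.5$ here are arbitrary choices for illustration), one can check that the density integrates to 1 with a crude Riemann sum:

```python
import math

def normal_pdf(x, a=0.0, sigma=1.0):
    """Density p(x; a, sigma) from formula (*)."""
    return math.exp(-(x - a) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# The total area under the curve is 1; a Riemann sum over [a - 10*sigma, a + 10*sigma]
# with a small step confirms this (the tails outside that interval are negligible).
a, sigma = 2.0, 1.5   # illustrative values, not from the article
h = 0.001
total = sum(normal_pdf(a - 10 * sigma + i * h, a, sigma)
            for i in range(int(20 * sigma / h))) * h
print(round(total, 6))  # ≈ 1.0
```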

The normal density curve $y = p ( x; a, \sigma )$ is symmetric about the ordinate passing through $a$, where it attains its unique maximum $1 / ( \sigma \sqrt {2 \pi } )$. As $\sigma$ decreases, the normal distribution curve becomes more and more pointed. A change in $a$ with constant $\sigma$ does not change the shape of the curve and causes only a shift along the $x$-axis. The area under a normal density curve is 1. When $a = 0$ and $\sigma = 1$, the corresponding distribution function is

$$\Phi ( x) = \ { \frac{1}{\sqrt {2 \pi } } } \int\limits _ {- \infty } ^ { x } e ^ {- u ^ {2} /2 } du.$$
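Although $\Phi$ has no elementary closed form, it can be expressed through the error function as $\Phi(x) = \tfrac{1}{2}\bigl(1 + \operatorname{erf}(x/\sqrt{2})\bigr)$, which gives a simple way to evaluate it numerically:

```python
import math

def Phi(x):
    """Standard normal distribution function, via the identity
    Phi(x) = (1 + erf(x / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

print(round(Phi(0.0), 5))   # 0.5, by symmetry of the density
print(round(Phi(1.96), 4))  # 0.975, the familiar two-sided 95% point
```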

In general, the distribution function $F ( x; a, \sigma )$ of (*) can be computed by the formula $F ( x; a, \sigma ) = \Phi ( t)$, where $t = ( x - a)/ \sigma$. For $\Phi ( t)$ (and several of its derivatives) extensive tables have been compiled (see, for example, [BS], [T], and Probability integral). For a normal distribution the probability that $| X - a | > k \sigma$ is $1 - \Phi ( k) + \Phi (- k)$, and it decreases very rapidly with increasing $k$ (see the Table).

| $k$ | probability |
|-----|-------------|
| 1 | 0.31731 |
| 2 | $0.45500 \cdot 10^{-1}$ |
| 3 | $0.26998 \cdot 10^{-2}$ |
| 4 | $0.63342 \cdot 10^{-4}$ |

In many practical problems, when analyzing normal distributions one can therefore ignore the possibility of a deviation from $a$ in excess of $3 \sigma$ (the three-sigma rule); the corresponding probability, as is clear from the Table, is less than 0.003. The quartile deviation for a normal distribution is $0.67449 \sigma$.
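The tail probabilities in the Table, and the three-sigma rule, can be reproduced directly from $\Phi$ (here computed through the error function):

```python
import math

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# P(|X - a| > k*sigma) = 1 - Phi(k) + Phi(-k) = 2 * (1 - Phi(k)),
# which does not depend on a or sigma.
for k in range(1, 5):
    p = 1.0 - Phi(k) + Phi(-k)
    print(k, f"{p:.5e}")
# For k = 3 this is about 0.0027 < 0.003: the three-sigma rule.
```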

Normal distributions occur in a large number of applications. There are some notable attempts at explaining this fact. A theoretical basis for the exceptional role of the normal distribution is given by the limit theorems of probability theory (see also Laplace theorem; Lyapunov theorem). Qualitatively, the result can be stated in the following manner: A normal distribution is a good approximation whenever the relevant random variable is the sum of a large number of independent random variables the largest of which is small in comparison with the whole sum (see Central limit theorem).
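This qualitative statement can be illustrated by a small simulation (the choice of 48 uniform summands and 20,000 trials is arbitrary, purely for illustration): the distribution function of a standardized sum of independent uniform variables is already close to $\Phi$.

```python
import math
import random

random.seed(0)

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# A sum of n i.i.d. Uniform(0,1) variables has mean n/2 and variance n/12;
# after standardization its distribution is approximately standard normal.
n, trials = 48, 20000
count = 0
for _ in range(trials):
    s = sum(random.random() for _ in range(n))
    z = (s - n / 2) / math.sqrt(n / 12)
    if z <= 1.0:
        count += 1

print(count / trials)      # empirical P(Z <= 1), close to Phi(1)
print(round(Phi(1.0), 4))  # 0.8413
```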

A normal distribution can also appear as an exact solution of certain problems (within the framework of an accepted mathematical model of the phenomenon). This is so in the theory of random processes (in one of the basic models of Brownian motion). Classic examples of a normal distribution arising as an exact one are due to C.F. Gauss (the law of distribution of errors of observation) and J. Maxwell (the law of distribution of velocities of molecules) (see also Independence; Characterization theorems).

The distribution of a random vector $X = ( X _ {1} \dots X _ {n} )$ in $\mathbf R ^ {n}$, or the joint distribution of random variables $X _ {1} \dots X _ {n}$, is called normal (multivariate normal) if for any fixed $t \in \mathbf R ^ {n}$ the scalar product $( t, X)$ either has a normal distribution or is constant (as one sometimes says, has a normal distribution with variance zero). For random elements with values in some vector space $E$ this definition is retained when $t$ is replaced by any element $l$ of the adjoint space $E ^ {*}$ and the scalar product $( t, X)$ is replaced by the linear functional $l ( X)$. The joint distribution of several random variables $X _ {1} \dots X _ {n}$ has characteristic function

$$f ( t) = \ \mathop{\rm exp} \left \{ i {\mathsf E} ( t, X) - { \frac{1}{2} } Q ( t) \right \} ,$$

$$t = ( t _ {1} \dots t _ {n} ) \in \mathbf R ^ {n} ,$$

where

$${\mathsf E} ( t, X) = \ t _ {1} {\mathsf E} X _ {1} + \dots + t _ {n} {\mathsf E} X _ {n}$$

is a linear form,

$$Q ( t) = \ {\mathsf E} ( t, X - {\mathsf E} X) ^ {2} = \ \sum _ {k, l = 1 } ^ { n } \sigma _ {kl} t _ {k} t _ {l}$$

is a non-negative definite quadratic form, and $\| \sigma _ {kl} \|$ is the covariance matrix of $X$. In the positive-definite case the corresponding normal distribution has the probability density

$$p ( x _ {1} \dots x _ {n} ) = \ C \mathop{\rm exp} \left \{ - { \frac{1}{2} } Q ^ {-1} ( x _ {1} - a _ {1} \dots x _ {n} - a _ {n} ) \right \} ,$$

where $Q ^ {-1}$ is the quadratic form inverse to $Q$, the parameters $a _ {1} \dots a _ {n}$ are the mathematical expectations of $X _ {1} \dots X _ {n}$, respectively, and

$$C = { \frac{1}{( 2 \pi ) ^ {n/2} \sqrt { \mathop{\rm det} \| \sigma _ {kl} \| } } }$$

is constant. The total number of parameters specifying the normal distribution is

$${ \frac{( n + 1) ( n + 2) }{2} } - 1$$

and grows rapidly with $n$ (it is 2 for $n = 1$, 20 for $n = 5$, and 65 for $n = 10$). A multivariate normal distribution is the basic model of multi-dimensional statistical analysis. It is also used in the theory of stochastic processes (where normal distributions in infinite-dimensional spaces are examined; see Random element, and also Wiener measure; Wiener process; Gaussian process).
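As a sketch of the multivariate density above in the bivariate case $n = 2$ (all parameter values here are assumed purely for illustration), one can write out $C$ and the inverse quadratic form $Q^{-1}$ explicitly for a $2 \times 2$ covariance matrix and check the normalization numerically:

```python
import math

# Illustrative bivariate parameters: mean (a1, a2) and covariance
# matrix [[s11, s12], [s12, s22]]; these values are arbitrary.
a = (1.0, -1.0)
s11, s12, s22 = 2.0, 0.6, 1.0
det = s11 * s22 - s12 ** 2  # det of the covariance matrix, must be > 0

def pdf(x1, x2):
    # Inverse quadratic form Q^{-1}(y) = y^T Sigma^{-1} y, written out
    # explicitly for the 2x2 case.
    y1, y2 = x1 - a[0], x2 - a[1]
    q_inv = (s22 * y1 ** 2 - 2 * s12 * y1 * y2 + s11 * y2 ** 2) / det
    C = 1.0 / (2 * math.pi * math.sqrt(det))  # (2*pi)^{n/2} with n = 2
    return C * math.exp(-0.5 * q_inv)

# The density integrates to 1; a Riemann sum over a wide grid confirms this.
h = 0.05
total = sum(pdf(-10 + i * h, -10 + j * h)
            for i in range(400) for j in range(400)) * h * h
print(round(total, 4))  # ≈ 1.0
```

For $n = 2$ the parameter count formula gives $(3 \cdot 4)/2 - 1 = 5$: two means and three distinct entries of the covariance matrix, matching the code above.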

Of the important properties of normal distributions the following should be mentioned. The sum $X$ of two independent random variables $X _ {1}$ and $X _ {2}$ having normal distributions also has a normal distribution; conversely, if $X = X _ {1} + X _ {2}$ has a normal distribution and $X _ {1}$ and $X _ {2}$ are independent, then the distributions of $X _ {1}$ and $X _ {2}$ are normal (Cramér's theorem). This property has a certain "stability": if the distribution of $X$ is "close" to normal, then so are the distributions of $X _ {1}$ and $X _ {2}$. Some other important distributions are connected with normal ones (see Logarithmic normal distribution; Non-central "chi-squared" distribution; Student distribution; Wishart distribution; Fisher $z$-distribution; Hotelling $T^{2}$-distribution; Chi-squared distribution). For an approximate representation of distributions close to normal, series such as the Edgeworth series and the Gram–Charlier series are widely used.
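The first property (the sum of independent normal variables is normal, with mean $a_1 + a_2$ and variance $\sigma_1^2 + \sigma_2^2$, as follows from multiplying the characteristic functions) can be checked by simulation; the parameter values below are arbitrary illustrations:

```python
import random
import statistics

random.seed(1)

# Illustrative parameters: X1 ~ N(0.5, 1), X2 ~ N(-2, 4).
a1, s1 = 0.5, 1.0
a2, s2 = -2.0, 2.0

# Sample X = X1 + X2 with X1, X2 independent.
xs = [random.gauss(a1, s1) + random.gauss(a2, s2) for _ in range(100000)]

print(round(statistics.mean(xs), 1))      # ≈ a1 + a2 = -1.5
print(round(statistics.variance(xs), 1))  # ≈ s1**2 + s2**2 = 5.0
```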

Concerning problems connected with estimators of parameters of normal distributions using results of observations see Unbiased estimator. Concerning testing the hypothesis of normality see Non-parametric methods in statistics. See also Probability graph paper.

How to Cite This Entry:
Normal distribution. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Normal_distribution&oldid=48011
This article was adapted from an original article by Yu.V. Prokhorov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article