# Difference between revisions of "Covariance"

$\DeclareMathOperator{\cov}{cov}$ $\DeclareMathOperator{\var}{var}$ $\DeclareMathOperator{\E}{\mathbf{E}}$

A numerical characteristic of the joint distribution of two random variables, equal to the mathematical expectation of the product of the deviations of these two random variables from their mathematical expectations. The covariance is defined for random variables $X_1$ and $X_2$ with finite variance and is usually denoted by $\cov(X_1, X_2)$. Thus,

$\cov(X_1, X_2) = \E[(X_1 - \E X_1)(X_2 - \E X_2)],$

so that $\cov(X_1, X_2) = \cov(X_2, X_1)$; $\cov(X, X) = DX = \var(X)$. The covariance naturally occurs in the expression for the variance of the sum of two random variables: $D(X_1 + X_2) = DX_1 + DX_2 + 2 \cov(X_1, X_2).$

If $X_1$ and $X_2$ are independent random variables, then $\cov(X_1, X_2)=0$. The covariance gives a characterization of the dependence of the random variables; the correlation coefficient is defined by means of the covariance. In order to statistically estimate the covariance one uses the sample covariance, computed from the formula

$\frac{1}{n - 1}\sum\limits_{i = 1}^n {(X_1^{(i)} - \overline X_1)(X_2^{(i)} - \overline X_2)}$ where the $(X_1^{(i)},X_2^{(i)}) , i = 1, \dots, n$, are independent random variables and $\overline X_1$ and $\overline X_2$ are their arithmetic means.

In the Western literature one always uses $V(x)$ or $\var(X)$ for the variance, instead of $D(X)$.