Pearson product-moment correlation coefficient

From Encyclopedia of Mathematics

While the modern theory of correlation and regression has its roots in the work of F. Galton, the version of the product-moment correlation coefficient in current use (2000) is due to K. Pearson [a2]. Pearson's product-moment correlation coefficient is a measure of the strength of a linear relationship between two random variables X and Y (cf. also Random variable) with means \mu_x=\mathsf{E}(X), \mu_y=\mathsf{E}(Y) and finite variances \sigma^2_x=\text{var}(X), \sigma^2_y=\text{var}(Y):

\begin{equation}\rho=\text{corr}(X,Y)=\frac{\text{cov}(X,Y)}{\sigma_x\:\sigma_y},\end{equation}

where \text{cov}(X,Y) is the covariance of X and Y,

\begin{equation}\text{cov}(X,Y)=\mathsf{E}[(X-\mu_x)(Y-\mu_y)]=\mathsf{E}(XY)-\mu_x\:\mu_y.\end{equation}

It readily follows that -1\leq\rho\leq+1, and that \rho is equal to -1 or +1 if and only if each of X and Y is almost surely a linear function of the other, i.e., Y=\alpha+\beta X (\beta\neq0) with probability 1 (furthermore, \rho and \beta have the same sign). If \rho=0, X and Y are said to be uncorrelated. Independent random variables are always uncorrelated; however, uncorrelated random variables need not be independent (cf. also Independence).
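The last point can be checked on a small discrete example. The sketch below is illustrative only: the choice of X uniform on \{-1,0,1\} with Y=X^2, and the names `support`, `expect`, `cov_xy`, `p_joint` and `p_product`, are assumptions made for this example and do not come from the references.

```python
from fractions import Fraction

# Assumed toy distribution: X is uniform on {-1, 0, 1} and Y = X**2.
support = [-1, 0, 1]
p = Fraction(1, 3)  # probability of each value of X


def expect(f):
    """Expectation of f(X) when X is uniform on `support`."""
    return sum(p * f(x) for x in support)


mean_x = expect(lambda x: x)        # E(X)
mean_y = expect(lambda x: x ** 2)   # E(Y), since Y = X**2
# cov(X, Y) = E(XY) - mu_x * mu_y, with XY = X**3 here.
cov_xy = expect(lambda x: x ** 3) - mean_x * mean_y

# Dependence: X = 0 forces Y = 0, so P(X=0, Y=0) = P(X=0) = 1/3,
# whereas P(X=0) * P(Y=0) = (1/3) * (1/3).
p_joint = p
p_product = p * p

print("cov(X, Y) =", cov_xy)
print("P(X=0, Y=0) =", p_joint, "but P(X=0) P(Y=0) =", p_product)
```

Here the covariance vanishes, so \rho=0, although Y is a deterministic function of X; exact rational arithmetic is used so that the zero is not a rounding artifact.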

The term "product-moment" refers to the observation that \rho=\mu_{11}/\sqrt{\mu_{20}\:\mu_{02}}, where \mu_{ij}=\mathsf{E}[(X-\mu_x)^i(Y-\mu_y)^j] denotes the (i,j)th product moment of X and Y about their means.

The coefficient \rho also plays a role in linear regression (cf. also Regression analysis). If the regression of Y on X is linear, then y=\mathsf{E}(Y|X=x)=\mu_y+\rho(\sigma_y/\sigma_x)(x-\mu_x), and if the regression of X on Y is linear, then x=\mathsf{E}(X|Y=y)=\mu_x+\rho(\sigma_x/\sigma_y)(y-\mu_y). Note that the product of the two slopes is \rho^2.
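A sample analogue of the last remark can be checked numerically: the least-squares slope of y on x is S_{xy}/S_{xx}, that of x on y is S_{xy}/S_{yy}, and their product is the square of the sample correlation. The following is a minimal sketch under that assumption; the data values and the helper name `centered_sums` are made up for illustration.

```python
import math


def centered_sums(xs, ys):
    """Return (Sxx, Syy, Sxy): sums of squares and products about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy


# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

sxx, syy, sxy = centered_sums(xs, ys)
slope_y_on_x = sxy / sxx          # least-squares slope of y regressed on x
slope_x_on_y = sxy / syy          # least-squares slope of x regressed on y
r = sxy / math.sqrt(sxx * syy)    # sample correlation coefficient

print(slope_y_on_x * slope_x_on_y, r ** 2)  # the two values agree up to rounding
```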

When X and Y have a bivariate normal distribution (cf. also Normal distribution), \rho is a parameter of the joint density function

\begin{equation}\phi(x,y)=\frac{1}{2\pi\:\sigma_x\:\sigma_y\sqrt{1-\rho^2}}\exp\bigg[\frac{-1}{2(1-\rho^2)}Q\bigg],\quad-\infty<x,y<\infty,\end{equation}

with

\begin{equation}Q=\bigg(\frac{x-\mu_x}{\sigma_x}\bigg)^2-2\rho\bigg(\frac{x-\mu_x}{\sigma_x}\bigg)\bigg(\frac{y-\mu_y}{\sigma_y}\bigg)+\bigg(\frac{y-\mu_y}{\sigma_y}\bigg)^2.\end{equation}

Unlike the general situation, uncorrelated random variables with a bivariate normal distribution are independent.
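One way to see this is that for \rho=0 the joint density factors into the product of the two normal marginal densities. The sketch below checks the factorization at a single point; the function names `bivariate_normal_pdf` and `normal_pdf` and all parameter values are assumptions chosen for illustration, and Q is taken to be the usual quadratic form in the standardized variables.

```python
import math


def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Bivariate normal density; requires sigma_x, sigma_y > 0 and |rho| < 1."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    # Quadratic form Q in the standardized variables.
    q = zx ** 2 - 2.0 * rho * zx * zy + zy ** 2
    norm = 2.0 * math.pi * sigma_x * sigma_y * math.sqrt(1.0 - rho ** 2)
    return math.exp(-q / (2.0 * (1.0 - rho ** 2))) / norm


def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Hypothetical evaluation point and parameters, with rho = 0.
joint = bivariate_normal_pdf(0.5, -1.0, 0.0, 1.0, 2.0, 0.5, 0.0)
product = normal_pdf(0.5, 0.0, 2.0) * normal_pdf(-1.0, 1.0, 0.5)

print(joint, product)  # equal up to floating-point rounding
```

A single-point check is of course not a proof; the factorization holds for all (x,y) because the cross term in Q vanishes when \rho=0.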

For a random sample \{(x_i,y_i)\}^n_{i=1} from a bivariate population, \rho is estimated by the sample correlation coefficient (cf. also Correlation coefficient) r, given by

\begin{equation}r=\frac{\sum^n_{i=1}(x_i-\overline{x})(y_i-\overline{y})}{\sqrt{\sum^n_{i=1}(x_i-\overline{x})^2\sum^n_{i=1}(y_i-\overline{y})^2}}.\end{equation}
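The formula for r translates directly into code. The following is a minimal sketch using only the standard library; the function name `pearson_r`, its error handling, and the sample data are assumptions made for this illustration.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient r of the paired observations (xs, ys)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the sample means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    denom = math.sqrt(sxx * syy)
    if denom == 0.0:
        raise ValueError("r is undefined when either sample is constant")
    return sxy / denom


# Hypothetical data lying exactly on lines with positive and negative slope.
r_increasing = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_decreasing = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r_increasing, r_decreasing)
```

For data lying exactly on a non-horizontal line, r equals +1 or -1 according to the sign of the slope, mirroring the corresponding property of \rho.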

If x and y denote, respectively, the vectors (x_1-\overline{x},...,x_n-\overline{x}) and (y_1-\overline{y},...,y_n-\overline{y}), and \theta denotes the angle between x and y, then

\begin{equation}r=\frac{x\cdot y}{|x|\,|y|}=\cos\theta.\end{equation}
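The geometric reading can be illustrated by forming the centered vectors explicitly and computing the angle between them. This is a sketch under stated assumptions: the helper name `angle_between_centered` and the data are hypothetical, and the numerator is taken to be the Euclidean inner product of the two centered vectors.

```python
import math


def angle_between_centered(xs, ys):
    """Return (r, theta): cosine and angle (radians) between centered vectors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    x = [v - mean_x for v in xs]  # centered x-vector
    y = [v - mean_y for v in ys]  # centered y-vector
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    r = dot / (norm_x * norm_y)   # assumes neither sample is constant
    # Clamp to [-1, 1] to guard against rounding before taking acos.
    theta = math.acos(max(-1.0, min(1.0, r)))
    return r, theta


# Hypothetical data whose centered vectors (-1, 0, 1) and (1, -2, 1) are orthogonal.
r, theta = angle_between_centered([1.0, 2.0, 3.0], [1.0, -2.0, 1.0])
print(r, math.degrees(theta))
```

In this example the centered vectors are orthogonal, so r=0 and \theta is a right angle; values of r near \pm1 correspond to nearly parallel or anti-parallel centered vectors.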

Further interpretations of r can be found in [a3]. For details on the use of r in hypothesis testing, and for large-sample theory, see [a1].

References

[a1] O.J. Dunn, V.A. Clark, "Applied statistics: analysis of variance and regression", Wiley (1974)
[a2] K. Pearson, "Mathematical contributions to the theory of evolution. III. Regression, heredity and panmixia", Philos. Trans. Royal Soc. London Ser. A, 187 (1896) pp. 253–318
[a3] J.L. Rodgers, W.A. Nicewander, "Thirteen ways to look at the correlation coefficient", The Amer. Statistician, 42 (1988) pp. 59–65
How to Cite This Entry:
Pearson product-moment correlation coefficient. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Pearson_product-moment_correlation_coefficient&oldid=53125
This article was adapted from an original article by R.B. Nelsen (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.