
Multiple-correlation coefficient

From Encyclopedia of Mathematics


A measure of the linear dependence between one random variable and a collection of other random variables. More precisely, if $ ( X _ {1} , \dots, X _ {k} ) $ is a random vector with values in $ \mathbf R ^ {k} $, then the multiple-correlation coefficient between $ X _ {1} $ and $ X _ {2} , \dots, X _ {k} $ is defined as the usual correlation coefficient between $ X _ {1} $ and its best linear approximation in terms of $ X _ {2} , \dots, X _ {k} $, i.e. its linear regression on $ X _ {2} , \dots, X _ {k} $ (which coincides with the conditional expectation $ {\mathsf E} ( X _ {1} \mid X _ {2} , \dots, X _ {k} ) $ when that expectation is linear, for example under joint normality). The multiple-correlation coefficient has the property that if $ {\mathsf E} X _ {1} = \dots = {\mathsf E} X _ {k} = 0 $ and if

$$ X _ {1} ^ {*} = \ \beta _ {2} X _ {2} + \dots + \beta _ {k} X _ {k} $$

is the regression of $ X _ {1} $ relative to $ X _ {2} , \dots, X _ {k} $, then among all linear combinations of $ X _ {2} , \dots, X _ {k} $ the variable $ X _ {1} ^ {*} $ has the largest correlation with $ X _ {1} $. In this sense the multiple-correlation coefficient is a special case of the canonical correlation coefficient (cf. Canonical correlation coefficients). For $ k = 2 $ the multiple-correlation coefficient is the absolute value of the usual correlation coefficient $ \rho _ {12} $ between $ X _ {1} $ and $ X _ {2} $. The multiple-correlation coefficient between $ X _ {1} $ and $ X _ {2} , \dots, X _ {k} $ is denoted by $ \rho _ {1 \cdot ( 2 \dots k ) } $ and is expressed in terms of the entries of the correlation matrix $ R = \| \rho _ {ij} \| $, $ i , j = 1 , \dots, k $, by

$$ \rho _ {1 \cdot ( 2 \dots k ) } ^ {2} = 1 - \frac{| R | }{R _ {11} } , $$

where $ | R | $ is the determinant of $ R $ and $ R _ {11} $ is the cofactor of $ \rho _ {11} = 1 $; here $ 0 \leq \rho _ {1 \cdot ( 2 \dots k) } \leq 1 $. If $ \rho _ {1 \cdot ( 2 \dots k ) } = 1 $, then, with probability $ 1 $, $ X _ {1} $ is equal to a linear combination of $ X _ {2} , \dots, X _ {k} $, that is, the joint distribution of $ X _ {1} , \dots, X _ {k} $ is concentrated on a hyperplane in $ \mathbf R ^ {k} $. On the other hand, $ \rho _ {1 \cdot ( 2 \dots k ) } = 0 $ if and only if $ \rho _ {12} = \dots = \rho _ {1k} = 0 $, that is, if $ X _ {1} $ is not correlated with any of $ X _ {2} , \dots, X _ {k} $. To calculate the multiple-correlation coefficient one can use the formula

$$ \rho _ {1 \cdot ( 2 \dots k ) } ^ {2} = 1 - \frac{\sigma _ {1 \cdot ( 2 \dots k ) } ^ {2} }{\sigma _ {1} ^ {2} } , $$

where $ \sigma _ {1} ^ {2} $ is the variance of $ X _ {1} $ and

$$ \sigma _ {1 \cdot ( 2 \dots k ) } ^ {2} = {\mathsf E} [ X _ {1} - ( \beta _ {2} X _ {2} + \dots + \beta _ {k} X _ {k} ) ] ^ {2} $$

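is the residual variance of $ X _ {1} $ about the regression, that is, the mean squared error of the best linear approximation of $ X _ {1} $ (recall that the means are assumed to be zero).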
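As an illustration, the determinant formula $ \rho _ {1 \cdot ( 2 \dots k ) } ^ {2} = 1 - | R | / R _ {11} $ can be evaluated numerically. The following is a minimal sketch in plain Python, not a reference implementation: the function names `det` and `multiple_corr_sq` and the example matrix are hypothetical choices made here, and the code assumes that $ R _ {11} \neq 0 $.

```python
def det(m):
    """Determinant of a square matrix (list of lists) by Gaussian
    elimination with partial pivoting."""
    a = [row[:] for row in m]          # work on a copy
    n = len(a)
    d = 1.0
    for c in range(n):
        # choose the row with the largest pivot in column c
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0                 # singular matrix
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d                     # a row swap flips the sign
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d


def multiple_corr_sq(R):
    """Squared multiple-correlation coefficient 1 - |R| / R_11.

    R is the k x k correlation matrix with X_1 in the first row and column.
    R_11 is the cofactor of the (1,1) entry, i.e. the determinant of R with
    its first row and column deleted (the cofactor sign is +1).
    Assumes R_11 != 0.
    """
    R11 = det([row[1:] for row in R[1:]])
    return 1.0 - det(R) / R11


# Illustrative (made-up) correlation matrix for k = 3.
R = [[1.0, 0.6, 0.5],
     [0.6, 1.0, 0.3],
     [0.5, 0.3, 1.0]]
rho_sq = multiple_corr_sq(R)

# For k = 2 the result reduces to the squared ordinary correlation.
rho_sq_k2 = multiple_corr_sq([[1.0, -0.7], [-0.7, 1.0]])
```

For $ k = 3 $ the same quantity has the closed form $ ( \rho _ {12} ^ {2} + \rho _ {13} ^ {2} - 2 \rho _ {12} \rho _ {13} \rho _ {23} ) / ( 1 - \rho _ {23} ^ {2} ) $, which gives a convenient hand check of the numerical value.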

The sample analogue of the multiple-correlation coefficient $ \rho _ {1 \cdot ( 2 \dots k ) } $ is

$$ r _ {1 \cdot ( 2 \dots k ) } = \ \sqrt {1 - \frac{s _ {1 \cdot ( 2 \dots k ) } ^ {2} }{s _ {1} ^ {2} } } , $$

where $ s _ {1 \cdot ( 2 \dots k ) } ^ {2} $ and $ s _ {1} ^ {2} $ are estimators of $ \sigma _ {1 \cdot ( 2 \dots k ) } ^ {2} $ and $ \sigma _ {1} ^ {2} $ based on a sample of size $ n $. To test the hypothesis of no linear relationship, $ \rho _ {1 \cdot ( 2 \dots k ) } = 0 $, the sampling distribution of $ r _ {1 \cdot ( 2 \dots k) } $ is used. If the sample is drawn from a multivariate normal distribution, then $ r _ {1 \cdot ( 2 \dots k ) } ^ {2} $ has the beta-distribution with parameters $ ( ( k - 1 ) / 2 , ( n - k ) / 2 ) $ if $ \rho _ {1 \cdot ( 2 \dots k ) } = 0 $; if $ \rho _ {1 \cdot ( 2 \dots k ) } \neq 0 $, then the distribution of $ r _ {1 \cdot ( 2 \dots k ) } ^ {2} $ is known, but is somewhat complicated.
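The null distribution stated above can be checked roughly by simulation. The sketch below, in plain Python, draws independent standard normal samples for $ k = 3 $ (so that $ \rho _ {1 \cdot ( 23 ) } = 0 $), computes $ r _ {1 \cdot ( 23 ) } ^ {2} $ from the sample correlations via the closed form for three variables, and compares the simulated mean with the mean $ ( k - 1 ) / ( n - 1 ) $ of the beta-distribution with parameters $ ( ( k - 1 ) / 2 , ( n - k ) / 2 ) $. The function names, sample size, number of replications and seed are hypothetical choices made for illustration only, and agreement of the means is a sanity check rather than a test of the full distribution.

```python
import random
from math import sqrt


def sample_corr(x, y):
    """Ordinary sample correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r_squared_k3(x1, x2, x3):
    """Sample squared multiple correlation of x1 on (x2, x3), k = 3,
    using the closed form in terms of pairwise sample correlations."""
    r12 = sample_corr(x1, x2)
    r13 = sample_corr(x1, x3)
    r23 = sample_corr(x2, x3)
    return (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)


def simulate_null_mean(n=20, reps=5000, seed=0):
    """Average r^2 over `reps` samples of size n of three independent
    standard normal variables (so the true coefficient is 0)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        cols = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(3)]
        total += r_squared_k3(*cols)
    return total / reps


n, k = 20, 3
simulated = simulate_null_mean(n=n)
# Mean of Beta(a, b) is a / (a + b); with a = (k-1)/2 and b = (n-k)/2
# this equals (k - 1) / (n - 1).
theoretical = (k - 1) / (n - 1)
```

The positive mean under the null hypothesis shows why $ r _ {1 \cdot ( 2 \dots k ) } ^ {2} $ tends to overstate the dependence in small samples: even with no relationship, its expected value is $ ( k - 1 ) / ( n - 1 ) $ rather than $ 0 $.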

References

[1] H. Cramér, "Mathematical methods of statistics" , Princeton Univ. Press (1946)
[2] M.G. Kendall, A. Stuart, "The advanced theory of statistics" , 2. Inference and relationship , Griffin (1979)

Comments

For the distribution of $ r _ {1 \cdot ( 2 \dots k ) } ^ {2} $ if $ \rho _ {1 \cdot ( 2 \dots k ) } \neq 0 $ see [a2], Chapt. 10.

References

[a1] T.W. Anderson, "An introduction to multivariate statistical analysis" , Wiley (1958)
[a2] M.L. Eaton, "Multivariate statistics: A vector space approach" , Wiley (1983)
[a3] R.J. Muirhead, "Aspects of multivariate statistical theory" , Wiley (1982)
How to Cite This Entry:
Multiple-correlation coefficient. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Multiple-correlation_coefficient&oldid=47929
This article was adapted from an original article by A.V. Prokhorov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.