Spearman coefficient of rank correlation

A measure of the dependence of two random variables $ X $ and $ Y $, based on the rankings of the $ X _ {i} $'s and $ Y _ {i} $'s in independent pairs of observations $ ( X _ {1} , Y _ {1} ) \dots ( X _ {n} , Y _ {n} ) $. If $ R _ {i} $ is the rank of $ Y $ corresponding to that pair $ ( X , Y ) $ for which the rank of $ X $ is equal to $ i $, then the Spearman coefficient of rank correlation is defined by the formula

$$ r _ {s} = \frac{12}{n ( n ^ {2} - 1 ) } \sum _ {i=1} ^ { n } \left ( i - \frac{n + 1}{2} \right ) \left ( R _ {i} - \frac{n + 1}{2} \right ) $$

or, equivalently, by

$$ r _ {s} = 1 - \frac{6 }{n ( n ^ {2} - 1 ) } \sum _ {i=1} ^ { n } d _ {i} ^ {2} , $$

where $ d _ {i} $ is the difference between the ranks of $ X _ {i} $ and $ Y _ {i} $. The value of $ r _ {s} $ lies between $ - 1 $ and $ + 1 $; $ r _ {s} = + 1 $ when the rank sequences completely coincide, i.e. $ i = R _ {i} $, $ i = 1 \dots n $; and $ r _ {s} = - 1 $ when the rank sequences are completely opposite, i.e. $ i = ( n + 1 ) - R _ {i} $, $ i = 1 \dots n $. This coefficient, like any other rank statistic, is applied to test the hypothesis of independence of two variables. If the variables are independent, then $ {\mathsf E} r _ {s} = 0 $ and $ {\mathsf D} r _ {s} = 1 / ( n - 1 ) $. Thus, the amount of deviation of $ r _ {s} $ from zero gives information about the dependence or independence of the variables. To construct the corresponding test one computes the distribution of $ r _ {s} $ for independent variables $ X $ and $ Y $. When $ 4 \leq n \leq 10 $ one can use tables of the exact distribution (see [2], [4]), and when $ n > 10 $ one can take advantage, for example, of the fact that as $ n \rightarrow \infty $ the random variable $ \sqrt{n - 1} \, r _ {s} $ is asymptotically standard normally distributed. In the latter case the hypothesis of independence is rejected if $ | r _ {s} | > u _ {1 - \alpha / 2 } / \sqrt{n - 1} $, where $ u _ {1 - \alpha / 2 } $ is the root of the equation $ \Phi ( u ) = 1 - \alpha / 2 $ and $ \Phi ( u ) $ is the standard normal distribution function.
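
A computational illustration (not part of the original article): the following Python sketch, under the assumption of no tied observations, computes $ r _ {s} $ by both formulas above (they agree identically) and applies the large-sample independence test. The sample data and helper names are hypothetical.

<pre>
import math


def ranks(values):
    # Rank of each value from 1 to n, assuming all values are distinct (no ties).
    order = sorted(range(len(values)), key=lambda k: values[k])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r


def spearman_rs(x, y):
    # r_s via the defining formula and via the d_i formula; the two coincide.
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    # R_i = rank of Y in the pair whose X has rank i.
    R = [0] * (n + 1)
    for j in range(n):
        R[rx[j]] = ry[j]
    c = (n + 1) / 2
    rs_def = 12 / (n * (n ** 2 - 1)) * sum((i - c) * (R[i] - c) for i in range(1, n + 1))
    rs_d = 1 - 6 / (n * (n ** 2 - 1)) * sum((a - b) ** 2 for a, b in zip(rx, ry))
    assert abs(rs_def - rs_d) < 1e-9
    return rs_d


def independence_test(x, y, u_crit=1.96):
    # Reject independence when sqrt(n - 1) * |r_s| > u_{1 - alpha/2};
    # u_crit = 1.96 corresponds to alpha = 0.05 (normal approximation, n > 10).
    n = len(x)
    rs = spearman_rs(x, y)
    return rs, math.sqrt(n - 1) * abs(rs) > u_crit


# Hypothetical sample of n = 12 paired observations without ties.
x = [2.3, 1.1, 5.7, 4.2, 3.9, 6.0, 0.5, 7.1, 2.9, 4.8, 5.1, 1.8]
y = [1.9, 0.7, 4.9, 3.5, 4.1, 5.2, 1.0, 6.3, 2.2, 4.0, 5.5, 1.5]
print(independence_test(x, y))
</pre>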

Under the assumption that $ X $ and $ Y $ have a joint normal distribution with (ordinary) correlation coefficient $ \rho $,

$$ {\mathsf E} r _ {s} \sim \frac{6}{\pi} \arcsin \frac{\rho }{2} $$

as $ n \rightarrow \infty $, and therefore the variable $ 2 \sin ( \pi r _ {s} / 6 ) $ can be used as an estimator for $ \rho $.
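
A simulation sketch (again hypothetical, not from the original article): for pairs drawn from a joint normal distribution with correlation coefficient $ \rho $, the quantity $ 2 \sin ( \pi r _ {s} / 6 ) $ computed from a large sample should approximate $ \rho $ up to sampling error; the parameters below are illustrative assumptions.

<pre>
import math
import random


def spearman_rs(x, y):
    # r_s via the d_i formula, assuming no tied observations.
    n = len(x)

    def ranks(v):
        order = sorted(range(n), key=lambda k: v[k])
        r = [0] * n
        for pos, idx in enumerate(order, start=1):
            r[idx] = pos
        return r

    rx, ry = ranks(x), ranks(y)
    return 1 - 6 * sum((a - b) ** 2 for a, b in zip(rx, ry)) / (n * (n ** 2 - 1))


random.seed(0)
rho, n = 0.6, 2000

# Simulate n pairs from a bivariate normal distribution with correlation rho.
z1 = [random.gauss(0, 1) for _ in range(n)]
z2 = [random.gauss(0, 1) for _ in range(n)]
x = z1
y = [rho * a + math.sqrt(1 - rho ** 2) * b for a, b in zip(z1, z2)]

rs = spearman_rs(x, y)
print(2 * math.sin(math.pi * rs / 6))  # roughly rho = 0.6, up to sampling error
</pre>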

The Spearman coefficient of rank correlation was named in honour of the psychologist C. Spearman (1904), who used it in psychological research in place of the ordinary correlation coefficient. The tests based on the Spearman coefficient of rank correlation and on the Kendall coefficient of rank correlation are asymptotically equivalent (when $ n = 2 $, the corresponding rank statistics coincide).

References

[1] C. Spearman, "The proof and measurement of association between two things" Amer. J. Psychol. , 15 (1904) pp. 72–101
[2] M.G. Kendall, "Rank correlation methods" , Griffin (1962)
[3] B.L. van der Waerden, "Mathematische Statistik" , Springer (1957)
[4] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics" , Libr. math. tables , 46 , Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[a1] J. Hájek, Z. Sidák, "Theory of rank tests" , Acad. Press (1967)
[a2] M. Hollander, D.A. Wolfe, "Nonparametric statistical methods" , Wiley (1973)
This article was adapted from an original article by A.V. Prokhorov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.