Kendall coefficient of rank correlation
Latest revision as of 22:14, 5 June 2020
Kendall $ \tau $
One of the empirical measures of dependence of two random variables $ X $ and $ Y $, based on ranking the elements of the sample $ ( X _ {1} , Y _ {1} ) , \dots , ( X _ {n} , Y _ {n} ) $. Thus, the Kendall coefficient is a rank statistic and is defined by the formula
$$ \tau = \ \frac{2 S ( r _ {1} \dots r _ {n} ) }{n ( n - 1 ) } , $$
where $ r _ {i} $ is the rank of the $ Y $ belonging to the pair $ ( X , Y ) $ for which the rank of $ X $ is equal to $ i $, $ S = 2 N - n ( n - 1 ) / 2 $, and $ N $ is the number of pairs of indices $ ( i , j ) $ for which $ j > i $ and $ r _ {j} > r _ {i} $ simultaneously. The inequality $ - 1 \leq \tau \leq 1 $ always holds. The Kendall coefficient of rank correlation has been extensively used (see [1]) as an empirical measure of dependence.
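The definition above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `kendall_tau` is hypothetical, the sample is assumed to contain no ties, and the pairs are counted by a direct $ O ( n ^ {2} ) $ scan.

```python
def kendall_tau(x, y):
    """Kendall tau for paired samples x, y without ties (illustrative sketch)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length n >= 2")
    # Sort the pairs by X, so that position i corresponds to the X of rank i + 1.
    order = sorted(range(n), key=lambda k: x[k])
    y_by_x_rank = [y[k] for k in order]
    # N = number of index pairs i < j with r_j > r_i. Without ties, comparing
    # the Y values directly is equivalent to comparing their ranks.
    N = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if y_by_x_rank[j] > y_by_x_rank[i]
    )
    S = 2 * N - n * (n - 1) // 2  # n(n - 1) is always even
    return 2 * S / (n * (n - 1))


tau_example = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])  # N = 5, S = 4
```

For the sample in the last line, five of the six pairs are in the same order, so $ N = 5 $, $ S = 4 $ and $ \tau = 8 / 12 = 2 / 3 $.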
The Kendall coefficient of rank correlation is applied for testing hypotheses of independence of random variables. If the hypothesis of independence is true, then $ {\mathsf E} \tau = 0 $ and the variance is $ {\mathsf D} \tau = \frac{2 ( 2 n + 5 ) }{9 n ( n - 1 ) } $. For small samples $ ( 4 \leq n \leq 10 ) $ the hypothesis of independence is tested by means of special tables (see [3]). When $ n > 10 $ the normal approximation for the distribution of $ \tau $ is used: if
$$ | \tau | > u _ {\alpha / 2 } \sqrt { \frac{2 ( 2 n + 5 ) }{9 n ( n - 1 ) } } , $$
then the hypothesis of independence is rejected and the alternative is accepted. Here $ \alpha $ is the significance level, and $ u _ {\alpha / 2 } $ is the upper $ 100 \cdot ( \alpha / 2 ) $-percent point of the standard normal distribution. The Kendall coefficient of rank correlation can also be used for revealing dependence between two qualitative characteristics, provided that the elements of the sample can be ordered with respect to these characteristics. If $ X $, $ Y $ have a joint normal distribution with correlation coefficient $ \rho $, then $ \rho $ is related to the Kendall coefficient of rank correlation by
$$ {\mathsf E} \tau = \frac{2}{\pi} \arcsin \rho . $$
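The large-sample test and the relation for the normal case can be sketched in Python as follows. The function names are hypothetical, the default level $ \alpha = 0.05 $ is an assumption made for the example, and $ u _ {\alpha / 2 } $ is taken as the upper percent point of the standard normal distribution, computed with the standard-library class `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def tau_variance_under_independence(n):
    """Variance of tau under independence: 2(2n + 5) / (9 n (n - 1))."""
    return 2 * (2 * n + 5) / (9 * n * (n - 1))


def reject_independence(tau, n, alpha=0.05):
    """Two-sided test based on the normal approximation (intended for n > 10)."""
    if n <= 10:
        raise ValueError("for n <= 10 use tables of the exact distribution")
    # Upper 100 * (alpha / 2) percent point of the standard normal distribution.
    u = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(tau) > u * math.sqrt(tau_variance_under_independence(n))


def expected_tau_bivariate_normal(rho):
    """E tau = (2 / pi) * arcsin(rho) for a joint normal distribution."""
    return 2 / math.pi * math.asin(rho)
```

For example, with $ n = 20 $ and $ \alpha = 0.05 $ the critical value is approximately $ 0.318 $, so an observed $ \tau = 0.5 $ leads to rejection of independence, while $ \tau = 0.2 $ does not.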
See also Spearman coefficient of rank correlation; Rank test.
References
[1] M.G. Kendall, "Rank correlation methods", Griffin (1970)
[2] B.L. van der Waerden, "Mathematische Statistik", Springer (1957)
[3] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics", Libr. math. tables, 46, Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[4] E.S. Pearson, H.O. Hartley, "Biometrika tables for statisticians", 1, Cambridge Univ. Press (1956)
Kendall coefficient of rank correlation. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Kendall_coefficient_of_rank_correlation&oldid=47486