Kendall coefficient of rank correlation

From Encyclopedia of Mathematics
Latest revision as of 22:14, 5 June 2020


Kendall $\tau$

One of the empirical measures of dependence of two random variables $X$ and $Y$, based on ranking the elements of the sample $(X_1, Y_1), \dots, (X_n, Y_n)$. Thus, the Kendall coefficient is a rank statistic and is defined by the formula

$$ \tau = \frac{2 S(r_1, \dots, r_n)}{n(n - 1)}, $$

where $r_i$ is the rank of $Y$ belonging to the pair $(X, Y)$ for which the rank of $X$ is equal to $i$, $S = 2N - n(n - 1)/2$, and $N$ is the number of pairs of sample elements for which $j > i$ and $r_j > r_i$ simultaneously. The inequality $-1 \leq \tau \leq 1$ always holds. The Kendall coefficient of rank correlation has been extensively used (see [1]) as an empirical measure of dependence.
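The definition above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `kendall_tau` is hypothetical, the sample is assumed to contain no ties in either variable, and the pairs are counted directly in $O(n^2)$ time.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau via tau = 2S / (n(n-1)); assumes no ties in x or y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    # Arrange the pairs by increasing X. The Y-values in this order are
    # ordered among themselves exactly as the ranks r_1, ..., r_n.
    order = sorted(range(n), key=lambda k: x[k])
    y_by_x = [y[k] for k in order]

    # N: number of pairs (i, j) with j > i and r_j > r_i.
    big_n = sum(1 for i, j in combinations(range(n), 2) if y_by_x[j] > y_by_x[i])

    # S = 2N - n(n-1)/2, then tau = 2S / (n(n-1)).
    s = 2 * big_n - n * (n - 1) // 2
    return 2 * s / (n * (n - 1))


print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))
```

For this four-element sample, five of the six pairs are in the same order in both variables, so $N = 5$, $S = 4$ and $\tau = 2/3$.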

The Kendall coefficient of rank correlation is applied for testing the hypothesis of independence of random variables. If the hypothesis of independence is true, then ${\mathsf E} \tau = 0$ and ${\mathsf D} \tau = 2(2n + 5)/(9n(n - 1))$. For small samples $(4 \leq n \leq 10)$, the hypothesis of independence is tested by means of special tables (see [3]). When $n > 10$, the normal approximation for the distribution of $\tau$ is used: if

$$ |\tau| > u_{\alpha/2} \sqrt{\frac{2(2n + 5)}{9n(n - 1)}}, $$

then the hypothesis of independence is rejected and the alternative is accepted. Here $\alpha$ is the significance level, and $u_{\alpha/2}$ is the $100 \cdot (\alpha/2)$-percent point of the normal distribution. The Kendall coefficient of rank correlation can be used for revealing dependence of two qualitative characteristics, provided that the elements of the sample can be ordered with respect to these characteristics. If $X$, $Y$ have a joint normal distribution with correlation coefficient $\rho$, then $\rho$ is related to the Kendall coefficient of rank correlation by

$$ {\mathsf E} \tau = \frac{2}{\pi} \arcsin \rho. $$
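The normal-approximation test for $n > 10$ and the relation for the normal case can be sketched as follows. The function names are hypothetical, and $u_{\alpha/2}$ is assumed here to be the upper percent point, that is, the quantile of the standard normal distribution at level $1 - \alpha/2$.

```python
import math
from statistics import NormalDist


def tau_null_variance(n):
    """Variance of tau under independence: 2(2n + 5) / (9n(n - 1))."""
    return 2 * (2 * n + 5) / (9 * n * (n - 1))


def reject_independence(tau, n, alpha=0.05):
    """Two-sided test based on the normal approximation (intended for n > 10)."""
    # Assumption: u_{alpha/2} is the upper alpha/2 point of the standard
    # normal distribution, i.e. its quantile at level 1 - alpha/2.
    u = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(tau) > u * math.sqrt(tau_null_variance(n))


def expected_tau_normal(rho):
    """E tau = (2 / pi) * arcsin(rho) for a joint normal distribution."""
    return 2 / math.pi * math.asin(rho)


print(reject_independence(0.5, 20), reject_independence(0.2, 20))
print(expected_tau_normal(0.5))
```

With $n = 20$ and $\alpha = 0.05$ the critical value is roughly $0.32$, so $\tau = 0.5$ leads to rejection while $\tau = 0.2$ does not.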

See also Spearman coefficient of rank correlation; Rank test.

References

[1] M.G. Kendall, "Rank correlation methods", Griffin (1970)
[2] B.L. van der Waerden, "Mathematische Statistik", Springer (1957)
[3] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics", Libr. math. tables, 46, Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)
[4] E.S. Pearson, H.O. Hartley, "Biometrika tables for statisticians", 1, Cambridge Univ. Press (1956)
How to Cite This Entry:
Kendall coefficient of rank correlation. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Kendall_coefficient_of_rank_correlation&oldid=13189
This article was adapted from an original article by A.V. Prokhorov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.