Difference between revisions of "Tie"

Latest revision as of 08:25, 6 June 2020

A group of observations in a sample that have the same value. Let $ X _ {1} , \dots, X _ {n} $ be independent random variables subject to the same absolutely-continuous probability law with probability density $ p( x) $. Then, with probability $ 1 $, no two of the observations $ X _ {1} , \dots, X _ {n} $ are equal, that is, $ X _ {i} \neq X _ {j} $ if $ i \neq j $, and so every member $ X _ {(i)} $ of the order statistics (cf. Order statistic)

$$ \tag{*} X _ {(1)} < \dots < X _ {(n)} $$

constructed from the sample $ X _ {1} , \dots, X _ {n} $ will be strictly greater than its predecessor $ X _ {(i-1)} $.

However, in practice, because of rounding-off errors in the calculation of $ X _ {1} , \dots, X _ {n} $, several groups of observations can arise, in each of which the observations are all equal. Every such group of coincident observations is called a tie. Thus, instead of (*), the experimenter may observe the order statistics

$$ X _ {(1)} = \dots = X _ {( \tau _ {1} )} < X _ {( \tau _ {1} + 1)} = \dots = X _ {( \tau _ {1} + \tau _ {2} )} < \dots $$

$$ \dots < X _ {( \tau _ {1} + \dots + \tau _ {k-1} + 1 )} = \dots = X _ {( \tau _ {1} + \dots + \tau _ {k} )} , $$

where all $ \tau _ {i} \geq 1 $ and $ \tau _ {1} + \dots + \tau _ {k} = n $. Thus, when ties occur, that is, when some $ \tau _ {j} \geq 2 $, difficulties arise in defining the rank vector, which plays a basic role in the construction of rank statistics (cf. Rank statistic). As yet (1992) there are no precise recommendations for defining the ranks of coincident observations. There are two common approaches to the solution of this problem. The first consists of randomization. According to this approach, the ranks of the elements

$$ X _ {( \tau _ {1} + \dots + \tau _ {j-1} + 1 )} = \dots = X _ {( \tau _ {1} + \dots + \tau _ {j} )} $$

making up the $ j $-th group are taken to be some permutation of the numbers

$$ \tau _ {1} + \dots + \tau _ {j-1} + 1 , \ \tau _ {1} + \dots + \tau _ {j-1} + 2 , \dots $$

$$ \dots , \ \tau _ {1} + \dots + \tau _ {j} , $$

each permutation having probability $ 1/ \tau _ {j} ! $. The merit of this approach lies in its simplicity, but for certain alternatives with respect to the distribution of the $ X _ {i} $, the actual randomization chosen has an effect on the results of the statistical analysis.
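The randomization approach can be sketched as follows. This is only an illustration under stated assumptions: the function names `tie_sizes` and `randomized_ranks`, the sample values and the use of Python's standard `random` module are not part of the original article.

```python
import random
from itertools import groupby


def tie_sizes(sample):
    """Return the sizes tau_1, ..., tau_k of the groups of equal values,
    listed in increasing order of the values."""
    return [len(list(group)) for _, group in groupby(sorted(sample))]


def randomized_ranks(sample, rng=None):
    """Assign the ranks 1..n, breaking every tie by a random permutation."""
    if rng is None:
        rng = random.Random()
    # Indices of the observations sorted by value; equal values are adjacent.
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    ranks = [0] * len(sample)
    start = 0  # tau_1 + ... + tau_{j-1}
    for _, group in groupby(order, key=lambda i: sample[i]):
        members = list(group)
        # Ranks available to the j-th group: start + 1, ..., start + tau_j.
        block = list(range(start + 1, start + len(members) + 1))
        rng.shuffle(block)  # each of the tau_j! orderings is equally likely
        for i, r in zip(members, block):
            ranks[i] = r
        start += len(members)
    return ranks


sample = [2.1, 3.4, 2.1, 5.0, 3.4, 3.4]  # hypothetical rounded data
print(tie_sizes(sample))  # [2, 3, 1]
print(randomized_ranks(sample, random.Random(0)))
```

Repeated calls with different seeds may give different rank vectors for the same data, which is exactly how the chosen randomization can influence the subsequent analysis.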

In the second approach, all tied observations

$$ X _ {( \tau _ {1} + \dots + \tau _ {j-1} + 1 )} = \dots = X _ {( \tau _ {1} + \dots + \tau _ {j} )} $$

making up the $ j $-th group are assigned the same, so-called midrank

$$ r _ {j} = \tau _ {1} + \dots + \tau _ {j-1} + \frac{\tau _ {j} + 1 }{2} , $$

equal to the arithmetic mean of the numbers

$$ \tau _ {1} + \dots + \tau _ {j-1} + 1 , \ \tau _ {1} + \dots + \tau _ {j-1} + 2 , \dots $$

$$ \dots , \ \tau _ {1} + \dots + \tau _ {j} . $$
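A minimal sketch of the midrank assignment, assuming the sample is given as a plain Python list; the function name `midranks` and the data are illustrative only and do not come from the original article.

```python
from itertools import groupby


def midranks(sample):
    """Give every observation of a tie the mean of the ranks the tie occupies."""
    # Indices of the observations sorted by value; equal values are adjacent.
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    ranks = [0.0] * len(sample)
    start = 0  # tau_1 + ... + tau_{j-1}
    for _, group in groupby(order, key=lambda i: sample[i]):
        members = list(group)
        tau = len(members)
        # Arithmetic mean of start + 1, ..., start + tau.
        mid = start + (tau + 1) / 2
        for i in members:
            ranks[i] = mid
        start += tau
    return ranks


sample = [2.1, 3.4, 2.1, 5.0, 3.4, 3.4]  # hypothetical rounded data
print(midranks(sample))  # [1.5, 4.0, 1.5, 6.0, 4.0, 4.0]
```

The midranks still sum to $ n( n+ 1)/2 $ (here $ 21 $), as the ordinary ranks would. In SciPy, `scipy.stats.rankdata` with `method='average'` computes the same quantity.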

Naturally, such a procedure also affects the properties of rank statistics, and this must be taken into account in practice. For example, the second approach is recommended in the construction of the statistic $ W $ of the Wilcoxon test when there are ties. Then the expectation $ {\mathsf E} W $ of $ W $ remains the same as in the case when there are no ties, but its variance $ {\mathsf D} W $ decreases to

$$ {\mathsf D} W = \frac{mn( m+ n+ 1) }{12 } \left \{ 1 - \frac{1 }{( m+ n)[( m+ n) ^ {2} - 1 ] } \sum _ {j=1} ^ {k} \tau _ {j} ( \tau _ {j} ^ {2} - 1 ) \right \} , $$

where $ m $ and $ n $ are the sizes of the two samples and the $ \tau _ {j} $ are the sizes of the ties in the pooled sample of $ m+ n $ observations; this must be taken into account when normalizing $ W $.
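The tie correction can be checked numerically on a small example. The sketch below assumes that $ W $ is the sum of the midranks of the sample of size $ m $, that under the null hypothesis every choice of $ m $ positions in the pooled sample is equally likely, and that the leading factor is the no-ties variance $ mn( m+ n+ 1)/12 $. The function names and the data are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def wilcoxon_variance_with_ties(m, n, taus):
    """Variance of the rank sum W computed from midranks, with tie sizes taus."""
    N = m + n
    if sum(taus) != N:
        raise ValueError("tie sizes must sum to m + n")
    correction = Fraction(sum(t * (t * t - 1) for t in taus), N * (N * N - 1))
    return Fraction(m * n * (N + 1), 12) * (1 - correction)


def exact_moments(pooled_midranks, m):
    """Mean and variance of W by enumerating all equally likely choices
    of the m pooled positions that belong to the first sample."""
    sums = [sum(c) for c in combinations(pooled_midranks, m)]
    mean = Fraction(sum(sums), len(sums))
    var = sum((s - mean) ** 2 for s in sums) / len(sums)
    return mean, var


# Pooled sample of 6 observations with tie sizes 2, 3, 1,
# hence midranks 3/2, 3/2, 4, 4, 4, 6.
pooled = [Fraction(3, 2), Fraction(3, 2), Fraction(4), Fraction(4),
          Fraction(4), Fraction(6)]
m, n = 2, 4

print(wilcoxon_variance_with_ties(m, n, [2, 3, 1]))  # 4
print(wilcoxon_variance_with_ties(m, n, [1] * 6))    # 14/3, the no-ties value
mean, var = exact_moments(pooled, m)
print(mean, var)                                     # 7 4
```

In this example the enumeration reproduces both the unchanged expectation $ m( m+ n+ 1)/2 = 7 $ and the reduced variance $ 4 < 14/3 $.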

References

[1] J. Hájek, "Theory of rank tests", Academia (1967)
[2] L.N. Bol'shev, N.V. Smirnov, "Tables of mathematical statistics", Libr. math. tables, 46, Nauka (1983) (In Russian) (Processed by L.S. Bark and E.S. Kedrova)

How to Cite This Entry:
Tie. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Tie&oldid=15632

This article was adapted from an original article by M.S. Nikulin (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.
