Mahalanobis distance

The quantity

$$ \rho ( X, Y \mid A) = \{ ( X - Y)^{T} A ( X - Y) \}^{1/2}, $$

where $ X $ and $ Y $ are vectors, $ A $ is a matrix (usually taken positive definite, so that the square root is real), and $ {}^{T} $ denotes transposition. The Mahalanobis distance is used in multi-dimensional statistical analysis, in particular for testing hypotheses and for the classification of observations.
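To make the definition concrete, here is a minimal Python sketch (using NumPy) that evaluates $ \rho ( X, Y \mid A) $ directly from the formula above. The function name and the numerical values are hypothetical, chosen only for illustration.

import numpy as np

def mahalanobis(x, y, a):
    # rho(X, Y | A) = { (X - Y)^T A (X - Y) }^{1/2}
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(d @ np.asarray(a, dtype=float) @ d))

# Illustrative 2-dimensional example with a positive-definite matrix A.
A = [[2.0, 0.3],
     [0.3, 1.0]]
print(mahalanobis([1.0, 0.0], [0.0, 1.0], A))  # sqrt(2.4), approx. 1.549

Note that with $ A = I $ the quantity reduces to the ordinary Euclidean distance between $ X $ and $ Y $.

The Mahalanobis distance was introduced by P. Mahalanobis [1], who used the quantity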

$$ \rho ( \mu_{1} , \mu_{2} \mid \Sigma^{-1} ) $$

as a distance between two normal distributions with expectations $ \mu_{1} $ and $ \mu_{2} $ and common covariance matrix $ \Sigma $.

The Mahalanobis distance between two samples (from distributions with identical covariance matrices), or between a sample and a distribution, is defined by replacing the corresponding theoretical moments by sampling moments. As an estimate of the Mahalanobis distance between two distributions one uses the Mahalanobis distance between the samples extracted from these distributions or, in the case where a linear discriminant function is utilized [5], the statistic $ \Phi^{-1} ( \alpha ) + \Phi^{-1} ( \beta ) $, where $ \alpha $ and $ \beta $ are the frequencies of correct classification in the first and the second collection, respectively, and $ \Phi $ is the normal distribution function with expectation 0 and variance 1.
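Both estimates lend themselves to a short computation. The Python sketch below forms the plug-in estimate by replacing $ \mu_{1} $, $ \mu_{2} $ and $ \Sigma $ by the sample means and the pooled sample covariance, and then evaluates the classification-based statistic $ \Phi^{-1} ( \alpha ) + \Phi^{-1} ( \beta ) $. The function name, the simulated data and the frequencies 0.90 and 0.85 are hypothetical; only the formulas come from the article.

import numpy as np
from scipy.stats import norm

def sample_mahalanobis(sample1, sample2):
    # Plug-in estimate: theoretical moments replaced by sample moments.
    # The pooled covariance reflects the assumed common covariance matrix.
    x = np.asarray(sample1, dtype=float)
    y = np.asarray(sample2, dtype=float)
    n1, n2 = len(x), len(y)
    pooled = ((n1 - 1) * np.cov(x, rowvar=False)
              + (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
    d = x.mean(axis=0) - y.mean(axis=0)
    return float(np.sqrt(d @ np.linalg.inv(pooled) @ d))

# Two simulated samples with a common covariance matrix.
rng = np.random.default_rng(0)
cov = [[1.0, 0.2], [0.2, 1.0]]
a = rng.multivariate_normal([0.0, 0.0], cov, size=200)
b = rng.multivariate_normal([1.0, 1.0], cov, size=200)
print(sample_mahalanobis(a, b))

# Classification-based estimate: Phi^{-1}(alpha) + Phi^{-1}(beta), with
# alpha, beta the observed frequencies of correct classification
# (illustrative values; norm.ppf is the standard normal quantile Phi^{-1}).
alpha, beta = 0.90, 0.85
print(norm.ppf(alpha) + norm.ppf(beta))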

References

[1] P. Mahalanobis, "On tests and measures of group divergence I. Theoretical formulae" J. and Proc. Asiat. Soc. of Bengal , 26 (1930) pp. 541–588
[2] P. Mahalanobis, "On the generalized distance in statistics" Proc. Nat. Inst. Sci. India (Calcutta) , 2 (1936) pp. 49–55
[3] T.W. Anderson, "Introduction to multivariate statistical analysis" , Wiley (1958)
[4] S.A. Aivazyan, Z.I. Bezhaeva, O.V. Staroverov, "Classifying multivariate observations" , Moscow (1974) (In Russian)
[5] A.I. Orlov, "On the comparison of algorithms for classifying by results of observations of actual data" Dokl. Moskov. Obshch. Isp. Prirod. 1985, Otdel. Biol. (1987) pp. 79–82 (In Russian)
How to Cite This Entry:
Mahalanobis distance. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Mahalanobis_distance&oldid=17720
This article was adapted from an original article by A.I. Orlov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098.