M-estimator
A generalization of the maximum-likelihood estimator (MLE) in mathematical statistics (cf. also Maximum-likelihood method; Statistical estimator). Suppose one has univariate observations $x_1,\dots,x_n$ which are independent and identically distributed according to a distribution $F_\theta$ with univariate parameter $\theta$. Denote by $f_\theta(x)$ the likelihood of $F_\theta$. The maximum-likelihood estimator is defined as the value $T_n = T_n(x_1,\dots,x_n)$ which maximizes $\prod_{i=1}^n f_{T_n}(x_i)$. If $f_\theta(x) > 0$ for all $x$ and $\theta$, then this is equivalent to minimizing $\sum_{i=1}^n [-\ln f_{T_n}(x_i)]$. P.J. Huber [a1] has generalized this to M-estimators, which are defined by minimizing $\sum_{i=1}^n \rho(x_i,T_n)$, where $\rho$ is an arbitrary real function. When $\rho$ has a partial derivative $\Psi(x,\theta) = (\partial/\partial\theta)\rho(x,\theta)$, then $T_n$ satisfies the implicit equation

$$\sum_{i=1}^n \Psi(x_i,T_n) = 0.$$

Note that the maximum-likelihood estimator is an M-estimator, obtained by putting $\rho(x,\theta) = -\ln f_\theta(x)$.
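As a hedged illustration of the implicit equation, the following Python sketch solves $\sum_i \Psi(x_i,t) = 0$ for $t$ by bisection. The function name `m_estimate`, the bracketing interval and the sample data are assumptions made for this example, and bisection is only one of several possible root-finders; it presumes that the sum is nondecreasing in $t$ and changes sign on the interval. With the Gaussian choice $\rho(x,\theta) = (x-\theta)^2/2$, so that $\Psi(x,\theta) = \theta - x$, the solution is the sample mean.

```python
from typing import Callable, Sequence


def m_estimate(xs: Sequence[float],
               psi: Callable[[float, float], float],
               lo: float, hi: float, tol: float = 1e-10) -> float:
    """Solve sum_i psi(x_i, t) = 0 for t by bisection on [lo, hi].

    Assumes t -> sum_i psi(x_i, t) is nondecreasing and changes sign
    on [lo, hi] (true for convex rho, e.g. the Gaussian case below).
    """
    def total(t: float) -> float:
        return sum(psi(x, t) for x in xs)

    for _ in range(200):  # hard cap on the number of halvings
        mid = 0.5 * (lo + hi)
        if total(mid) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical sample with one gross outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

# Gaussian MLE: rho = (x - t)^2 / 2, hence psi = d rho / d t = t - x.
t_mle = m_estimate(data, lambda x, t: t - x, min(data), max(data))
print(t_mle)  # the sample mean, pulled far to the right by the outlier
```

Any other $\Psi$ with the same monotonicity can be passed in place of the lambda without changing the solver.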
The maximum-likelihood estimator can give arbitrarily bad results when the underlying assumptions (e.g., the form of the distribution generating the data) are not satisfied (e.g., because the data contain some outliers, cf. also Outlier). M-estimators are particularly useful in robust statistics, which aims to construct methods that are relatively insensitive to deviations from the standard assumptions. M-estimators with bounded $\Psi$ are typically robust.
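Apart from the finite-sample version $T_n(x_1,\dots,x_n)$ of the M-estimator, there is also a functional version $T(G)$ defined for any probability distribution $G$ by

$$\int \Psi(x,T(G))\,dG(x) = 0.$$

Here, it is assumed that $T$ is Fisher-consistent, i.e. that $T(F_\theta) = \theta$ for all $\theta$. The influence function of a functional $T$ in $G$ is defined, as in [a2], by

$$\operatorname{IF}(x;T,G) = \frac{\partial}{\partial\varepsilon}\bigl[T((1-\varepsilon)G + \varepsilon\Delta_x)\bigr]_{\varepsilon=0+},$$

where $\Delta_x$ is the probability distribution which puts all its mass in the point $x$. Therefore $\operatorname{IF}(x;T,G)$ describes the effect of a single outlier in $x$ on the estimator $T$. For an M-estimator $T$ at $F_\theta$,

$$\operatorname{IF}(x;T,F_\theta) = \frac{\Psi(x,\theta)}{-\int \frac{\partial}{\partial\theta}\Psi(y,\theta)\,dF_\theta(y)}.$$

The influence function of an M-estimator is thus proportional to $\Psi(x,\theta)$ itself. Under suitable conditions, [a3], M-estimators are asymptotically normal with asymptotic variance $V(T,F_\theta) = \int \operatorname{IF}(x;T,F_\theta)^2\,dF_\theta(x)$.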
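A finite-sample analogue of the influence function is the sensitivity curve, $n\,[T_n(x_1,\dots,x_{n-1},x) - T_{n-1}(x_1,\dots,x_{n-1})]$, which measures how far one added observation $x$ moves the estimate. The sketch below is not taken from the article; it assumes a small hypothetical sample and compares the mean (unbounded $\Psi$) with the median (bounded $\Psi$) using only Python's standard library.

```python
from statistics import mean, median
from typing import Callable, Sequence


def sensitivity_curve(estimator: Callable[[Sequence[float]], float],
                      sample: Sequence[float], x: float) -> float:
    """n * (T_n(sample + [x]) - T_{n-1}(sample)), with n = len(sample) + 1."""
    n = len(sample) + 1
    return n * (estimator(list(sample) + [x]) - estimator(sample))


# Hypothetical symmetric sample centred at zero.
sample = [-2.0, -1.0, 0.0, 1.0, 2.0]

for x in (10.0, 1000.0):
    print(x,
          sensitivity_curve(mean, sample, x),    # grows linearly with x
          sensitivity_curve(median, sample, x))  # same value for both x
```

The curve of the mean is unbounded in $x$, while that of the median levels off, which mirrors the proportionality between the influence function and $\Psi$.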
Optimal robust M-estimators can be obtained by solving Huber's minimax variance problem [a1] or by minimizing the asymptotic variance $V(T,F_\theta)$ subject to an upper bound on the gross-error sensitivity $\gamma^* = \sup_x |\operatorname{IF}(x;T,F_\theta)|$ as in [a2].
When estimating a univariate location, it is natural to use $\Psi$-functions of the type $\Psi(x,\theta) = \psi(x-\theta)$. The optimal robust M-estimator for univariate location at the Gaussian location model $F_\theta(x) = \Phi(x-\theta)$ (cf. also Gauss law) is given by $\psi_b(x) = \min(b,\max(-b,x))$. This $\psi_b$ has come to be known as Huber's function. Note that when $b \downarrow 0$, this M-estimator tends to the median (cf. also Median (in statistics)), and when $b \uparrow \infty$ it tends to the mean (cf. also Average).
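The two limiting cases can be checked numerically. The following sketch, with unit scale assumed and hypothetical names and data, clips $x - t$ to $[-b,b]$ and solves $\sum_i \psi_b(x_i - t) = 0$ by bisection; it is one possible way to compute the estimate, not the article's prescribed algorithm.

```python
from typing import Sequence


def huber_psi(u: float, b: float) -> float:
    """Huber's function: u clipped to the interval [-b, b]."""
    return min(b, max(-b, u))


def huber_location(xs: Sequence[float], b: float, tol: float = 1e-12) -> float:
    """Solve sum_i psi_b(x_i - t) = 0 for t by bisection (unit scale assumed)."""
    lo, hi = min(xs), max(xs)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # The sum is nonincreasing in t: positive means t is still too small.
        if sum(huber_psi(x - mid, b) for x in xs) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


data = [1.0, 2.0, 3.0, 4.0, 100.0]          # hypothetical data: median 3, mean 22
near_median = huber_location(data, b=1e-3)  # small b: close to the median
near_mean = huber_location(data, b=1e3)     # large b: close to the mean
print(near_median, near_mean)
```

Intermediate values of $b$ give estimates between these two extremes.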
The breakdown value $\varepsilon^*(T)$ of an estimator $T$ is the largest fraction of arbitrary outliers it can tolerate without becoming unbounded (see [a2]). Any M-estimator with a monotone and bounded $\psi$ function has breakdown value $\varepsilon^*(T) = 1/2$, the highest possible value.
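Location M-estimators are not invariant with respect to scale. Therefore it is recommended to compute $T_n$ from

$$\sum_{i=1}^n \psi\left(\frac{x_i - T_n}{S_n}\right) = 0, \tag{a1}$$

where $S_n$ is a robust estimator of scale, e.g. the median absolute deviation

$$S_n = \operatorname{MAD}(x_1,\dots,x_n) = 1.483\,\operatorname{med}_i\,\bigl|x_i - \operatorname{med}_j\,x_j\bigr|,$$

which has $\varepsilon^*(\operatorname{MAD}) = 1/2$.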
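A minimal sketch of the scale-standardized equation $\sum_i \psi((x_i - T_n)/S_n) = 0$ follows, assuming Huber's $\psi_b$ with the tuning constant $b = 1.5$ (a choice made here for illustration) and the MAD as $S_n$. The equation is solved by iteratively reweighted averaging, a common device that is swapped in for this example rather than taken from the article; all function names are hypothetical.

```python
from statistics import median
from typing import Sequence


def mad(xs: Sequence[float]) -> float:
    """Median absolute deviation, scaled by 1.483."""
    m = median(xs)
    return 1.483 * median([abs(x - m) for x in xs])


def huber_location_scaled(xs: Sequence[float], b: float = 1.5,
                          max_iter: int = 100, tol: float = 1e-12) -> float:
    """Solve sum_i psi_b((x_i - t) / S) = 0 with S = MAD, by reweighting."""
    s = mad(xs)
    t = median(xs)  # robust starting value
    if s == 0.0:
        return t    # degenerate sample: fall back to the median
    for _ in range(max_iter):
        # Weight w_i = psi_b(u_i) / u_i, equal to 1 in the unclipped zone.
        weights = []
        for x in xs:
            u = abs(x - t) / s
            weights.append(1.0 if u <= b else b / u)
        t_new = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        converged = abs(t_new - t) < tol * s
        t = t_new
        if converged:
            break
    return t


data = [1.0, 2.0, 3.0, 4.0, 100.0]  # hypothetical data with one outlier
t_hat = huber_location_scaled(data)
# Rescaling the data rescales the estimate by the same factor.
t_hat_scaled = huber_location_scaled([10.0 * x for x in data])
print(t_hat, t_hat_scaled)
```

Because the residuals are divided by $S_n$, multiplying all observations by a constant multiplies the estimate by that constant, which is the invariance the unstandardized version lacks.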
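For univariate scale estimation one uses $\Psi$-functions of the type $\Psi(x,\sigma) = \chi(x/\sigma)$. At the Gaussian scale model $F_\sigma(x) = \Phi(x/\sigma)$, the optimal robust M-estimators are given by $\tilde\chi(x) = \min(b,\max(-b,\,x^2 - 1 - a))$, where the constant $a$ is chosen to make the estimator Fisher-consistent. For $b \downarrow 0$ one obtains the median absolute deviation and for $b \uparrow \infty$ the standard deviation. In the general case, where both location and scale are unknown, one first computes $S_n = \operatorname{MAD}(x_1,\dots,x_n)$ and then plugs it into (a1) for finding $T_n$.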
For multivariate location and scatter matrices, M-estimators were defined by R.A. Maronna [a4], who also gave their influence function and asymptotic covariance matrix. For $p$-dimensional data, the breakdown value of M-estimators is at most $1/(p+1)$.
For regression analysis, one considers the linear model $y = x^t\theta + e$, where $x$ and $\theta$ are column vectors, and $x$ and the error term $e$ are independent. Let $e$ have a distribution with location zero and scale $\sigma$. For simplicity, put $\sigma = 1$. Denote by $H$ the joint distribution of $(x,y)$, which implies the distribution of the error term $e = y - x^t\theta$. Based on a data set $\{(x_1,y_1),\dots,(x_n,y_n)\}$, M-estimators $T_n$ for regression [a3] are defined by

$$\sum_{i=1}^n \psi(y_i - x_i^tT_n)\,x_i = 0,$$

where $r_i = y_i - x_i^tT_n$ are the residuals. If the Huber function $\psi_b$ is used, the influence function of $T$ at $H$ equals

$$\operatorname{IF}((x_0,y_0);T,H) = \frac{\psi_b(e_0)}{E_H[\psi_b'(e)]}\,\bigl(E_H[xx^t]\bigr)^{-1}x_0, \tag{a2}$$

where $e_0 = y_0 - x_0^t\theta$. The first factor of (a2) is the influence of the vertical error $e_0$. It is bounded, which makes this estimator more robust than least squares (cf. also Least squares, method of). The second factor is the influence of the position $x_0$. Unfortunately, this factor is unbounded, hence a single outlying $x_i$ (i.e., a horizontal outlier) will almost completely determine the fit, as shown in [a2]. Therefore the breakdown value $\varepsilon^*(T) = 0$.
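The contrast between vertical and horizontal outliers can be seen in a small experiment. The sketch below fits a straight line (intercept and slope) by solving $\sum_i \psi_b(r_i)\,x_i = 0$ with iteratively reweighted least squares; the scale is fixed at $\sigma = 1$ as in the text, while the tuning constant $b = 1.5$, the IRLS scheme, the function names and the data are all assumptions made for this illustration.

```python
from typing import Sequence, Tuple


def weighted_line_fit(xs: Sequence[float], ys: Sequence[float],
                      ws: Sequence[float]) -> Tuple[float, float]:
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx * swx)
    return (swy - slope * swx) / sw, slope


def huber_line_fit(xs: Sequence[float], ys: Sequence[float],
                   b: float = 1.5, n_iter: int = 200) -> Tuple[float, float]:
    """Huber regression M-estimate via IRLS, with the error scale fixed at 1."""
    intercept, slope = weighted_line_fit(xs, ys, [1.0] * len(xs))  # LS start
    for _ in range(n_iter):
        res = [y - intercept - slope * x for x, y in zip(xs, ys)]
        # Weight psi_b(r) / r: 1 for small residuals, b / |r| for large ones.
        ws = [1.0 if abs(r) <= b else b / abs(r) for r in res]
        intercept, slope = weighted_line_fit(xs, ys, ws)
    return intercept, slope


# Hypothetical data lying exactly on the line y = 1 + 2x.
xs = [float(i) for i in range(10)]
ys = [1.0 + 2.0 * x for x in xs]

# One vertical outlier: the response at x = 5 is shifted upwards by 50.
ys_vertical = list(ys)
ys_vertical[5] += 50.0
_, ls_slope = weighted_line_fit(xs, ys_vertical, [1.0] * len(xs))
_, huber_slope = huber_line_fit(xs, ys_vertical)

# One horizontal outlier (leverage point) at x = 100 with response 0.
_, leverage_slope = huber_line_fit(xs + [100.0], ys + [0.0])

print(ls_slope, huber_slope, leverage_slope)
```

In this example the Huber fit stays close to the true slope under the vertical outlier, whereas the single leverage point drags the slope far away from it, in line with the unbounded second factor of the influence function.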
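To obtain a bounded influence function, generalized M-estimators [a2] are defined by

$$\sum_{i=1}^n \eta(x_i,\,y_i - x_i^tT_n)\,x_i = 0$$

for some real function $\eta$. The influence function of $T$ at $H$ now becomes

$$\operatorname{IF}((x_0,y_0);T,H) = \eta(x_0,e_0)\,M^{-1}x_0, \tag{a3}$$

where $e_0 = y_0 - x_0^t\theta$ and $M = \int \frac{\partial}{\partial e}\eta(x,e)\,xx^t\,dH(x,y)$. For an appropriate choice of the function $\eta$, the influence function (a3) is bounded, but still the breakdown value $\varepsilon^*(T)$ goes down to zero when the number of parameters $p$ increases.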
To repair this, P.J. Rousseeuw and V.J. Yohai [a5] have introduced S-estimators. An S-estimator $T_n$ minimizes $s(r_1,\dots,r_n)$, where $r_i = y_i - x_i^tT_n$ are the residuals and $s(r_1,\dots,r_n)$ is the robust scale estimator defined as the solution of

$$\frac{1}{n}\sum_{i=1}^n \rho\left(\frac{r_i}{s}\right) = K,$$

where $K$ is taken to be $\int \rho(u)\,d\Phi(u)$. The function $\rho$ must satisfy $\rho(-u) = \rho(u)$ and $\rho(0) = 0$ and be continuously differentiable, and there must be a constant $c > 0$ such that $\rho$ is strictly increasing on $[0,c]$ and constant on $[c,\infty)$. Any S-estimator has breakdown value $\varepsilon^*(T) = 1/2$ in all dimensions, and it is asymptotically normal with the same asymptotic covariance as the M-estimator with that function $\rho$. The S-estimators have also been generalized to multivariate location and scatter matrices, in [a6], and they enjoy the same properties.
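The scale equation $\frac{1}{n}\sum_i \rho(r_i/s) = K$ can be sketched as follows. The example assumes Tukey's biweight for $\rho$, which satisfies the stated conditions, with the tuning constant $c = 1.547$ that is commonly quoted for breakdown value $1/2$; neither choice is fixed by the article. The constant $K = \int \rho\,d\Phi$ is obtained by numerical integration, and the scale is found by bisection. Only the scale for a given set of residuals is computed; the minimization over $T_n$ that defines the S-estimator itself is not attempted, and all names are hypothetical.

```python
import math
from typing import Optional, Sequence

C = 1.547  # assumed biweight tuning constant


def rho_biweight(u: float, c: float = C) -> float:
    """Tukey's biweight rho: even, rho(0) = 0, constant (c^2 / 6) beyond c."""
    if abs(u) >= c:
        return c * c / 6.0
    t = 1.0 - (u / c) ** 2
    return (c * c / 6.0) * (1.0 - t ** 3)


def gaussian_expectation(c: float = C, steps: int = 2000) -> float:
    """K = integral of rho d Phi: Simpson's rule on [-c, c] plus the flat tails."""
    def f(u: float) -> float:
        return rho_biweight(u, c) * math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

    h = 2.0 * c / steps
    total = f(-c) + f(c)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(-c + k * h)
    tails = (c * c / 6.0) * math.erfc(c / math.sqrt(2.0))  # mass outside [-c, c]
    return total * h / 3.0 + tails


def s_scale(residuals: Sequence[float], c: float = C,
            k: Optional[float] = None) -> float:
    """Solve mean(rho(r_i / s)) = K for s by bisection.

    Assumes more than half of the residuals are nonzero, so a root exists.
    """
    k = gaussian_expectation(c) if k is None else k

    def mean_rho(s: float) -> float:
        return sum(rho_biweight(r / s, c) for r in residuals) / len(residuals)

    hi = max(abs(r) for r in residuals)
    if hi == 0.0:
        return 0.0
    while mean_rho(hi) > k:  # enlarge the bracket until the mean drops below K
        hi *= 2.0
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_rho(mid) > k:  # mean_rho is nonincreasing in s
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical residuals; the second call replaces one of them by a huge value.
res = [-1.2, 0.4, 0.1, -0.7, 0.9, 1.5, -0.3, 0.6, -1.0, 0.2]
print(s_scale(res), s_scale(res[:-1] + [1e6]))
```

Because $\rho$ is constant beyond $c$, a single huge residual contributes at most $\rho(c)/n$ to the average, so the resulting scale changes only moderately.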
References
[a1] P.J. Huber, "Robust estimation of a location parameter", Ann. Math. Stat., 35 (1964) pp. 73–101
[a2] F.R. Hampel, E.M. Ronchetti, P.J. Rousseeuw, W.A. Stahel, "Robust statistics: The approach based on influence functions", Wiley (1986)
[a3] P.J. Huber, "Robust statistics", Wiley (1981)
[a4] R.A. Maronna, "Robust M-estimators of multivariate location and scatter", Ann. Statist., 4 (1976) pp. 51–67
[a5] P.J. Rousseeuw, V.J. Yohai, "Robust regression by means of S-estimators", in J. Franke (ed.), W. Härdle (ed.), R.D. Martin (ed.), Robust and Nonlinear Time Ser. Analysis, Lecture Notes Statistics, 26, Springer (1984) pp. 256–272
[a6] P.J. Rousseeuw, A. Leroy, "Robust regression and outlier detection", Wiley (1987)
M-estimator. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=M-estimator&oldid=12651