%%% Title of object: Least Absolute Residuals Procedure
%%% Canonical Name: LeastAbsoluteResidualsProcedure
%%% Type: Topic
%%% Created on: 2010-09-21 20:59:07
%%% Modified on: 2011-06-24 16:11:35
%%% Creator: Bill Farebrother
%%% Modifier: misha123
%%% Author: misha123
%%% Author: Bill Farebrother
%%%
%%% Classification: msc:01A50
%%% Keywords: Rogerius Josephus Boscovich; Double exponential distribution; Estimation procedure; History; L1-norm; Least absolute deviations procedure;
%%% Preamble:
\documentclass[10pt]{article}
% this is the default PlanetMath preamble. as your knowledge
% of TeX increases, you will probably want to edit this, but
% it should be fine as is for beginners.
% almost certainly you want these
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{amsfonts}
% used for TeXing text within eps files
%\usepackage{psfrag}
% need this for including graphics (\includegraphics)
%\usepackage{graphicx}
% for neatly defining theorems and propositions
%\usepackage{amsthm}
% making logically defined graphics
%\usepackage{xypic}
% there are many more packages, add them here as you need them
% define commands here
%%%% Content:
\begin{document}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%TCIDATA{OutputFilter=LATEX.DLL}
%TCIDATA{Created=Wed Jun 10 22:22:05 2009}
%TCIDATA{LastRevised=Fri Aug 27 12:24:10 2010}
%TCIDATA{Language=American English}
%TCIDATA{CSTFile=LaTeX article (bright).cst}
\vspace{0.5cm}
\textit{Summary.} Some fifty years before the least sum of squared residuals
fitting procedure was published in 1805, Boscovich (or Bo\v{s}kovi\'{c}) proposed an
alternative that minimises the (constrained) sum of
the absolute residuals.
\vspace{1cm}
For $i=1,2,...,n$, let $\{x_{i1},x_{i2},...,x_{iq},y_{i}\}$ represent the $i$%
th observation on a set of $q+1$ variables and suppose that we wish to fit a
linear model of the form
\begin{equation*}
y_i = x_{i1}\beta_1 + x_{i2}\beta_2 + ... + x_{iq}\beta_q + \epsilon_i
\end{equation*}
\noindent to these $n$ observations. Then, for $p > 0$, the $L_p$-norm
fitting procedure chooses values for $b_1, b_2, ..., b_q$ to minimise the $%
L_p$-norm of the residuals $[\sum_{i=1}^n |e_i|^p]^{1/p}$ where, for $i = 1,
2, ..., n$, the $i$th residual is defined by
\begin{equation*}
e_i = y_i - x_{i1}b_1 - x_{i2}b_2 - ... - x_{iq}b_q.
\end{equation*}
% \vspace{0.1cm}
The most familiar $L_p$-norm fitting procedure, known as the least squares
procedure, sets $p=2$ and chooses values for $b_1, b_2, ..., b_q$ to
minimise the sum of the squared residuals $\sum_{i=1}^n e_i^2$.
% \vspace{0.1cm}
A second choice, to be discussed in the present article, sets $p=1$ and
chooses $b_1, b_2, ..., b_q$ to minimise the sum of the absolute residuals $%
\sum_{i=1}^n |e_i|$.
% \vspace{0.1cm}
A third choice sets $p=\infty $ and chooses $b_{1},b_{2},...,b_{q}$ to
minimise the largest absolute residual $\max_{1 \leq i \leq n}|e_{i}|$.
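As a small illustration of the three criteria (the data here are hypothetical
and are not taken from the source), take $q=1$ with $x_{i1}=1$ and the $n=3$
observations $y_1=1$, $y_2=2$, $y_3=6$, so that $e_i = y_i - b_1$. The fitted
values of $b_1$ and the minimised criteria are then
\begin{align*}
p=2: &\quad b_1 = 3 \ \text{(the mean)}, & \textstyle\sum_{i=1}^{3} e_i^2 &= 4+1+9 = 14, \\
p=1: &\quad b_1 = 2 \ \text{(the median)}, & \textstyle\sum_{i=1}^{3} |e_i| &= 1+0+4 = 5, \\
p=\infty: &\quad b_1 = 3.5 \ \text{(the midrange)}, & \max_{1 \leq i \leq 3} |e_i| &= 2.5.
\end{align*}
\noindent In this example only the $p=1$ fit is left unchanged when the
largest observation $y_3$ is increased further.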
% \vspace{0.1cm}
Setting $u_i = e_i$ and $v_i = 0$ if $e_i \geq 0$ and $u_i = 0$ and $v_i =
-e_i$ if $e_i < 0$, we find that $e_i = u_i - v_i$ and $|e_i| = u_i + v_i$,
so that the least absolute residuals ($LAR$) fitting problem chooses
$b_1, b_2, ..., b_q$ to minimise the sum of the absolute residuals
\begin{equation*}
\sum_{i=1}^n (u_i + v_i)
\end{equation*}
\noindent subject to
\begin{equation*}
x_{i1}b_1 + x_{i2}b_2 + ... + x_{iq}b_q + u_i - v_i = y_i \quad \text{for}\
i = 1, 2, ..., n
\end{equation*}
\begin{equation*}
\text{and}\quad u_i \geq 0, v_i \geq 0\quad \text{for}\ i = 1, 2, ..., n.
\end{equation*}
\noindent The $LAR$ fitting problem thus takes the form of a linear
programming problem and is often solved by means of a variant of the dual
simplex procedure.
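To make the linear programming form concrete, consider a minimal sketch with
hypothetical data (not taken from the source): $q=1$, $x_{i1}=1$ and
$y_1=1$, $y_2=2$, $y_3=6$. Writing $u_i$ and $v_i$ for the positive and
negative parts of the $i$th residual, the problem reads
\begin{align*}
\text{minimise}\quad & u_1+v_1+u_2+v_2+u_3+v_3 \\
\text{subject to}\quad & b_1+u_1-v_1 = 1, \quad b_1+u_2-v_2 = 2, \quad b_1+u_3-v_3 = 6, \\
& u_i \geq 0, \ v_i \geq 0 \quad \text{for}\ i=1,2,3,
\end{align*}
\noindent with $b_1$ unrestricted in sign. The point $b_1=2$, $(u_1,u_2,u_3)=(0,0,4)$,
$(v_1,v_2,v_3)=(1,0,0)$ is feasible and gives the objective value $5$. It is
optimal because, for every $b_1$, the triangle inequality gives
\begin{equation*}
\sum_{i=1}^{3}|y_i-b_1| \ \geq\ |y_3-b_1|+|y_1-b_1| \ \geq\ |y_3-y_1| = 5.
\end{equation*}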
% \vspace{0.1cm}
Gauss has noted (when $q = 2$) that solutions of this problem are
characterised by the presence of a set of $q$ zero residuals. Such solutions
are robust to the presence of outlying observations. Indeed, they remain
constant under variations in the other $n - q$ observations provided that
these variations do not cause any of the residuals to change their signs.
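A small worked case may help here; the data are hypothetical and serve only
as an illustration. Take $q=2$, $x_{i1}=1$, $x_{i2}=0,1,2$ and $y_i=0,3,2$ for
$i=1,2,3$. For any choice of $b_1$ and $b_2$ the residuals satisfy
$e_1-2e_2+e_3 = y_1-2y_2+y_3 = -4$, so that
\begin{equation*}
4 = |e_1-2e_2+e_3| \ \leq\ |e_1|+2|e_2|+|e_3| \ \leq\ 2\sum_{i=1}^{3}|e_i|,
\end{equation*}
\noindent and hence $\sum_{i=1}^{3}|e_i| \geq 2$. The choice $b_1=0$, $b_2=1$ attains
this bound with residuals $(e_1,e_2,e_3)=(0,2,0)$, that is, with $q=2$ zero
residuals. The same argument shows that this fit is unchanged if $y_2=3$ is
replaced by any value $y_2>1$, since $e_2=y_2-1$ then keeps its sign. By
contrast, the least squares fit has $b_2=1$ and $b_1=(y_2-1)/3$, so its
intercept moves from $2/3$ when $y_2=3$ to $33$ when $y_2=100$.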
% \vspace{0.1cm}
The $LAR$ fitting procedure corresponds to the maximum likelihood estimator
when the $\epsilon$-disturbances follow a double exponential (Laplacian)
distribution. This estimator is more robust to the presence of outlying
observations than is the standard least squares estimator which maximises
the likelihood function when the $\epsilon$-disturbances are normal
(Gaussian). Nevertheless, the $LAR$ estimator is asymptotically normally
distributed as it is a member of Huber's class of $M$-estimators.
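The maximum likelihood connection can be sketched as follows, under the
assumption (made here for illustration) that the disturbances are independent
and identically distributed with a scale parameter $\sigma>0$; the symbols
$\sigma$ and $\ell$ are introduced here and do not appear in the source. If each
$\epsilon_i$ has the double exponential density
$f(\epsilon)=\frac{1}{2\sigma}\exp(-|\epsilon|/\sigma)$, then the log-likelihood
evaluated at $b_1, b_2, ..., b_q$ is
\begin{equation*}
\ell(b,\sigma) = -n\log(2\sigma) - \frac{1}{\sigma}\sum_{i=1}^{n}|e_i|,
\end{equation*}
\noindent so that, for any fixed $\sigma$, maximising $\ell$ over
$b_1, b_2, ..., b_q$ is the same as minimising $\sum_{i=1}^n |e_i|$. The
corresponding expression for normal disturbances is
\begin{equation*}
\ell(b,\sigma) = -\frac{n}{2}\log(2\pi\sigma^{2}) - \frac{1}{2\sigma^{2}}\sum_{i=1}^{n}e_i^{2},
\end{equation*}
\noindent which is maximised over $b_1, b_2, ..., b_q$ by minimising
$\sum_{i=1}^n e_i^2$.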
% \vspace{0.1cm}
There are many variants of the basic $LAR$ procedure but the one of greatest
historical interest is that proposed in 1760 by the Croatian Jesuit
scientist Rugjer (or Rudjer) Josip Bo\v{s}kovi\'{c} (1711--1787) (Latin:
Rogerius Josephus Boscovich; Italian: Ruggiero Giuseppe Boscovich). In his
variant of the standard $LAR$ procedure, there are two explanatory variables
of which the first is constant $x_{i1}=1$ and the values of $b_{1}$ and $%
b_{2}$ are constrained to satisfy the adding-up condition $%
\sum_{i=1}^{n}(y_{i}-b_{1}-x_{i2}b_{2})=0$ usually associated with the least
squares procedure developed by Gauss in 1795 and published by Legendre in
1805. Computer algorithms implementing this variant of the $LAR$ procedure
with $q \geq 2$ variables are still to be found in the literature.
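The effect of the adding-up condition in the two-variable case can be sketched
as follows; the symbols $\bar{x}$, $\bar{y}$, $s_i$ and $w_i$ are introduced
here for illustration. Writing $\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_{i2}$ and
$\bar{y}=\frac{1}{n}\sum_{i=1}^{n}y_{i}$, the condition gives
$b_1=\bar{y}-\bar{x}b_2$, so that the fitted line passes through the centroid
of the data and the residuals become
\begin{equation*}
e_i = (y_i-\bar{y}) - b_2\,(x_{i2}-\bar{x}).
\end{equation*}
\noindent The constrained problem therefore reduces to a minimisation over
$b_2$ alone. For observations with $x_{i2}\neq\bar{x}$, setting
$w_i=|x_{i2}-\bar{x}|$ and $s_i=(y_i-\bar{y})/(x_{i2}-\bar{x})$ gives
\begin{equation*}
\sum_{i=1}^{n}|e_i| = \sum_{x_{i2}\neq\bar{x}} w_i\,|s_i-b_2| + \sum_{x_{i2}=\bar{x}}|y_i-\bar{y}|,
\end{equation*}
\noindent where the second sum does not depend on $b_2$. A minimising value
of $b_2$ is thus a weighted median of the slopes $s_i$ with weights $w_i$.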
% \vspace{0.1cm}
For an account of recent developments in this area, see the series of
volumes edited by Dodge (1987, 1992, 1997, 2002). For a detailed history of
the $LAR$ procedure, analysing the contributions of Bo\v{s}kovi\'{c},
Laplace, Gauss, Edgeworth, Turner, Bowley and Rhodes, see Farebrother
(1999). And, for a discussion of the geometrical and mechanical
representation of the least squares and $LAR$ fitting procedures, see
Farebrother (2002).
\vspace{0.5cm}
\textbf{References}
\vspace{0.2cm}
Yadolah Dodge (Ed.) (1987), \textit{Statistical Data Analysis Based on the
$L_{1}$-Norm and Related Methods}, North-Holland Publishing Company,
Amsterdam, The Netherlands.
% \vspace{0.2cm}
Yadolah Dodge (Ed.) (1992), \textit{$L_{1}$-Statistical Analysis and Related
Methods}, North-Holland Publishing Company, Amsterdam, The Netherlands.
% \vspace{0.2cm}
Yadolah Dodge (Ed.) (1997), \textit{$L_{1}$-Statistical Procedures and
Related Topics}, Institute of Mathematical Statistics, Hayward, California,
USA.
% \vspace{0.2cm}
Yadolah Dodge (Ed.) (2002), \textit{Statistical Data Analysis Based on the
$L_{1}$-Norm and Related Methods}, Birkh\"{a}user Publishing, Basel,
Switzerland.
% \vspace{0.2cm}
Richard William Farebrother (1999), \textit{Fitting Linear Relationships: A
History of the Calculus of Observations 1750--1900}, Springer-Verlag, New
York, USA.
\vspace{0.2cm}
Richard William Farebrother (2002), \textit{Visualizing Statistical Models
and Concepts}, Marcel Dekker, New York, USA.
\vspace{0.5cm}
Reprinted with permission from Lovric, Miodrag (2011), \textit{International
Encyclopedia of Statistical Science}. Heidelberg: Springer Science+Business
Media, LLC.
\vspace{0.5cm}
Richard William Farebrother
\end{document}