%%% Title of object: Linear Mixed Model
%%% Canonical Name: LinearMixedModel
%%% Type: Definition
%%% Created on: 2010-08-21 09:47:21
%%% Modified on: 2010-09-22 20:55:59
%%% Creator: lucp0904
%%% Modifier: lucp0904
%%% Author: lucp0904
%%% Author: jkimmel
%%% Author: akrowne
%%%
%%% Classification: msc:62-07
%%% Preamble:
\documentclass[10pt]{article}
\usepackage{graphicx}
%\parindent=0cm
%\parskip=0.5cm
%\textheight=23cm
%\textwidth=16cm
%\setlength{\oddsidemargin}{0cm}
%\setlength{\evensidemargin}{0cm}
%\setlength{\topmargin}{-1cm}
\newcommand{\yi}{\bm{y_i}}
%%\newcommand{\bm}[1]{\mbox{\boldmath $#1$}}
\def\bm#1{\mathbf{#1}}
%%\newcommand{\bm}[1]{\mbox{\bf #1}}
\newcommand{\bi}{\bm{b_i}}
% this is the default PlanetMath preamble. as your knowledge
% of TeX increases, you will probably want to edit this, but
% it should be fine as is for beginners.
% almost certainly you want these
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{amsfonts}
% used for TeXing text within eps files
%\usepackage{psfrag}
% need this for including graphics (\includegraphics)
%\usepackage{graphicx}
% for neatly defining theorems and propositions
%\usepackage{amsthm}
% making logically defined graphics
%\usepackage{xypic}
% there are many more packages, add them here as you need them
% define commands here
%%%Content:
\begin{document}
\begin{center}
{\LARGE Linear Mixed Model}\\[2mm]
{\large Geert Molenberghs}\\[2mm]
{\large I-BioStat, Universiteit Hasselt \& Katholieke Universiteit Leuven, Belgium.}
\end{center}
In many studies, data are collected hierarchically, and such data do not always follow balanced, multivariate designs. For example, repeated measurements may be taken at almost
arbitrary time points, resulting in an extremely large number of time points at
which only one or only a few measurements
have been taken. Many parametric covariance models may
then contain too many parameters to be useful in practice, while other,
more parsimonious, models may be based on assumptions that are too simplistic
to be realistic. A general and very flexible class of parametric models for continuous longitudinal data is formulated as follows:
\begin{eqnarray}
\yi | \bi & \sim & N(X_i \bm\beta + Z_i \bm{b_i}, \Sigma_i), \label{lin mix eff model 1} \\
\bi & \sim & N(\bm{0}, D),
\label{lin mix eff model 2}
\end{eqnarray}
where $X_i$ and $Z_i$ are $(n_i \times p)$ and $(n_i \times q)$ dimensional
matrices of known covariates, $\bm\beta$ is a $p$-dimensional vector of
regression parameters, called the fixed effects, $D$ is a general $(q \times
q)$ covariance matrix, and $\Sigma_i$ is a $(n_i \times n_i)$ covariance
matrix which depends on $i$ only through its dimension $n_i$, i.e., the set
of unknown parameters in $\Sigma_i$ will not depend upon $i$. Finally, $\bi$ is a $q$-dimensional vector of subject-specific or random effects.

The above model can be interpreted as a linear regression model for the vector
$\yi$ of repeated measurements for each unit separately, where some of the
regression parameters are unit-specific (random effects, $\bi$), while others are
not (fixed effects, $\bm\beta$). The distributional assumptions in (\ref{lin
mix eff model 2}) with respect to the random effects can be motivated as
follows. First, $\mbox{E}(\bi)=\bm{0}$ implies that the mean of $\yi$ still
equals $X_i\bm\beta$, such that the fixed effects in the random-effects
model (\ref{lin mix eff model 1}) can also be interpreted marginally. Not only
do they reflect the effect of changing covariates within specific units, they
also measure the marginal effect in the population of changing the same
covariates. Second, the
normality assumption immediately implies that, marginally, $\yi$ also follows
a normal distribution with mean vector $X_i \bm\beta$ and with covariance
matrix $V_i = Z_i D Z_i' + \Sigma_i$.

Note that the random effects in (\ref{lin mix eff model 1}) imply that
the marginal covariance matrix $V_i$ of $\yi$ has the very specific form
$V_i=Z_i D Z_i' + \Sigma_i$. Let us consider two examples under the
assumption of conditional independence, i.e., assuming $\Sigma_i=\sigma^2
I_{n_i}$. First, consider the case where the random effects are univariate and
represent unit-specific intercepts. This corresponds to covariates $Z_i$ which
are $n_i$-dimensional vectors containing only ones.
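As a sketch of what this first example implies, write $d$ for the scalar random-intercept variance (so that $D = d$) and $\bm{1}_{n_i}$ for the $n_i$-dimensional vector of ones; both symbols are notation introduced here for illustration. Then
\begin{eqnarray*}
V_i & = & d \, \bm{1}_{n_i} \bm{1}_{n_i}' + \sigma^2 I_{n_i},
\end{eqnarray*}
so that every measurement has variance $d + \sigma^2$ and any two measurements on the same unit have covariance $d$, i.e., a common correlation $\rho = d/(d+\sigma^2)$. This is the compound-symmetry structure. A natural second example, assumed here as an illustration, adds a random slope for time: with $t_{ij}$ the time of the $j$th measurement $y_{ij}$ on unit $i$, the rows of $Z_i$ are $(1, t_{ij})$ and $D$ is a $(2 \times 2)$ matrix with elements $d_{11}$, $d_{12}$, and $d_{22}$. The implied marginal covariances are
\begin{eqnarray*}
\mbox{cov}(y_{ij}, y_{ik}) & = & d_{11} + d_{12} (t_{ij} + t_{ik}) + d_{22} \, t_{ij} t_{ik} + \sigma^2 \delta_{jk},
\end{eqnarray*}
where $\delta_{jk}$ equals one if $j=k$ and zero otherwise. In particular, the implied variance function $d_{11} + 2 d_{12} t_{ij} + d_{22} t_{ij}^2 + \sigma^2$ is quadratic in time.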
The marginal model implied by expressions (\ref{lin mix eff model 1}) and
(\ref{lin mix eff model 2}) is
\begin{eqnarray*}
\yi & \sim & N(X_i \bm\beta, V_i), \quad V_i = Z_i D Z_i' + \Sigma_i
\end{eqnarray*}
which can be viewed as a multivariate linear regression model, with a
very particular parameterization of the covariance matrix $V_i$.
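The form of this marginal model can be verified, as a short sketch, from the laws of iterated expectation and total variance applied to (\ref{lin mix eff model 1}) and (\ref{lin mix eff model 2}):
\begin{eqnarray*}
\mbox{E}(\yi) & = & \mbox{E}\left[ \mbox{E}(\yi | \bi) \right] = \mbox{E}(X_i \bm\beta + Z_i \bi) = X_i \bm\beta, \\
\mbox{var}(\yi) & = & \mbox{E}\left[ \mbox{var}(\yi | \bi) \right] + \mbox{var}\left[ \mbox{E}(\yi | \bi) \right] = \Sigma_i + Z_i D Z_i'.
\end{eqnarray*}
Marginal normality follows because $\yi$ can be written as $X_i \bm\beta + Z_i \bi + \bm{\varepsilon_i}$, with $\bm{\varepsilon_i} \sim N(\bm{0}, \Sigma_i)$ independent of $\bi$ (the error vector $\bm{\varepsilon_i}$ is notation introduced here), which is a linear function of jointly normal vectors.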
With respect to the estimation of unit-specific parameters $\bi$, the
posterior distribution of $\bi$ given the observed data $\yi$ can be shown to
be (multivariate) normal with mean vector equal to $D Z_i' V_i^{-1}(\bm\alpha)
(\yi - X_i \bm\beta)$, where $\bm\alpha$ denotes the vector of variance components in $D$ and $\Sigma_i$, on which $V_i$ depends. Replacing $\bm\beta$ and $\bm\alpha$ by their maximum
likelihood estimates, we obtain the so-called empirical Bayes (EB) estimates $\widehat{\bi}$ for the $\bi$. A key property of these EB
estimates is shrinkage, which is best illustrated by considering the
prediction $\widehat{\yi} \equiv X_i \widehat{\bm\beta} + Z_i
\widehat{\bi}$ of the $i$th profile.
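The posterior mean quoted above can be obtained, as a sketch under the model assumptions, from the joint normality of $\bi$ and $\yi$. Since $\mbox{cov}(\bi, \yi) = D Z_i'$,
\begin{eqnarray*}
\left( \begin{array}{c} \bi \\ \yi \end{array} \right) & \sim &
N \left( \left( \begin{array}{c} \bm{0} \\ X_i \bm\beta \end{array} \right),
\left( \begin{array}{cc} D & D Z_i' \\ Z_i D & V_i \end{array} \right) \right),
\end{eqnarray*}
and the standard formulas for conditional distributions of a multivariate normal vector give
\begin{eqnarray*}
\mbox{E}(\bi | \yi) & = & D Z_i' V_i^{-1} (\yi - X_i \bm\beta), \\
\mbox{var}(\bi | \yi) & = & D - D Z_i' V_i^{-1} Z_i D.
\end{eqnarray*}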
It can easily be shown that
\begin{eqnarray*}
\widehat{\yi} & = & \Sigma_i V_i^{-1} X_i \widehat{\bm\beta} \ + \
\left(I_{n_i} - \Sigma_i V_i^{-1} \right) \yi,
\end{eqnarray*}
which can be interpreted as a weighted average of the population-averaged
profile $X_i \widehat{\bm\beta}$ and the observed data $\yi$, with weights
$\Sigma_i V_i^{-1}$ and $I_{n_i} - \Sigma_i V_i^{-1}$, respectively. Note that
the ``numerator'' of $\Sigma_i V_i^{-1}$ represents within-unit variability
and the ``denominator'' is the overall covariance matrix $V_i$. Hence, much
weight will be given to the overall average profile if the within-unit
variability is large in comparison to the between-unit variability (modeled by
the random effects), whereas much weight will be given to the observed data if
the opposite is true. This phenomenon is referred to as shrinkage toward the
average profile $X_i \widehat{\bm\beta}$. An immediate consequence of shrinkage
is that the EB estimates show less variability than actually present in the
random-effects distribution, i.e., for any linear combination $\bm\lambda' \bi$ of
the random effects,
\begin{eqnarray*}
\mbox{var}(\bm\lambda' \widehat{\bi}) & \leq & \mbox{var}(\bm\lambda' \bi) =
\bm\lambda' D \bm\lambda.
\end{eqnarray*}
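The weighted-average form of $\widehat{\yi}$ used above can be checked directly. Using $Z_i D Z_i' = V_i - \Sigma_i$,
\begin{eqnarray*}
Z_i \widehat{\bi} & = & Z_i D Z_i' V_i^{-1} (\yi - X_i \widehat{\bm\beta})
= \left(I_{n_i} - \Sigma_i V_i^{-1} \right) (\yi - X_i \widehat{\bm\beta}), \\
\widehat{\yi} & = & X_i \widehat{\bm\beta} + Z_i \widehat{\bi}
= \Sigma_i V_i^{-1} X_i \widehat{\bm\beta} + \left(I_{n_i} - \Sigma_i V_i^{-1} \right) \yi.
\end{eqnarray*}
As an illustration, consider again random intercepts with $\Sigma_i = \sigma^2 I_{n_i}$, writing $d$ for the random-intercept variance and treating $\bm\beta$, $d$, and $\sigma^2$ as known (simplifying assumptions made here, so that the uncertainty from estimating them is ignored). The estimate and its variance then reduce to
\begin{eqnarray*}
\widehat{b}_i & = & \frac{n_i d}{\sigma^2 + n_i d} \; \bar{r}_i, \\
\mbox{var}(\widehat{b}_i) & = & \frac{n_i d}{\sigma^2 + n_i d} \; d \ \leq \ d,
\end{eqnarray*}
where $\bar{r}_i$ denotes the average of the $n_i$ components of the residual vector $\yi - X_i \bm\beta$. The shrinkage factor $n_i d / (\sigma^2 + n_i d)$ lies between zero and one; it approaches one when $n_i$ is large or when $\sigma^2$ is small relative to $d$, in which case little shrinkage takes place.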
\subsubsection*{References}
Fitzmaurice, G.M., Davidian, M., Verbeke, G., and Molenberghs, G. (2009). {\em Longitudinal Data Analysis. Handbook.} Hoboken, NJ: John Wiley \& Sons.

Fitzmaurice, G.M., Laird, N.M., and Ware, J.H. (2004). {\em Applied Longitudinal Analysis.} New York: John Wiley \& Sons.

Henderson, C.R. (1984). {\em Applications of Linear Models in Animal Breeding.} Guelph, Canada: University of Guelph Press.

Verbeke, G. and Molenberghs, G. (2000). {\em Linear Mixed Models for Longitudinal Data.} New York: Springer.
\subsubsection*{Acknowledgment}
Based on an article from Lovric, Miodrag (2011), {\em International Encyclopedia of Statistical Science}. Heidelberg: Springer Science+Business Media, LLC.
\end{document}