# Linear mixed model

This article, Linear Mixed Model, was adapted from an original article by Geert Molenberghs, which appeared in StatProb: The Encyclopedia Sponsored by Statistics and Probability Societies. The original article ([http://statprob.com/encyclopedia/LinearMixedModel.html StatProb Source]) is copyrighted by the author(s). The article has been donated to Encyclopedia of Mathematics, and its further issues are under the Creative Commons Attribution Share-Alike License. All pages from StatProb are contained in the Category StatProb.

2010 Mathematics Subject Classification: Primary: 62-XX

$\newcommand{\yi}{\bm{y_i}}$ $\def\bm#1{\mathbf{#1}}$ $\newcommand{\bi}{\bm{b_i}}$

Geert Molenberghs

I-BioStat, Universiteit Hasselt & Katholieke Universiteit Leuven, Belgium.

In many studies, data are collected hierarchically. Such data do not always follow balanced, multivariate designs. For example, repeated measurements may be taken at almost arbitrary time points, resulting in an extremely large number of time points at which only one or only a few measurements have been taken. Many standard parametric covariance models may then contain too many parameters to be useful in practice, while other, more parsimonious, models may rest on assumptions that are too simplistic to be realistic. A general, and very flexible, class of parametric models for continuous longitudinal data is formulated as follows: \begin{eqnarray} \yi | \bi & \sim & N(X_i \bm\beta + Z_i \bm{b_i}, \Sigma_i), \label{lin mix eff model 1} \\ \bi & \sim & N(\bm{0}, D), \label{lin mix eff model 2} \end{eqnarray} where $\yi$ is the $n_i$-dimensional vector of measurements for unit $i$, $X_i$ and $Z_i$ are $(n_i \times p)$ and $(n_i \times q)$ dimensional matrices of known covariates, $\bm\beta$ is a $p$-dimensional vector of regression parameters, called the fixed effects, $D$ is a general $(q \times q)$ covariance matrix, and $\Sigma_i$ is an $(n_i \times n_i)$ covariance matrix which depends on $i$ only through its dimension $n_i$, i.e., the set of unknown parameters in $\Sigma_i$ does not depend upon $i$. Finally, $\bm{b_i}$ is a $q$-dimensional vector of subject-specific or random effects.
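To make the two-stage formulation concrete, the following is a minimal simulation sketch rather than part of the original article. It assumes a random-intercept, random-slope design in which $X_i$ and $Z_i$ both contain an intercept and the measurement time, and it assumes conditional independence, $\Sigma_i = \sigma^2 I_{n_i}$. The function name `simulate_unit` and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_unit(rng, times, beta, D, sigma2):
    """Draw one unit's response vector y_i from the two-stage model."""
    times = np.asarray(times, dtype=float)
    n_i = times.size
    # Assumed design: intercept and time, used for fixed and random effects alike.
    X_i = np.column_stack([np.ones(n_i), times])   # (n_i x p), here p = 2
    Z_i = X_i.copy()                               # (n_i x q), here q = 2
    # Second stage: b_i ~ N(0, D)
    b_i = rng.multivariate_normal(np.zeros(D.shape[0]), D)
    # First stage: y_i | b_i ~ N(X_i beta + Z_i b_i, sigma2 * I)
    mean_i = X_i @ beta + Z_i @ b_i
    y_i = mean_i + rng.normal(scale=np.sqrt(sigma2), size=n_i)
    return X_i, Z_i, b_i, y_i

rng = np.random.default_rng(0)
beta = np.array([10.0, 0.5])               # illustrative fixed effects
D = np.array([[4.0, 0.3], [0.3, 0.25]])    # illustrative random-effects covariance
sigma2 = 1.0                               # illustrative residual variance

# Unbalanced data: each unit has its own number of measurements and time points.
units = []
for _ in range(5):
    n_i = rng.integers(1, 7)               # between 1 and 6 measurements
    times = np.sort(rng.uniform(0.0, 10.0, size=n_i))
    units.append(simulate_unit(rng, times, beta, D, sigma2))
```

Each unit contributes matrices of its own dimension $n_i$, while $\bm\beta$, $D$ and $\sigma^2$ are shared across units, which is what keeps the model parsimonious for irregularly timed measurements.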

The above model can be interpreted as a linear regression model for the vector $\yi$ of repeated measurements for each unit separately, where some of the regression parameters are unit-specific (random effects, $\bi$), while others are not (fixed effects, $\bm\beta$). The distributional assumptions in (\ref{lin mix eff model 1}) and (\ref{lin mix eff model 2}) with respect to the random effects can be motivated as follows. First, $\mbox{E}(\bi)=\bm{0}$ implies that the mean of $\yi$ still equals $X_i\bm\beta$, such that the fixed effects in the random-effects model (\ref{lin mix eff model 1}) can also be interpreted marginally. Not only do they reflect the effect of changing covariates within specific units, they also measure the marginal effect in the population of changing the same covariates. Second, the normality assumption immediately implies that, marginally, $\yi$ also follows a normal distribution with mean vector $X_i \bm\beta$ and with covariance matrix $V_i = Z_i D Z_i' + \Sigma_i$.
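The marginal mean and covariance quoted above follow from iterated expectations. As a short worked derivation, using only the two model statements and the standard decomposition of a variance into the mean of the conditional variance plus the variance of the conditional mean: \begin{eqnarray*} \mbox{E}(\yi) & = & \mbox{E}\{\mbox{E}(\yi | \bi)\} = X_i \bm\beta + Z_i \mbox{E}(\bi) = X_i \bm\beta, \\ \mbox{var}(\yi) & = & \mbox{E}\{\mbox{var}(\yi | \bi)\} + \mbox{var}\{\mbox{E}(\yi | \bi)\} = \Sigma_i + Z_i D Z_i'. \end{eqnarray*}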

Note that the random effects in (\ref{lin mix eff model 1}) imply that the marginal covariance matrix $V_i$ of $\yi$ has the very specific form $V_i=Z_i D Z_i' + \Sigma_i$. Let us consider an example under the assumption of conditional independence, i.e., assuming $\Sigma_i=\sigma^2 I_{n_i}$. Consider the case where the random effects are univariate and represent unit-specific intercepts. This corresponds to covariates $Z_i$ which are $n_i$-dimensional vectors containing only ones.
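For the random-intercept case, the implied covariance structure can be written out explicitly. The symbols $d$ (the scalar variance of the random intercept, i.e., $D = d$) and $y_{ij}$ (the $j$th component of $\yi$) are notation introduced here for illustration and do not appear in the original text. With $Z_i = \bm{1}_{n_i}$, \begin{eqnarray*} V_i & = & d \, \bm{1}_{n_i} \bm{1}_{n_i}' + \sigma^2 I_{n_i}, \\ \mbox{var}(y_{ij}) & = & d + \sigma^2, \qquad \mbox{cov}(y_{ij}, y_{ik}) = d \quad (j \neq k), \end{eqnarray*} so that all measurements within a unit share the same variance and any two of them have the same correlation $d / (d + \sigma^2)$. This structure is commonly known as compound symmetry.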

The marginal model implied by expressions \ref{lin mix eff model 1} and \ref{lin mix eff model 2} is \begin{eqnarray*} \yi & \sim & N(X_i \bm\beta, V_i), \quad V_i = Z_i D Z_i' + \Sigma_i \end{eqnarray*} which can be viewed as a multivariate linear regression model, with a very particular parameterization of the covariance matrix $V_i$.
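As an illustration of the marginal model, the sketch below builds $V_i$ and evaluates the marginal normal log-density of one unit's data. It is a minimal example under stated assumptions, not an estimation routine: the function names `marginal_cov` and `marginal_loglik` are hypothetical, all parameter values are treated as known, and the example data are made up.

```python
import numpy as np

def marginal_cov(Z_i, D, Sigma_i):
    """Marginal covariance V_i = Z_i D Z_i' + Sigma_i."""
    return Z_i @ D @ Z_i.T + Sigma_i

def marginal_loglik(y_i, X_i, Z_i, beta, D, Sigma_i):
    """Log-density of y_i under the marginal model N(X_i beta, V_i)."""
    V_i = marginal_cov(Z_i, D, Sigma_i)
    resid = y_i - X_i @ beta
    # The Cholesky factor L (V_i = L L') gives log|V_i| and the quadratic form.
    L = np.linalg.cholesky(V_i)
    w = np.linalg.solve(L, resid)          # w'w = resid' V_i^{-1} resid
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    n_i = y_i.size
    return -0.5 * (n_i * np.log(2.0 * np.pi) + logdet + w @ w)

# Illustrative random-intercept unit with three measurements (made-up numbers).
Z_i = np.ones((3, 1))
X_i = np.column_stack([np.ones(3), np.array([0.0, 1.0, 2.0])])
D = np.array([[2.0]])
Sigma_i = 1.0 * np.eye(3)
beta = np.array([1.0, 0.5])
y_i = np.array([1.2, 1.1, 2.3])

V_i = marginal_cov(Z_i, D, Sigma_i)        # 3 on the diagonal, 2 elsewhere
ll = marginal_loglik(y_i, X_i, Z_i, beta, D, Sigma_i)
```

Summing such contributions over independent units gives the marginal log-likelihood as a function of the fixed effects and the covariance parameters.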

With respect to the estimation of unit-specific parameters $\bi$, the posterior distribution of $\bi$ given the observed data $\yi$ can be shown to be (multivariate) normal with mean vector equal to $D Z_i' V_i^{-1}(\bm\alpha) (\yi - X_i \bm\beta)$, where $\bm\alpha$ denotes the vector of variance components in $D$ and $\Sigma_i$ on which $V_i$ depends. Replacing $\bm\beta$ and $\bm\alpha$ by their maximum likelihood estimates, we obtain the so-called empirical Bayes (EB) estimates $\widehat{\bi}$ for the $\bi$. A key property of these EB estimates is shrinkage, which is best illustrated by considering the prediction $\widehat{\yi} \equiv X_i \widehat{\bm\beta} + Z_i \widehat{\bi}$ of the $i$th profile. It can easily be shown that \begin{eqnarray*} \widehat{\yi} & = & \Sigma_i V_i^{-1} X_i \widehat{\bm\beta} + \left(I_{n_i} - \Sigma_i V_i^{-1} \right) \yi, \end{eqnarray*} which can be interpreted as a weighted average of the population-averaged profile $X_i \widehat{\bm\beta}$ and the observed data $\yi$, with weights $\Sigma_i V_i^{-1}$ and $I_{n_i} - \Sigma_i V_i^{-1}$, respectively. Note that the "numerator" of $\Sigma_i V_i^{-1}$ represents within-unit variability and the "denominator" is the overall covariance matrix $V_i$. Hence, much weight will be given to the overall average profile if the within-unit variability is large in comparison to the between-unit variability (modeled by the random effects), whereas much weight will be given to the observed data if the opposite is true. This phenomenon is referred to as shrinkage toward the average profile $X_i \widehat{\bm\beta}$. An immediate consequence of shrinkage is that the EB estimates show less variability than is actually present in the random-effects distribution, i.e., for any linear combination $\bm\lambda' \bi$ of the random effects, \begin{eqnarray*} \mbox{var}(\bm\lambda' \widehat{\bi}) & \leq & \mbox{var}(\bm\lambda' \bi) = \bm\lambda' D \bm\lambda. \end{eqnarray*}
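The shrinkage identity can be checked numerically. The sketch below is a minimal illustration, assuming a random-intercept model with $\Sigma_i = \sigma^2 I_{n_i}$ and treating $\bm\beta$, $D$ and $\sigma^2$ as known plug-in values rather than estimating them. The function name `eb_estimate` and all numbers are hypothetical.

```python
import numpy as np

def eb_estimate(y_i, X_i, Z_i, beta, D, Sigma_i):
    """Posterior mean of b_i given y_i: D Z_i' V_i^{-1} (y_i - X_i beta)."""
    V_i = Z_i @ D @ Z_i.T + Sigma_i
    return D @ Z_i.T @ np.linalg.solve(V_i, y_i - X_i @ beta)

# Illustrative random-intercept unit with four measurements (made-up numbers).
n_i = 4
X_i = np.column_stack([np.ones(n_i), np.arange(n_i, dtype=float)])
Z_i = np.ones((n_i, 1))
beta = np.array([2.0, 1.0])        # plug-in value, assumed known here
D = np.array([[3.0]])              # between-unit variance, assumed known here
Sigma_i = 0.5 * np.eye(n_i)        # within-unit variance 0.5, assumed known here
y_i = np.array([4.1, 5.3, 5.8, 7.2])

# Prediction of the profile via the EB estimate of the random intercept.
b_hat = eb_estimate(y_i, X_i, Z_i, beta, D, Sigma_i)
y_hat = X_i @ beta + Z_i @ b_hat

# The same prediction written as a weighted average of X_i beta and y_i.
V_i = Z_i @ D @ Z_i.T + Sigma_i
W = Sigma_i @ np.linalg.inv(V_i)
y_hat_weighted = W @ (X_i @ beta) + (np.eye(n_i) - W) @ y_i

# Both routes should agree up to floating-point error.
assert np.allclose(y_hat, y_hat_weighted)

# Shrinkage: the EB intercept is pulled toward zero relative to the raw
# average residual of this unit.
mean_resid = np.mean(y_i - X_i @ beta)
```

In this random-intercept setting the EB estimate equals the average residual multiplied by $n_i d / (\sigma^2 + n_i d)$, with $d$ denoting the single entry of $D$, so the estimate moves closer to the raw average as $n_i$ grows or as the within-unit variance shrinks.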

How to Cite This Entry:
Linear mixed model. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Linear_mixed_model&oldid=38546