
Linear mixed model

From Encyclopedia of Mathematics
Revision as of 11:03, 4 April 2016 by Ulf Rehmann (talk | contribs) (→‎changes proposed by author, typos)
Copyright notice
This article Linear Mixed Model was adapted from an original article by Geert Molenberghs, which appeared in StatProb: The Encyclopedia Sponsored by Statistics and Probability Societies. The original article (StatProb Source: http://statprob.com/encyclopedia/LinearMixedModel.html) is copyrighted by the author(s). The article has been donated to Encyclopedia of Mathematics, and its further issues are under the Creative Commons Attribution Share-Alike License. All pages from StatProb are contained in the Category StatProb.

2020 Mathematics Subject Classification: Primary: 62-XX [MSN][ZBL]


\newcommand{\yi}{\bm{y_i}} \def\bm#1{\mathbf{#1}} \newcommand{\bi}{\bm{b_i}}

Linear Mixed Model

Geert Molenberghs


I-BioStat, Universiteit Hasselt & Katholieke Universiteit Leuven, Belgium.

In many studies, data are collected hierarchically, and such data do not always follow balanced, multivariate designs. For example, repeated measurements may be taken at almost arbitrary time points, resulting in an extremely large number of time points at which only one or only a few measurements have been taken. Many parametric covariance models may then contain too many parameters to be useful in practice, while other, more parsimonious models may rest on assumptions that are too simplistic to be realistic. A general and very flexible class of parametric models for continuous longitudinal data is formulated as follows: \begin{eqnarray} \yi | \bi & \sim & N(X_i \bm\beta + Z_i \bi, \Sigma_i), \label{lin mix eff model 1} \\ \bi & \sim & N(\bm{0}, D), \label{lin mix eff model 2} \end{eqnarray} where X_i and Z_i are (n_i \times p) and (n_i \times q) dimensional matrices of known covariates, \bm\beta is a p-dimensional vector of regression parameters, called the fixed effects, D is a general (q \times q) covariance matrix, and \Sigma_i is an (n_i \times n_i) covariance matrix which depends on i only through its dimension n_i, i.e., the set of unknown parameters in \Sigma_i does not depend upon i. Finally, \bi is a q-dimensional vector of subject-specific or random effects.
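The two-stage formulation above can be read as a recipe for generating data. The following is a minimal sketch, assuming NumPy; the measurement times, parameter values, and the function name `simulate_unit` are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_unit(X, Z, beta, D, Sigma, rng):
    """Draw (y_i, b_i) for one unit from the two-stage model."""
    q = Z.shape[1]
    # Stage 2: random effects b_i ~ N(0, D)
    b = rng.multivariate_normal(np.zeros(q), D)
    # Stage 1: y_i | b_i ~ N(X_i beta + Z_i b_i, Sigma_i)
    y = rng.multivariate_normal(X @ beta + Z @ b, Sigma)
    return y, b

rng = np.random.default_rng(0)

# Hypothetical unit with n_i = 4 irregularly spaced measurement times.
t = np.array([0.0, 1.0, 2.5, 4.0])
X = np.column_stack([np.ones_like(t), t])  # fixed intercept and slope (p = 2)
Z = X.copy()                               # random intercept and slope (q = 2)
beta = np.array([1.0, 0.5])                # illustrative fixed effects
D = np.array([[1.0, 0.2],
              [0.2, 0.3]])                 # illustrative random-effects covariance
Sigma = 0.25 * np.eye(len(t))              # conditional independence

y, b = simulate_unit(X, Z, beta, D, Sigma, rng)
```

Because each unit carries its own X_i and Z_i, units with different numbers of measurements or different time points can be simulated by calling the function with matrices of different row dimension.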

The above model can be interpreted as a linear regression model for the vector \yi of repeated measurements for each unit separately, where some of the regression parameters are unit-specific (random effects, \bi), while others are not (fixed effects, \bm\beta). The distributional assumptions in \ref{lin mix eff model 1} and \ref{lin mix eff model 2} with respect to the random effects can be motivated as follows. First, \mbox{E}(\bi)=\bm{0} implies that the mean of \yi still equals X_i\bm\beta, such that the fixed effects in the random-effects model \ref{lin mix eff model 1} can also be interpreted marginally. Not only do they reflect the effect of changing covariates within specific units, they also measure the marginal effect in the population of changing the same covariates. Second, the normality assumption immediately implies that, marginally, \yi also follows a normal distribution with mean vector X_i \bm\beta and with covariance matrix V_i = Z_i D Z_i' + \Sigma_i.
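The marginal mean X_i \bm\beta and covariance Z_i D Z_i' + \Sigma_i can be checked numerically by simulating many units that share the same design. This is a rough Monte Carlo sketch, assuming NumPy; all numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical design shared by all simulated units.
t = np.array([0.0, 1.0, 2.5, 4.0])
X = np.column_stack([np.ones_like(t), t])
Z = X.copy()
beta = np.array([1.0, 0.5])
D = np.array([[1.0, 0.2],
              [0.2, 0.3]])
Sigma = 0.25 * np.eye(len(t))

# Marginal covariance implied by the model.
V = Z @ D @ Z.T + Sigma

# Simulate N independent units from the two-stage model.
N = 200_000
b = rng.multivariate_normal(np.zeros(2), D, size=N)             # shape (N, q)
eps = rng.multivariate_normal(np.zeros(len(t)), Sigma, size=N)  # shape (N, n_i)
Y = X @ beta + b @ Z.T + eps                                    # shape (N, n_i)

# Compare empirical moments with the marginal moments.
max_mean_err = np.max(np.abs(Y.mean(axis=0) - X @ beta))
max_cov_err = np.max(np.abs(np.cov(Y, rowvar=False) - V))
```

Both discrepancies should shrink toward zero as N grows, which is the marginal interpretation of the fixed effects in numerical form.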

Note that the random effects in \ref{lin mix eff model 1} imply that the marginal covariance matrix V_i of \yi takes the very specific form V_i=Z_i D Z_i' + \Sigma_i. As an example, assume conditional independence, i.e., \Sigma_i=\sigma^2 I_{n_i}, and consider the case where the random effects are univariate and represent unit-specific intercepts. This corresponds to covariates Z_i which are n_i-dimensional vectors containing only ones. Writing d for the scalar random-intercept variance D, the implied covariance matrix is V_i = d \bm{1}\bm{1}' + \sigma^2 I_{n_i}, i.e., compound symmetry: every measurement has variance d + \sigma^2 and every pair of measurements within a unit has covariance d.
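For the random-intercept case, the implied V_i can be computed directly. The sketch below assumes NumPy and uses hypothetical values for the random-intercept variance (`d`) and the residual variance (`sigma2`); neither value comes from the article.

```python
import numpy as np

n_i = 4        # number of measurements for this unit (illustrative)
d = 2.0        # hypothetical random-intercept variance (the 1 x 1 matrix D)
sigma2 = 0.5   # hypothetical residual variance

Z = np.ones((n_i, 1))   # n_i-dimensional vector of ones
D = np.array([[d]])

# V_i = Z_i D Z_i' + sigma^2 I
V = Z @ D @ Z.T + sigma2 * np.eye(n_i)

# Diagonal entries equal d + sigma2, off-diagonal entries equal d.
off_diagonal = V[~np.eye(n_i, dtype=bool)]

# Implied correlation between any two measurements within a unit.
within_unit_corr = d / (d + sigma2)
```

All pairs of measurements within a unit share the same covariance regardless of how far apart they were taken, which is what makes the random-intercept structure parsimonious but restrictive.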

The marginal model implied by expressions \ref{lin mix eff model 1} and \ref{lin mix eff model 2} is \begin{eqnarray*} \yi & \sim & N(X_i \bm\beta, V_i), \quad V_i = Z_i D Z_i' + \Sigma_i \end{eqnarray*} which can be viewed as a multivariate linear regression model, with a very particular parameterization of the covariance matrix V_i.
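Viewing the marginal model as a regression with covariance V_i suggests a generalized least squares form for the fixed effects when the variance components are treated as known. The following is a sketch under that assumption, with \Sigma_i = \sigma^2 I_{n_i}; the function name `gls_beta` and all data are hypothetical, and estimation of the variance components themselves is not shown.

```python
import numpy as np

def gls_beta(ys, Xs, Zs, D, sigma2):
    """GLS estimate of beta for known D and sigma^2 (Sigma_i = sigma^2 I)."""
    p = Xs[0].shape[1]
    A = np.zeros((p, p))   # accumulates sum_i X_i' V_i^{-1} X_i
    c = np.zeros(p)        # accumulates sum_i X_i' V_i^{-1} y_i
    for y, X, Z in zip(ys, Xs, Zs):
        V = Z @ D @ Z.T + sigma2 * np.eye(len(y))
        Vinv_X = np.linalg.solve(V, X)   # V_i^{-1} X_i without forming the inverse
        A += X.T @ Vinv_X
        c += Vinv_X.T @ y                # valid because V_i is symmetric
    return np.linalg.solve(A, c)

# Three hypothetical units with unbalanced measurement times.
times = [np.array([0.0, 1.0, 3.0]),
         np.array([0.5, 2.0]),
         np.array([0.0, 1.5, 2.5, 4.0])]
Xs = [np.column_stack([np.ones_like(t), t]) for t in times]
Zs = [np.ones((len(t), 1)) for t in times]   # random intercepts only

# Noise-free responses, so the estimate should reproduce beta_true.
beta_true = np.array([1.0, 0.5])
ys = [X @ beta_true for X in Xs]

beta_hat = gls_beta(ys, Xs, Zs, D=np.array([[2.0]]), sigma2=0.5)
```

Each unit contributes through its own V_i, so no balance across units is required.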

With respect to the estimation of the unit-specific parameters \bi, the posterior distribution of \bi given the observed data \yi can be shown to be (multivariate) normal with mean vector equal to D Z_i' V_i^{-1}(\bm\alpha) (\yi - X_i \bm\beta), where \bm\alpha denotes the vector of variance components in D and \Sigma_i on which V_i depends. Replacing \bm\beta and \bm\alpha by their maximum likelihood estimates, we obtain the so-called empirical Bayes (EB) estimates \widehat{\bi} for the \bi. A key property of these EB estimates is shrinkage, which is best illustrated by considering the prediction \widehat{\yi} \equiv X_i \widehat{\bm\beta} + Z_i \widehat{\bi} of the ith profile. It can easily be shown that \begin{eqnarray*} \widehat{\yi} & = & \Sigma_i V_i^{-1} X_i \widehat{\bm\beta} + \left(I_{n_i} - \Sigma_i V_i^{-1} \right) \yi, \end{eqnarray*} which can be interpreted as a weighted average of the population-averaged profile X_i \widehat{\bm\beta} and the observed data \yi, with weights \Sigma_i V_i^{-1} and I_{n_i} - \Sigma_i V_i^{-1}, respectively. Note that the "numerator" of \Sigma_i V_i^{-1} represents within-unit variability and the "denominator" is the overall covariance matrix V_i. Hence, much weight will be given to the overall average profile if the within-unit variability is large in comparison to the between-unit variability (modeled by the random effects), whereas much weight will be given to the observed data if the opposite is true. This phenomenon is referred to as shrinkage toward the average profile X_i \widehat{\bm\beta}. An immediate consequence of shrinkage is that the EB estimates show less variability than is actually present in the random-effects distribution, i.e., for any linear combination \bm\lambda' \bi of the random effects, \begin{eqnarray*} \mbox{var}(\bm\lambda' \widehat{\bi}) & \leq & \mbox{var}(\bm\lambda' \bi) = \bm\lambda' D \bm\lambda. \end{eqnarray*}
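The shrinkage identity and the variance inequality can be verified numerically for a single unit. This is a minimal sketch, assuming NumPy and treating \bm\beta, D and \Sigma_i as known plug-in values rather than maximum likelihood estimates; the observations and the function name `eb_random_effects` are made up for illustration.

```python
import numpy as np

def eb_random_effects(y, X, Z, beta, D, Sigma):
    """Posterior mean D Z' V^{-1} (y - X beta), with V = Z D Z' + Sigma."""
    V = Z @ D @ Z.T + Sigma
    return D @ Z.T @ np.linalg.solve(V, y - X @ beta)

# Hypothetical unit with four measurements.
t = np.array([0.0, 1.0, 2.5, 4.0])
X = np.column_stack([np.ones_like(t), t])
Z = X.copy()
beta = np.array([1.0, 0.5])          # plug-in value, assumed known here
D = np.array([[1.0, 0.2],
              [0.2, 0.3]])
Sigma = 0.25 * np.eye(len(t))
y = np.array([1.8, 2.9, 4.1, 6.0])   # made-up observations

V = Z @ D @ Z.T + Sigma
b_hat = eb_random_effects(y, X, Z, beta, D, Sigma)

# Predicted profile, computed two ways.
y_hat = X @ beta + Z @ b_hat
W = Sigma @ np.linalg.inv(V)         # weight on the population-averaged profile
y_hat_weighted = W @ (X @ beta) + (np.eye(len(t)) - W) @ y
identity_holds = np.allclose(y_hat, y_hat_weighted)

# With known parameters, var(b_hat) = D Z' V^{-1} Z D. The difference
# D - var(b_hat) should be positive semidefinite, which is the matrix
# form of var(lambda' b_hat) <= lambda' D lambda for every lambda.
var_b_hat = D @ Z.T @ np.linalg.solve(V, Z @ D)
gap_eigenvalues = np.linalg.eigvalsh(D - var_b_hat)
```

Increasing the diagonal of `Sigma` relative to `D` moves `W` toward the identity, so the prediction moves toward the population-averaged profile.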


References

[1] Fitzmaurice, G.M., Davidian, M., Verbeke, G., and Molenberghs, G. (2009). Longitudinal Data Analysis. Handbook. Hoboken, NJ: John Wiley & Sons.
[2] Fitzmaurice, G.M., Laird, N.M., and Ware, J.H. (2004). Applied Longitudinal Analysis. New York: John Wiley & Sons.
[3] Henderson, C.R. (1984). Applications of Linear Models in Animal Breeding. Guelph, Canada: University of Guelph Press.
[4] Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. New York: Springer.
Acknowledgment

Based on an article from Lovric, Miodrag (2011), International Encyclopedia of Statistical Science. Heidelberg: Springer Science+Business Media, LLC.


How to Cite This Entry:
Linear mixed model. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Linear_mixed_model&oldid=38546