%%% Title of object: Generalized Quasi-likelihood (GQL) Inference
%%% Canonical Name: GeneralizedQuasiLikelihoodGQLInferences
%%% Type: Topic
%%% Created on: 2010-09-02 02:05:52
%%% Modified on: 2010-09-06 06:12:37
%%% Creator: brajendra
%%% Modifier: jkimmel
%%% Author: brajendra
%%%
%%% Classification: msc:62F10, msc:62H20
%%% Preamble:
\documentclass[10pt]{article}
% this is the default PlanetMath preamble. as your knowledge
% of TeX increases, you will probably want to edit this, but
% it should be fine as is for beginners.
% almost certainly you want these
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{amsfonts}
% used for TeXing text within eps files
%\usepackage{psfrag}
% need this for including graphics (\includegraphics)
%\usepackage{graphicx}
% for neatly defining theorems and propositions
%\usepackage{amsthm}
% making logically defined graphics
%\usepackage{xypic}
% there are many more packages, add them here as you need them
% define commands here
%%% Content:
%\documentstyle[12pt]{article}
%\renewcommand{\textwidth}{6.5in}
%\setlength{\oddsidemargin}{0in}
%\setlength{\evensidemargin}{\oddsidemargin}
%%\pagestyle{empty}
%%\renewcommand{\theequation}{\thesection.\arabic{equation}}
\begin{document}
%\setlength{\baselineskip}{20pt}
%\newcommand{\di}{\displaystyle}
\begin{center}
{\bf Generalized Quasi-likelihood (GQL) Inference*}\\
by Brajendra C. Sutradhar\\
{\it Memorial University}\\
{\it Email address:} bsutradh@mun.ca\\
\end{center}
\noindent {\bf QL Estimation for Independent Data.} For $i=1,\ldots,K,$ let $Y_i$ denote the response variable for the $i$th
individual, and $x_i=(x_{i1},\ldots,x_{iv},\ldots,x_{ip})'$ be the associated
$p-$dimensional covariate vector. Also, let $\beta$ be the $p-$dimensional
vector of regression effects of $x_i$ on $y_i.$ Further suppose that the
responses are collected from $K$ independent individuals. Clearly,
if the probability distribution of $Y_i$ is not known, then one cannot
use the well-known likelihood approach to estimate the underlying regression parameter $\beta.$
Next suppose that only two moments of the data, that is, the mean and the variance functions of the response variable
$Y_i$ for all $i=1,\ldots,K,$ are known, and for a known functional form $a(\cdot)$, these moments are given by
\begin{equation}
E[Y_i]=a'(\theta_i)\;\mbox{and}\; \mbox{var}[Y_i]=a''(\theta_i),
\end{equation}
where for a link function $h(\cdot),$ $\theta_i=h(x'_i\beta),$ and $a'(\theta_i)$ and $a''(\theta_i)$
are the first and second order derivatives of $a(\theta_i),$ respectively, with respect to
$\theta_i.$ For the estimation of the regression parameter vector $\beta$ under this independence set up,
Wedderburn (1974) (see also McCullagh (1983))
proposed to solve the so-called quasi-likelihood (QL) estimating equation given by
\begin{equation}
\sum^K_{i=1}[\frac{\partial a'(\theta_{i})}{\partial
\beta}\frac{(y_{i}-a'(\theta_{i}))}{a''(\theta_i)}]=0.
\end{equation}
Let $\hat{\beta}_{QL}$ be the QL estimator of $\beta$ obtained from (2). It is known that
this estimator is consistent and highly efficient. In fact, for Poisson and binary data, for
example, $\hat{\beta}_{QL}$ is equivalent to the maximum likelihood (ML)
estimator and hence it turns out to be an optimal estimator.\\
\noindent {\bf Illustration for the Poisson case:} For the Poisson data, one uses
\begin{equation}
a(\theta_{i})=\exp(\theta_{i})
\end{equation}
with identity link function $h(\cdot),$ that is,
$\theta_{i}=x'_{i}\beta.$ This gives the mean and the variance functions as
$$\mbox{var}(Y_{i})=a''(\theta_{i})=E(Y_i)=a'(\theta_{i})=\mu_{i}\;\mbox{(say)}=\exp(x'_{i}\beta),$$
yielding by (2), the QL estimating equation for $\beta$ as
\begin{equation}
\sum^K_{i=1}x_i(y_i-\mu_i)=0.
\end{equation}
Note that as the Poisson density is given by $f(y_i|x_i)=\frac{1}{y_i!}\exp[y_i\log(\mu_i)-\mu_i],$
with $\mu_i=\exp(\theta_i)=\exp(x'_i\beta),$ it follows that the log likelihood
function of $\beta$ has the form
$\log L(\beta)=-\sum^K_{i=1}\log(y_i!)+\sum^K_{i=1}[y_{i}\theta_{i}-a(\theta_{i})],$
yielding the likelihood equation for $\beta$ as
\begin{equation}
\frac{\partial \log L}{\partial
\beta}=\sum^K_{i=1}[y_{i}-a'(\theta_{i})]\frac{\partial
\theta_{i}}{\partial \beta}=\sum^K_{i=1}x_i(y_i-\mu_i)=0,
\end{equation}
which is the same as the QL estimating equation (4). Thus, if the likelihood function were known,
then the ML estimate of $\beta$ would be the same as the QL estimate $\hat{\beta}_{QL}.$\\
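Equation (4) has no closed-form solution in $\beta,$ so it is typically solved iteratively. The following Python sketch (all function and variable names, and the simulated data, are this illustration's own, not from the text) solves (4) by Newton-Raphson using the expected information matrix:

```python
import numpy as np

def ql_poisson(X, y, n_iter=50, tol=1e-8):
    """Solve the QL/ML estimating equation sum_i x_i (y_i - mu_i) = 0
    for Poisson responses, mu_i = exp(x_i' beta), by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)            # mu_i = exp(x_i' beta)
        score = X.T @ (y - mu)           # left-hand side of equation (4)
        info = X.T @ (mu[:, None] * X)   # expected information: sum_i mu_i x_i x_i'
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# usage on simulated data (illustrative values only)
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
beta_true = np.array([0.5, -0.3])
y = rng.poisson(np.exp(X @ beta_true))
beta_hat = ql_poisson(X, y)
```

At convergence the score $\sum_i x_i(y_i-\hat{\mu}_i)$ is numerically zero, mirroring (4) and (5).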
\noindent {\bf Illustration for the binary case:} For the binary data, one uses
\begin{equation}
a'(\theta_{i})=\frac{\exp(\theta_{i})}{1+\exp(\theta_{i})}=\mu_i\;\mbox{and}\;a''(\theta_i)=\mu_i(1-\mu_i),
\end{equation}
with $\theta_i=x'_i\beta.$ The QL estimating equation (2) for the binary data, however, provides the same formula (4) as in the Poisson case,
except that now for the binary case $\mu_i= \frac{\exp(\theta_{i})}{1+\exp(\theta_{i})},$ whereas for the Poisson case
$\mu_i=\exp(\theta_i).$
As far as the ML estimation for the binary case is concerned, one first writes the binary density given by
$f(y_i|x_i)={\mu_i}^{y_i}(1-\mu_i)^{1-y_i}.$
Next by writing the log likelihood function as
$\log L(\beta)=\sum^K_{i=1}y_i\log\mu_i+\sum^K_{i=1}(1-y_i)\log(1-\mu_i),$
one obtains the same likelihood estimating equation as in (5), except that
here $\mu_i= \frac{\exp(x'_i\beta)}{1+\exp(x'_i\beta)},$ under the binary model.
Since the QL estimating equation (4) is the same as the ML estimating equation (5),
it then follows that the ML and QL estimates for $\beta$
would also be the same for the binary data.\\
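A parallel sketch for the binary case (again with illustrative names and simulated data, not drawn from the text) replaces the Poisson mean by the logistic mean of (6) and uses the weights $a''(\theta_i)=\mu_i(1-\mu_i)$:

```python
import numpy as np

def ql_binary(X, y, n_iter=50, tol=1e-8):
    """Solve sum_i x_i (y_i - mu_i) = 0 with the logistic mean of (6),
    mu_i = exp(x_i' beta)/(1 + exp(x_i' beta)), by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))    # logistic mean mu_i
        score = X.T @ (y - mu)                    # estimating equation (4)
        w = mu * (1.0 - mu)                       # a''(theta_i) = mu_i(1 - mu_i)
        info = X.T @ (w[:, None] * X)
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# usage on simulated binary data (illustrative values only)
rng = np.random.default_rng(2)
X = np.column_stack([np.ones(800), rng.normal(size=800)])
beta_true = np.array([0.3, -0.5])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta_true))))
beta_hat = ql_binary(X, y)
```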
\noindent {\bf GQL Estimation: A Generalization of the QL Estimation to the Correlated Data.}
As opposed to the independence set up, we now consider $y_i$ as a vector of $T$ repeated
binary or count responses, collected from the $i-$th individual, for all $i=1,\ldots,K.$
Let $y_i=(y_{i1},\ldots,y_{it},\ldots,y_{iT})',$ where $y_{it}$ represents the response
recorded at time $t$ for the $i$th individual. Also, let $x_{it}=(x_{it1},\ldots,x_{itv},\ldots,x_{itp})'$ be the $p-$dimensional
covariate vector corresponding to the scalar $y_{it},$ and $\beta$ be the $p-$dimensional regression
effects of $x_{it}$ on $y_{it}$ for all $i=1,\ldots,K,$ and all $t=1,\ldots,T.$
Let $\mu_{it}$ and $\sigma_{itt}$ denote the mean and the variance of $Y_{it},$ that is, $\mu_{it}=E[Y_{it}]$ and
$\mbox{var}[Y_{it}]=\sigma_{itt}.$ Note that both $\mu_{it}$ and $\sigma_{itt}$ are functions of $\beta.$ But, when
the variance is a function of the mean, it is sufficient to estimate the $\beta$ involved in the mean function only, by treating
the $\beta$ involved in the variance function as known.
Further note that since the $T$ repeated responses of an
individual are likely to be correlated, the estimate of $\beta$ obtained by ignoring the correlations, that is, the solution
of the independence assumption based QL estimating equation
\begin{equation}
\sum^K_{i=1}\sum^T_{t=1}[\frac{\partial \mu_{it}}{\partial
\beta}\frac{(y_{it}-\mu_{it})}{\sigma_{itt}}]=0,
\end{equation}
for $\beta,$ will be consistent but inefficient. As a remedy to this inefficient estimation
problem, Sutradhar (2003) has proposed a generalization of the QL estimation approach, where
$\beta$ is now obtained by solving the GQL estimating equation given by
\begin{equation}
\sum^K_{i=1} \frac{\partial \mu'_i}{\partial \beta}{\Sigma_i}^{-1}(\rho )(y_i-\mu_i)=0,
\end{equation}
where $\mu_i=(\mu_{i1},\ldots,\mu_{it},\ldots,\mu_{iT})'$ is the mean vector of $Y_i,$ and
$\Sigma_i(\rho)$ is the covariance matrix of $Y_i$ that can be expressed as $ \Sigma_i(\rho )=A^{\frac{1}{2}}_iC_i(\rho
)A^{\frac{1}{2}}_i$, with $A_i=\mbox{diag}[\sigma_{i11},\ldots,\sigma_{itt},\ldots,\sigma_{iTT}]$ and $C_i(\rho)$ as the correlation matrix of $Y_i,$
$\rho$ being a correlation index parameter.
Note that the use of the GQL estimating equation (8) requires the structure of the correlation matrix $C_i(\rho)$ to be
known, which is, however, unknown in practice. To overcome this difficulty, Sutradhar (2003) has suggested a general
stationary auto-correlation structure given by
\begin{equation} C_i(\rho)=\left[
\begin{array}{ccccc}
1 & \rho_1 & \rho_2 & \cdots & \rho_{T-1} \\ [2ex]
\rho_1 & 1 & \rho_1 & \cdots & \rho_{T-2} \\
\vdots & \vdots & \vdots && \vdots \\
\rho_{T-1} & \rho_{T-2} & \rho_{T-3} & \cdots & 1 \\
\end{array} \right] ,
\end{equation}
(see also Sutradhar and Das (1999, Section 3)), for all $i=1,\ldots,K,$ where for
$\ell=1,\ldots,T-1,$ $\rho_\ell$ represents the lag $\ell$
auto-correlation. As far as the estimation of the lag correlations is concerned, they may be
consistently estimated by using the well known method of moments.
For $\ell =|u-t|$, $u\neq t$, $u, t=1,\ldots ,T$, the moment
estimator for the autocorrelation of lag $\ell$, $\rho_{\ell}$,
has the formula
\begin{equation}
\hat{\rho}_\ell =
\frac{\sum^K_{i=1}\sum^{T-\ell}_{t=1}\tilde{y}_{it}\tilde{y}_{i,t+\ell}
/K(T-\ell )}{\sum^K_{i=1}\sum^T_{t=1}\tilde{y}^2_{it}/KT} ,
\end{equation}
(Sutradhar and Kovacevic (2000, eqn. (2.18)), Sutradhar (2003)),
where $\tilde{y}_{it}$ is the standardized residual, defined as
$ \tilde{y}_{it}=(y_{it}-\mu_{it})/\{\sigma_{itt}
\}^{\frac{1}{2}}$.
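Formula (10) translates directly into code. The short Python sketch below (the function name is assumed for illustration) computes all $T-1$ lag correlations from a $K\times T$ array of standardized residuals:

```python
import numpy as np

def lag_correlations(r):
    """Moment estimator (10): r is the K x T array of standardized
    residuals (y_it - mu_it)/sigma_itt^{1/2}; returns rho_1,...,rho_{T-1}."""
    K, T = r.shape
    denom = np.sum(r**2) / (K * T)
    return np.array([np.sum(r[:, :T - l] * r[:, l:]) / (K * (T - l)) / denom
                     for l in range(1, T)])

# sanity check: perfectly correlated residuals give rho_ell = 1 at every lag
rng = np.random.default_rng(0)
r_perfect = np.tile(rng.normal(size=(4, 1)), (1, 5))
rho_hat = lag_correlations(r_perfect)
```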
The GQL estimating equation (8) for $\beta$ and the moment estimate of $\rho_\ell$ by
(10) are solved iteratively until convergence. The final estimate of $\beta$
obtained from this iterative process is referred to as the GQL estimate of $\beta,$ and may
be denoted by $\hat{\beta}_{GQL}.$ This estimator $\hat{\beta}_{GQL}$ is consistent for $\beta$ and also highly efficient;
the fully efficient ML estimator is, however, impossible or extremely complex to obtain in the
correlated data set up.
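The iteration just described can be sketched in a few lines of Python. Everything below (function and variable names, the simulated data) is illustrative, and for concreteness assumes Poisson margins with $\sigma_{itt}=\mu_{it}$ and a $\Sigma_i$ that remains positive definite:

```python
import numpy as np

def gql_poisson(X, Y, n_iter=30, tol=1e-6):
    """Iterate between the GQL estimating equation (8) for beta and the
    moment estimates (10) of the lag correlations rho_ell, for
    longitudinal Poisson counts. X: (K, T, p) covariates, Y: (K, T) counts.
    Illustrative sketch only."""
    K, T, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = np.exp(X @ beta)                     # mu_it = exp(x_it' beta)
        r = (Y - mu) / np.sqrt(mu)                # Poisson: sigma_itt = mu_it
        denom = np.sum(r**2) / (K * T)
        # moment estimates (10) fill the correlation matrix C_i(rho) of (9)
        C = np.eye(T)
        for l in range(1, T):
            rho_l = np.sum(r[:, :T - l] * r[:, l:]) / (K * (T - l)) / denom
            C += rho_l * (np.eye(T, k=l) + np.eye(T, k=-l))
        # one Newton step on the GQL estimating equation (8)
        score, info = np.zeros(p), np.zeros((p, p))
        for i in range(K):
            A_half = np.diag(np.sqrt(mu[i]))
            Sigma = A_half @ C @ A_half           # Sigma_i = A_i^{1/2} C_i A_i^{1/2}
            D = mu[i][:, None] * X[i]             # T x p matrix of d mu_it / d beta'
            score += D.T @ np.linalg.solve(Sigma, Y[i] - mu[i])
            info += D.T @ np.linalg.solve(Sigma, D)
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# usage on independently simulated counts with stationary covariates
rng = np.random.default_rng(3)
K, T = 300, 4
X = np.zeros((K, T, 2))
X[:, :, 0] = 1.0
X[:, :, 1] = rng.normal(size=K)[:, None]          # x_it = x_i. for all t
beta_true = np.array([0.4, -0.2])
Y = rng.poisson(np.exp(X @ beta_true))
beta_hat = gql_poisson(X, Y)
```

When the responses are truly independent, the estimated lag correlations shrink toward zero and the GQL step reduces to the QL step of equation (7).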
With regard to the generality of the stationary auto-correlation matrix $C_i(\rho)$ in (9), one may show that
this matrix, in fact, represents the correlations of many stationary dynamic models, such as the stationary auto-regressive order 1 (AR(1)),
stationary moving average order 1 (MA(1)), and stationary equi-correlation (EQC) models. For example,
consider the stationary AR(1) model given by
\begin{equation}
y_{it}=\rho * y_{i,t-1}+d_{it},
\end{equation}
(McKenzie (1988), Sutradhar (2003)) where it is assumed that for
given $y_{i,t-1}$, $\rho * y_{i,t-1}$ denotes the so-called
binomial thinning operation (McKenzie, 1988). That is,
\begin{equation}
\rho * y_{i,t-1} = \sum^{y_{i,t-1}}_{j=1}b_j(\rho ) = z_{i,t-1}, {\mbox{say}},
\end{equation}
with $\Pr [b_j(\rho )=1]=\rho$ and $\Pr [b_j(\rho )=0]=1-\rho$.
Furthermore, it is assumed in (11) that $y_{i1}$ follows the Poisson distribution with
mean parameter $\mu_{i\cdot},$ that is, $y_{i1}\sim Poi(\mu_{i\cdot}),$ where
$\mu_{i\cdot}=\exp(x'_{i\cdot}\beta)$ with stationary covariate vector $x_{i\cdot}$ such that $x_{it}=x_{i\cdot}$ for all $t=1,\ldots,T.$
Further, in (11), $d_{it} \sim
P(\mu_{i\cdot}(1-\rho ))$ and is independent of $z_{i,t-1}.$ This model in (11) yields the
mean, variance and auto-correlations of the data as shown in Table 1. Table 1 also
contains the MA(1) and EQC models and their basic properties including the correlation
structures.
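The thinning model (11)-(12) is straightforward to simulate, since $\rho * y_{i,t-1}$ given $y_{i,t-1}$ is simply a $\mbox{Binomial}(y_{i,t-1},\rho)$ draw. A Python sketch (function name and parameter values are illustrative, assuming a common stationary mean $\mu$ across individuals):

```python
import numpy as np

def simulate_ar1_counts(mu, rho, T, K, rng):
    """Simulate the stationary AR(1) count model (11):
    y_t = rho * y_{t-1} + d_t, where the thinning rho * y_{t-1} is
    Binomial(y_{t-1}, rho) and d_t ~ Poisson(mu (1 - rho))."""
    Y = np.empty((K, T), dtype=np.int64)
    Y[:, 0] = rng.poisson(mu, size=K)              # y_1 ~ Poisson(mu)
    for t in range(1, T):
        thinned = rng.binomial(Y[:, t - 1], rho)   # binomial thinning (12)
        Y[:, t] = thinned + rng.poisson(mu * (1 - rho), size=K)
    return Y

# the simulated series should show mean mu and lag-1 correlation rho
rng = np.random.default_rng(7)
Y = simulate_ar1_counts(mu=3.0, rho=0.5, T=6, K=40000, rng=rng)
```

Since the thinning of a $\mbox{Poisson}(\mu)$ count is $\mbox{Poisson}(\rho\mu)$ and $d_{it}$ is an independent $\mbox{Poisson}(\mu(1-\rho))$ draw, the marginal distribution stays $\mbox{Poisson}(\mu)$ at every $t$, matching the AR(1) row of Table 1.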
\begin{table}
\noindent {\bf Table 1.} A class of stationary correlation models for longitudinal count data
and basic properties.
\begin{center}
\begin{tabular}{ccc}
Model & Dynamic relationship & Mean-variance \\
&& \& Correlations \\ \hline
AR(1) & $y_{it}=\rho * y_{i,t-1}+d_{it}, t=2,\ldots$ & $E[Y_{it}]=\mu_{i\cdot}$ \\
& $y_{i1}\sim Poi(\mu_{i\cdot})$ & $\mbox{var}[Y_{it}]=\mu_{i\cdot}$ \\
& $d_{it} \sim P(\mu_{i\cdot}(1-\rho )), t=2,\ldots$& $\mbox{corr}[Y_{it},Y_{i,t+\ell}]=\rho_{\ell}$ \\
&& $=\rho^{\ell}$ \\ \hline
MA(1) & $y_{it}=\rho * d_{i,t-1}+d_{it}, t=2,\ldots$ & $E[Y_{it}]=\mu_{i\cdot}$ \\
& $y_{i1}=d_{i1} \sim Poi(\mu_{i\cdot}/(1+\rho))$ & $\mbox{var}[Y_{it}]=\mu_{i\cdot}$ \\
& $d_{it} \sim P(\mu_{i\cdot}/(1+\rho )), t=2,\ldots$& $\mbox{corr}[Y_{it},Y_{i,t+\ell}]=\rho_{\ell}$ \\
&& $=
\left\{ \begin{array}{ll}
\frac{\rho}{1+\rho} & \mbox{for } \ell=1\\
0 & \mbox{otherwise},
\end{array} \right.$
\\ \hline
EQC & $y_{it}=\rho * y_{i1}+d_{it}, t=2,\ldots$ & $E[Y_{it}]=\mu_{i\cdot}$ \\
& $y_{i1}\sim Poi(\mu_{i\cdot})$ & $\mbox{var}[Y_{it}]=\mu_{i\cdot}$ \\
& $d_{it} \sim P(\mu_{i\cdot}(1-\rho )), t=2,\ldots$& $\mbox{corr}[Y_{it},Y_{i,t+\ell}]=\rho_{\ell}$ \\
&& $=\rho$ \\ \hline
\end{tabular}
\end{center}
\end{table}
It is clear from Table 1 that the correlation structures for all three processes can be represented
by $C_i(\rho)$ in (9). By following Qaqish (2003), one may write similar but different dynamic models
for the repeated binary data, with their correlation structures represented by $C_i(\rho).$ Thus, if
the count or binary data follow this type of auto-correlation model, one may then estimate
the regression vector consistently and efficiently by solving the general auto-correlations matrix based
GQL estimating equation (8), where the lag correlations are
estimated by (10) consistently.\\
\noindent [* Reprinted with permission from Lovric, Miodrag (2011), International
Encyclopedia of Statistical Science. Heidelberg: Springer Science \& Business
Media, LLC]
\medskip
%\begin{center}
\section*{References}
%\end{center}
\begin{description}
\item[] McCullagh, P. (1983). Quasilikelihood functions. {\it Ann. Statist.} 11, 59-67.
\item[] McKenzie, E. (1988). Some ARMA models for dependent sequences of Poisson counts. {\it Advances in Applied Probability} 20, 822-835.
\item[] Qaqish, B. F. (2003). A family of multivariate binary distributions for simulating correlated binary variables with specified marginal means and correlations. {\it Biometrika} 90, 455-463.
\item[] Sutradhar, B. C. (2003). An overview on regression models for discrete longitudinal responses. {\it Statistical Science} 18, 377-393.
\item[] Sutradhar, B. C. \& Das, K. (1999). On the efficiency of regression estimators in generalized linear models for longitudinal data. {\it Biometrika} 86, 459-465.
\item[] Sutradhar, B. C. \& Kovacevic, M. (2000). Analyzing ordinal longitudinal survey data: Generalized estimating equations approach. {\it Biometrika} 87, 837-848.
\item[] Wedderburn, R. W. M. (1974). Quasi-likelihood functions, generalised linear models, and the Gauss-Newton method. {\it Biometrika} 61, 439-447.
\end{description}
\end{document}