Box-Cox transformation

Transformations of data designed to achieve a specified purpose, e.g., stability of variance, additivity of effects and symmetry of the density. If a suitable transformation can be found, the ordinary methods of analysis become available for the transformed data. Among the many parametric families of transformations, the family introduced in [a1] is commonly used.

Let $ X $ be a random variable on the positive half-line. Then the Box–Cox transformation of $ X $ with power parameter $ \lambda $ is defined by:

$$ X ^ {( \lambda ) } = \left \{ \begin{array}{l} { { \frac{X ^ \lambda - 1 } \lambda } \ \textrm{ for } \lambda \neq0, } \\ { { \mathop{\rm log} } X \ \textrm{ for } \lambda = 0. } \end{array} \right . $$

The formula $ { {( x ^ \lambda - 1 ) } / \lambda } $ is chosen so that $ x ^ {( \lambda ) } $ is continuous as $ \lambda $ tends to zero and monotone increasing with respect to $ x $ for any $ \lambda $.
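The definition and its behaviour near $ \lambda = 0 $ can be illustrated with a minimal sketch. The function name `box_cox` and the sample value of `x` are chosen here for illustration only; the use of `math.expm1` is a numerical convenience (it evaluates $ x ^ \lambda - 1 $ accurately for small $ \lambda $) and is not part of the definition.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of a positive number x with power parameter lam."""
    if x <= 0:
        raise ValueError("x must be positive")
    if lam == 0:
        return math.log(x)
    # x**lam - 1 == exp(lam * log(x)) - 1; expm1 avoids cancellation for small lam
    return math.expm1(lam * math.log(x)) / lam

x = 2.5  # illustrative value
for lam in (1.0, 0.5, 1e-3, 1e-6, 0.0):
    # the values approach log(x) as lam tends to zero
    print(lam, box_cox(x, lam))
```

As $ \lambda $ decreases to zero the printed values approach $ { \mathop{\rm log} } x $, and for fixed $ \lambda $ (including negative values) the function is increasing in $ x $.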

The power parameter $ \lambda $ is estimated by a graphical technique or by the maximum-likelihood method. Unfortunately, a closed form for the estimator $ {\widehat \lambda } $ can rarely be found. Hence, a plot of the maximized likelihood against $ \lambda $ is helpful. The value of $ {\widehat \lambda } $ obtained in this way is then treated as if it were the true value, and the model is fitted to the transformed data. Such an approach is easy to carry out, and the asymptotic theory for the other parameters remains useful. See [a1] and [a3].
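The likelihood-based procedure can be sketched as a grid search. The sketch assumes the simplest setting, in which the transformed observations are independent and normal with common mean and variance (no covariates); in that case the log-likelihood maximized over the mean and variance is, up to an additive constant, $ - ( n / 2 ) { \mathop{\rm log} } {\widehat \sigma } ^ {2} ( \lambda ) + ( \lambda - 1 ) \sum { \mathop{\rm log} } x _ {i} $. The function names, the grid and the synthetic data are illustrative assumptions, not taken from [a1].

```python
import math
from statistics import NormalDist

def box_cox(x, lam):
    """Box-Cox transform of a positive number x."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def profile_loglik(data, lam):
    """Log-likelihood maximized over mean and variance (up to a constant),
    assuming the transformed data are i.i.d. normal."""
    n = len(data)
    z = [box_cox(x, lam) for x in data]
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / n  # ML estimate of the variance
    log_jacobian = (lam - 1.0) * sum(math.log(x) for x in data)
    return -0.5 * n * math.log(var) + log_jacobian

def estimate_lambda(data, grid):
    """Grid value of lam with the largest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(data, lam))

# Synthetic data (assumption): exponentials of standard normal quantiles,
# so that log(data) is close to normal and lam near 0 is expected.
n = 200
data = [math.exp(NormalDist().inv_cdf((i + 0.5) / n)) for i in range(n)]
grid = [k / 100 for k in range(-200, 201)]

lam_hat = estimate_lambda(data, grid)
print("estimated lambda:", lam_hat)
```

Plotting `profile_loglik(data, lam)` over the grid gives the likelihood plot mentioned in the text; the grid search only replaces reading the maximum off that plot.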

This treatment has, however, some difficulties, because $ {\widehat \lambda } $ is itself subject to sampling variability and depends on the given data. It is known that estimation of $ \lambda $ by maximum likelihood and the related likelihood-ratio tests can be heavily influenced by outliers (cf. also Outlier). Further, in certain situations, the usual limiting theory based on a known $ \lambda $ does not hold when $ \lambda $ is unknown. Therefore, several robust estimation procedures have been proposed (see Robust statistics; and [a5] and references therein).

In the literature, Box–Cox transformations are applied to basic distributions, e.g., the cube-root transformation of chi-squared variates is used to accelerate convergence to normality (cf. also Normal distribution), and the square-root transformation stabilizes the variance of Poisson distributions (cf. also Poisson distribution). These results are unified by appealing to features of the following family of distributions.

Consider a collection of densities of the form

$$ a ( x; \phi ) { \mathop{\rm exp} } \left [ { \frac{\theta x - \kappa _ \alpha ( \theta ) } \phi } \right ] $$

satisfying $ \kappa _ \alpha ^ {\prime \prime } ( \theta ) = \kappa _ \alpha ^ \prime ( \theta ) ^ {p} $ with $ p = { {( 2 - \alpha ) } / {( 1 - \alpha ) } } $. This family is called an exponential dispersion model with power variance function (EDM-PVF) of index $ \alpha $. The existence of such a model was shown in [a2] except when $ \alpha > 2 $ or $ \alpha = 1 $. It is a flexible family that includes the normal, Poisson, gamma and inverse Gaussian distributions, among others.

It is known that both the normalizing and the variance-stabilizing transformations of the exponential dispersion model with power variance function are given by Box–Cox transformations, see [a4]. If $ Y $ follows the exponential dispersion model with power variance function and with index $ \alpha $, the normalizing and variance-stabilizing transformations are given by $ Y ^ {( q ) } $ and $ Y ^ {( r ) } $, respectively, where $ q $ (the power for normalization) and $ r $ (the power for variance-stabilization) are summarized in the Table below (recall that $ p = { {( 2 - \alpha ) } / {( 1 - \alpha ) } } $). The corresponding characteristics of familiar distributions are also tabulated there. For $ 0 < \alpha < 1 $, it has been proved in [a4] that the density of $ Y ^ {( q ) } $ has a uniformly convergent Gram–Charlier expansion (cf. also Gram–Charlier series). This implies that the normalizing transformation, which is obtained by reducing the third-order cumulant, reduces all higher-order cumulants as well (cf. also Cumulant).

Distribution	index $ \alpha $	$ p $	$ q $	$ r $
Normal	$ 2 $	$ 0 $	$ 1 $	$ 1 $
Poisson	$ - \infty $	$ 1 $	$ {2 / 3 } $	$ {1 / 2 } $
Gamma	$ 0 $	$ 2 $	$ {1 / 3 } $	$ 0 $
Inverse Gaussian	$ {1 / 2 } $	$ 3 $	$ 0 $	$ - {1 / 2 } $
EDM-PVF	$ \alpha $	$ { \frac{2 - \alpha }{1 - \alpha } } $	$ { \frac{1 -2 \alpha }{3 -3 \alpha } } $	$ - { \frac \alpha {2 -2 \alpha } } $
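The tabulated powers can be reproduced from the formulas in the EDM-PVF row, and the Poisson entry $ r = {1 / 2 } $ can be checked numerically. The following is a sketch under stated assumptions: the helper names `edm_powers` and `poisson_stabilized_variance` are hypothetical, exact rational arithmetic is used only for convenience, and the Poisson row is treated as the limit $ \alpha \rightarrow - \infty $. The identities $ q = 1 - {p / 3 } $ and $ r = 1 - {p / 2 } $ asserted in the code follow algebraically from the formulas in the table; they are not stated in that form in the text.

```python
import math
from fractions import Fraction

def edm_powers(alpha):
    """Return (p, q, r) for an EDM-PVF of index alpha, using the table formulas."""
    alpha = Fraction(alpha)
    if alpha == 1:
        raise ValueError("alpha = 1 is excluded")
    p = (2 - alpha) / (1 - alpha)
    q = (1 - 2 * alpha) / (3 - 3 * alpha)
    r = -alpha / (2 - 2 * alpha)
    return p, q, r

rows = {"Normal": 2, "Gamma": 0, "Inverse Gaussian": Fraction(1, 2)}
for name, alpha in rows.items():
    p, q, r = edm_powers(alpha)
    # algebraic consequences of the table formulas
    assert q == 1 - p / 3 and r == 1 - p / 2
    print(f"{name}: p={p}, q={q}, r={r}")

def poisson_stabilized_variance(mu, r=0.5):
    """Variance of (X**r - 1) / r for X ~ Poisson(mu), r > 0,
    computed by summing the probability mass function over a wide range."""
    upper = int(mu + 20 * math.sqrt(mu) + 50)  # remaining tail mass is negligible
    m1 = m2 = 0.0
    for k in range(upper + 1):
        pmf = math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))
        t = (k ** r - 1.0) / r
        m1 += pmf * t
        m2 += pmf * t * t
    return m2 - m1 * m1

for mu in (20, 50, 100):
    # the variance stays close to 1 although the mean changes fivefold
    print(mu, poisson_stabilized_variance(mu))
```

For the Poisson case the untransformed variance equals the mean, so the nearly constant output of the second loop illustrates what variance stabilization means in practice.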

Box–Cox transformations are also applied to link functions in generalized linear models. Here the transformations mainly aim to achieve linearity of the effects of the covariates. See [a3] for further details. Generalized Box–Cox transformations for random variables and link functions can be found in [a5].

References

[a1]	G.E.P. Box, D.R. Cox, "An analysis of transformations", J. Roy. Statist. Soc. B, 26 (1964) pp. 211–252
[a2]	B. Jørgensen, "Exponential dispersion models", J. Roy. Statist. Soc. B, 49 (1987) pp. 127–162
[a3]	P. McCullagh, J.A. Nelder, "Generalized linear models", Chapman and Hall (1990) (Edition: Second)
[a4]	R. Nishii, "Convergence of the Gram–Charlier expansion after the normalizing Box–Cox transformation", Ann. Inst. Statist. Math., 45 : 1 (1993) pp. 173–186
[a5]	G.A.F. Seber, C.J. Wild, "Nonlinear regression", Wiley (1989)

How to Cite This Entry:
Box–Cox transformation. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Box%E2%80%93Cox_transformation&oldid=22178
