Box-Cox transformation
Latest revision as of 06:29, 30 May 2020
Transformations of data designed to achieve a specified purpose, e.g., stabilization of variance, additivity of effects, or symmetry of the density. If a suitable transformation can be found, ordinary methods of analysis become available. Among the many parametric transformations, the family introduced in [a1] is the one most commonly used.
Let $ X $ be a random variable on the positive half-line. Then the Box–Cox transformation of $ X $ with power parameter $ \lambda $ is defined by:
$$ X ^ {( \lambda ) } = \begin{cases} \dfrac{X ^ \lambda - 1 }{\lambda} & \textrm{ for } \lambda \neq 0, \\ \log X & \textrm{ for } \lambda = 0. \end{cases} $$
The formula $ {( x ^ \lambda - 1 ) } / \lambda $ is chosen so that $ x ^ {( \lambda ) } $ is continuous as $ \lambda $ tends to zero and monotone increasing in $ x $ for every $ \lambda $.
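The definition and its continuity in $ \lambda $ can be illustrated in code (a minimal sketch; the function name `box_cox` is ours, not standard):

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of a positive value x with power parameter lam."""
    if lam == 0.0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

# Continuity at lam = 0: for small lam the two branches nearly agree.
print(box_cox(2.0, 0.0))   # log 2
print(box_cox(2.0, 1e-8))  # almost the same value
```

Libraries such as SciPy provide an equivalent transform (`scipy.stats.boxcox`), which also estimates $ \lambda $ by maximum likelihood.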
The power parameter $ \lambda $ is estimated by a graphical technique or by the maximum-likelihood method. Unfortunately, a closed form for the estimator $ {\widehat \lambda } $ can rarely be found. Hence a plot of the maximized likelihood against $ \lambda $ is helpful. The value of $ {\widehat \lambda } $ obtained in this way is treated as if it were the true value, and the model is then fitted to the transformed data. This approach is easy to carry out, and the asymptotic theory for the remaining parameters is useful. See [a1] and [a3].
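The grid-plot approach can be sketched as follows. This assumes the usual normal model for the transformed data, so the profile log-likelihood is $ - { \frac{n}{2} } \log {\widehat \sigma } {} ^ {2} ( \lambda ) + ( \lambda - 1 ) \sum \log x _ {i} $, where the second term is the Jacobian of the transformation; the data and grid are illustrative:

```python
import numpy as np

def box_cox(x, lam):
    return np.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def profile_loglik(x, lam):
    """Profile log-likelihood of lam under a normal model for the
    transformed data (additive constants dropped)."""
    n = len(x)
    z = box_cox(x, lam)
    sigma2 = z.var()  # MLE of the variance of the transformed data
    return -0.5 * n * np.log(sigma2) + (lam - 1.0) * np.log(x).sum()

rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=0.5, size=500)  # log-normal: true lam = 0

grid = np.linspace(-1.0, 1.0, 201)
lam_hat = grid[np.argmax([profile_loglik(x, l) for l in grid])]
print(lam_hat)  # should lie near 0 for log-normal data
```

Plotting `profile_loglik` over `grid` gives the graphical diagnostic described above; `lam_hat` is the grid maximizer.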
This treatment has, however, some difficulties, because $ {\widehat \lambda } $ is itself variable and depends on the data. It is known that maximum-likelihood estimation of $ \lambda $ and the related likelihood-ratio tests can be heavily influenced by outliers (cf. also Outlier). Further, in certain situations the usual limiting theory, which assumes $ \lambda $ known, does not hold when $ \lambda $ is unknown. Therefore, several robust estimation procedures have been proposed (see Robust statistics; and [a5] and references therein).
In the literature, Box–Cox transformations are applied to basic distributions, e.g., the cube-root transformation of chi-squared variates accelerates the approach to normality (cf. also Normal distribution), and the square-root transformation stabilizes the variance of Poisson distributions (cf. also Poisson distribution). These results are unified by appealing to features of the following family of distributions.
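The square-root case is easy to check numerically (an illustrative simulation; the means and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

# For Y ~ Poisson(mu), Var(Y) = mu, while by the delta method the
# variance of sqrt(Y) is approximately 1/4 regardless of mu.
stabilized = {}
for mu in (5.0, 20.0, 80.0):
    y = rng.poisson(mu, size=200_000)
    stabilized[mu] = np.sqrt(y).var()
    print(mu, y.var(), stabilized[mu])  # raw variance grows with mu
```

The raw variances grow in proportion to the mean, while the variances after the square-root transformation stay near $ {1 / 4 } $.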
Consider a collection of densities of the form
$$ a ( x; \phi ) \exp \left [ \frac{\theta x - \kappa _ \alpha ( \theta ) }{\phi} \right ] $$
satisfying $ \kappa _ \alpha ^ {\prime \prime } ( \theta ) = \kappa _ \alpha ^ \prime ( \theta ) ^ {p} $ with $ p = {( 2 - \alpha ) } / {( 1 - \alpha ) } $. This family is called an exponential dispersion model with power variance function (EDM-PVF) of index $ \alpha $. The existence of such a model was shown in [a2] except when $ \alpha > 2 $ or $ \alpha = 1 $. It is a flexible family, including the normal, Poisson, gamma and inverse Gaussian distributions, among others.
It is known that both the normalizing and the variance-stabilizing transformations of the exponential dispersion model with power variance function are given by Box–Cox transformations; see [a4]. If $ Y $ follows the exponential dispersion model with power variance function and with index $ \alpha $, the normalizing and variance-stabilizing transformations are given by $ Y ^ {( q ) } $ and $ Y ^ {( r ) } $, respectively, where $ q $ (the power for normalization) and $ r $ (the power for variance stabilization) are summarized in the table below (recall that $ p = {( 2 - \alpha ) } / {( 1 - \alpha ) } $). Similar characteristics of familiar distributions are also tabulated there. For $ 0 < \alpha < 1 $ it has been proved in [a4] that the density of $ Y ^ {( q ) } $ has a uniformly convergent Gram–Charlier expansion (cf. also Gram–Charlier series). This implies that the normalizing transformation, which is obtained by reducing the third-order cumulant, in fact reduces all higher-order cumulants (cf. also Cumulant).
| Distribution | index $ \alpha $ | $ p $ | $ q $ | $ r $ |
| Normal | $ 2 $ | $ 0 $ | $ 1 $ | $ 1 $ |
| Poisson | $ - \infty $ | $ 1 $ | $ {2 / 3 } $ | $ {1 / 2 } $ |
| Gamma | $ 0 $ | $ 2 $ | $ {1 / 3 } $ | $ 0 $ |
| Inverse Gaussian | $ {1 / 2 } $ | $ 3 $ | $ 0 $ | $ - {1 / 2 } $ |
| EDM-PVF | $ \alpha $ | $ \frac{2 - \alpha }{1 - \alpha } $ | $ \frac{1 - 2 \alpha }{3 - 3 \alpha } $ | $ - \frac \alpha {2 - 2 \alpha } $ |
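The rows of the table can be reproduced from the formulas in its last row (a small sketch; the helper name `pqr` is ours):

```python
from fractions import Fraction

def pqr(alpha):
    """Powers p, q, r for the EDM-PVF of index alpha (alpha != 1)."""
    a = Fraction(alpha)
    p = (2 - a) / (1 - a)
    q = (1 - 2 * a) / (3 - 3 * a)
    r = -a / (2 - 2 * a)
    return p, q, r

# Reproduce the rows for the normal (alpha = 2), gamma (alpha = 0)
# and inverse Gaussian (alpha = 1/2) cases.
print(pqr(2), pqr(0), pqr(Fraction(1, 2)))
```

Exact rational arithmetic makes the comparison with the table entries unambiguous; the Poisson row is recovered as the limit $ \alpha \rightarrow - \infty $.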
Box–Cox transformations are also applied to link functions in generalized linear models. There the transformations mainly aim to achieve linearity of the effects of covariates. See [a3] for further details. Generalized Box–Cox transformations for random variables and link functions can be found in [a5].
See also Exponential distribution; Regression.
References
[a1] G.E.P. Box, D.R. Cox, "An analysis of transformations", J. Roy. Statist. Soc. B 26 (1964) pp. 211–252
[a2] B. Jørgensen, "Exponential dispersion models", J. Roy. Statist. Soc. B 49 (1987) pp. 127–162
[a3] P. McCullagh, J.A. Nelder, "Generalized linear models", 2nd ed., Chapman and Hall (1990)
[a4] R. Nishii, "Convergence of the Gram–Charlier expansion after the normalizing Box–Cox transformation", Ann. Inst. Statist. Math. 45:1 (1993) pp. 173–186
[a5] G.A.F. Seber, C.J. Wild, "Nonlinear regression", Wiley (1989)
Box-Cox transformation. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Box-Cox_transformation&oldid=11455