Box and Cox (1964) considered the following family
of transformations indexed by lambda
:
w
= (y^lambda1)/lambda
= expm1(lambda*log(y))/lambda
,
with the lambda=0
case defined as
log(y)
to make w
continuous in
lambda
for constant y
.
They estimate lambda
assuming w
follows a normal distribution. This raises a
theoretical problem in that y
must be
positive, which means that w
must follow
a truncated normal distribution conditioned on
lambda*w
> (1)
.
Bickel and Doksum (1981) removed the restriction
to positive y
, i.e., to
w
> (1/lambda)
by modifying
the transformaton as follows:
w
=
(sgn(y)*abs(y)^lambda1)/lambda
if lambda != 0
and
sgn(y)*log(abs(y))
if lambda = 0
,
where sgn(y) = 1 if y >= 0 and 1 otherwise.
NOTE: sgn(y)
is different from
link[base]{sign}
(y), which is 0 for y = 0.
A twoargument update to the sign function
in the base package has been added to
this Ecfun package, so sign
(y, 1)
= sgn(y).
If (y<0), this transformation is discontinuous
at lambda = 0
. To see this, we rewrite
this as
w
=
(sgn(y)*expm1(lambda*log(abs(y))) +
(sgn(y)1)) / lambda
= sgn(y)*(log(abs(y)) + O(lambda) +
(sgn(y)1)/lambda
,
where O(lambda) indicates a term that is dominated by a constant times lambda.
If y<0, this latter term (sgn(y)1)/lambda = (2)/lambda and becomes Inf as lambda > 0.
In practice, we assume that y
> 0, so this
distinction has little practical value. However,
the BoxCox
function computes the
BickelDoksum version.
Box and Cox further noted that proper estimation
of lambda
should include the Jacobian of
the transformation in the log(likelihood).
Doing this can be achieved by rescaling the
transformation with the n
th root of
the Jacobian, which can be written as follows:
j(y, lambda)
=
J(y, lambda)^(1/n)
=
GeometricMean(y)^(lambda1)
.
With this the rescaled power transformation is as follows:
z
= (y^lambda1)/(lambda*j(y, lambda)
if lambda!=0
or GeometricMean(y)*log(y)
if lambda==0
.
In addition to facilitating estimation of
lambda
, rescaling has the advantage that
the units of z
are the same as the units
of y
.
The ouput has class 'BoxCox', which has
attributes that allow the input to be recovered
using invBoxCox
. The default values of
the arguments of invBoxCox
are provided
by the corresponding attributes
of z
.
1 2 
y 
a numeric vector for which the power transform is desired 
lambda 
A numeric vector of length 1 or 2. The first
component is the power. If the second
component is provided, 
rescale 
logical or numeric. If logical: For For If numeric, it is assumed to be the geometric mean of another set of y values to use with new y's. 
na.rm 
logical:
NOTE: If 
z 
a numeric vector or an object of class

sign.y 
an optional logical vector giving

GeometricMean 
an optional numeric scalar giving the
geometric mean of the data values that
presumably generated 
Box and Cox (1964) discussed
w(y, lambda) = (y^lambda  1)/lambda
.
They noted that w
is continuous in
lambda
with w(y, lambda) = log(y)
if lambda
= 0 (by l'Hopital's rule).
They also discussed
z(y, lambda) = (y^lambda  1)/(lambda*g^(lambda1))
,
where g
= the geometric mean of y
.
They noted that proper estimation of
lambda
should include the Jacobian of
w(y, lambda) with the likelihood. They further
showed that a naive normal likelihood using
z(y, lambda)
as the response without a
Jacobian is equivalent to the normal likelihood
using w(y, lambda)
adjusted appropriately
using the Jacobian. See Box and Cox (1964) or
the Wikipedia article on "Power transform".
Bickel and Doksum (1981) suggested adding
sign(y)
to the transformation, as
discussed above.
NUMERICAL ANALYSIS:
Consider the Bickel and Doksum version described above:
w
< (sign(y)*abs(y)^lambda1)/lambda
if(any(y==0)), GeometricMean(y) = 0. This creates a problem with the above math.
Let ly = log(abs(y)). Then with la = lambda,
w
= code(sign(y)*exp(la*ly)1)/la
= sign(y)*ly(1+(la*ly/2)*(1+(la*ly/3)*(1+(la*ly/4)*(1+O(la*ly)))))
+ (sign(y)1)/la
For y>0, the last term is zero.
boxcox
ignores cases with
y<=0 and uses this formula (ignoring the
final O(la*ly)) whenever abs(la) <= eps = 1/50.
That form is used here also.
For invBoxCox
a complementary analysis is
as follows:
abs(y+lambda[2]) = abs(1+la*w)^(1/la)
= exp(log1p(la*w)/la) for abs(la*w)<1
= w*(1la*w*((1/2)la*w*((1/3)la*w*(1/4...))))
BoxCox
returns an object of class
BoxCox
, being a numeric vector of the
same length as y
with the following
optional attributes:
lambdathe value of lambda used in the transformation
sign.y sign(y) (or sign(ylambda[2]) lambda[2] is provided and if any of these quantities are negative. Otherwise, this is omitted and all are assumed to be positive.
rescale
logical:
TRUE
if z(y, lambda)
is returned rescaled by g^(lambda1)
with g = the geometric mean of y
and FALSE
if z(y, lambda)
is not
so rescaled.
GeometricMean
If rescale
is numeric,
attr(., 'GeometricMean') < rescale
.
Otherwise, attr(., 'GeometricMean')
is
the Geometric mean of abs(y) =
exp(mean(log(abs(y)))) or of
abs(y+lambda[2]) if(length(lambda)>1).
invBoxCox
returns a numeric vector,
reconstructing y
from
BoxCox(y, ...)
.
Bickel, Peter J., and Doksum, Kjell A. (1981) "An analysis of transformation revisited", Journal of the American Statistical Association, 76 (374): 296311
Box, George E. P.; Cox, D. R. (1964). "An analysis of transformations", Journal of the Royal Statistical Society, Series B 26 (2): 211252.
Box, George E. P.; Cox, D. R. (1982). "An analysis of transformations revisited, rebutted", Journal of the American Statistical Association, 77(377): 209210.
boxcox
in the MASS package
quine
in the MASS package for data used in an example below.
boxcox
and
boxcoxCensored
in the
EnvStats package.
boxcox.drc
in the drc package.
boxCox
in the car package.
These other uses all wrap the BoxCox transformation in something larger and do not give the transformation itself directly.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96  ##
## 1. A simple example to check the two algorithms
##
Days < 0:9
bc1 < BoxCox(Days, c(0.01, 1))
# Taylor expansion used for obs 1:7; expm1 for 8:10
# check
GM < exp(mean(log(abs(Days+1))))
bc0 < (((Days+1)^0.01)1)/0.01
bc1. < (bc0 / (GM^(0.011)))
# log(Days+1) ranges from 0 to 4.4
# lambda = 0.01 will invoke both the obvious
# algorithm and the alternative assumed to be
# more accurate for (lambda(log(y)) < 0.02).
attr(bc1., 'lambda') < c(0.01, 1)
attr(bc1., 'rescale') < TRUE
attr(bc1., 'GeometricMean') < GM
class(bc1.) < 'BoxCox'
all.equal(bc1, bc1.)
##
## 2. The "boxcox" function in the MASS package
## computes a maximum likelihood estimate with
## BoxCox(Days+1, lambda=0.21)
## with a 95 percent confidence interval of
## approximately (0.08, 0.35)
##
bcDays1 < BoxCox(MASS::quine$Days, c(0.21, 1))
# check
GeoMean < exp(mean(log(abs(MASS::quine$Days+1))))
bcDays1. < ((((MASS::quine$Days+1)^0.21)1) /
(0.21*GeoMean^(0.211)))
# log(Days+1) ranges from 0 to 4.4
attr(bcDays1., 'lambda') < c(0.21, 1)
attr(bcDays1., 'rescale') < TRUE
attr(bcDays1., 'GeometricMean') < GeoMean
class(bcDays1.) < 'BoxCox'
all.equal(bcDays1, bcDays1.)
iDays < invBoxCox(bcDays1)
all.equal(iDays, MASS::quine$Days)
##
## 3. Easily computed example
##
bc2 < BoxCox(c(1, 4), 2)
# check
bc2. < (c(1, 4)^21)/4
attr(bc2., 'lambda') < 2
attr(bc2., 'rescale') < TRUE
attr(bc2., 'GeometricMean') < 2
class(bc2.) < 'BoxCox'
all.equal(bc2, bc2.)
all.equal(invBoxCox(bc2), c(1, 4))
##
## 4. plot(BoxCox())
##
y0 < seq(2, 2, .1)
z2 < BoxCox(y0, 2, rescale=FALSE)
plot(y0, z2)
# check
z2. < (sign(y0)*y0^21)/2
attr(z2., 'lambda') < 2
attr(z2., 'sign.y') < sign(y0, 1)
attr(z2., 'rescale') < FALSE
attr(z2., 'GeometricMean') < 0
class(z2.) < 'BoxCox'
all.equal(z2, z2.)
all.equal(invBoxCox(z2), y0)

Questions? Problems? Suggestions? Tweet to @rdrrHQ or email at ian@mutexlabs.com.
All documentation is copyright its authors; we didn't write any of that.