ez.boxcox: box-cox power transformation GDoc Note

ez.boxcox    R Documentation



  col = NULL,
  na.rm = FALSE,
  plot = TRUE,
  print2scr = TRUE,
  force = TRUE,
  method = c("boxcox", "modified.tukey"),
  precise = c("rounded", "raw"),



a data frame or a vector


col: passed to ez.selcol.
If x is a data frame and col is specified, only those columns are processed.
If x is a data frame and col is unspecified (the default, NULL), all columns are processed.
If x is not a data frame, col is ignored.
col may name multiple columns.


na.rm: if TRUE, remove NA from y, x (pairwise); if FALSE, NA stays as is. Applicable only if y is a vector.


plot: draw the Box-Cox plot. Applicable only when a transformation is actually performed.


print2scr: print the transformation parameters to the screen.


force: TRUE = transform regardless; FALSE = transform only if the rounded p.lambda is less than .05.


method: "boxcox" computes out = car::bcPower(y, lambda = lambda.in.use, jacobian.adjusted = FALSE, gamma = NULL) when all values are positive, and out = car::bcnPower(y, lambda = lambda.in.use, jacobian.adjusted = FALSE, gamma = gamma) when any value is non-positive (i.e., zero or negative).

The selection between bcPower and bcnPower is done automatically by this function.

Here bcPower is: ((x + gamma)^lambda - 1)/lambda if lambda is not 0; log(x + gamma) if lambda is 0. gamma = NULL means gamma = 0.
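The bcPower formula above can be sketched in base R as follows. This is a minimal illustration of the formula only, not the car implementation; the name bc_power and the error check are assumptions made for the example.

```r
# Base-R sketch of the bcPower formula quoted above (illustrative only;
# bc_power is a hypothetical helper, not part of car or ezmisc).
bc_power <- function(x, lambda, gamma = NULL) {
  if (is.null(gamma)) gamma <- 0            # gamma NULL means 0
  shifted <- x + gamma
  if (any(shifted <= 0, na.rm = TRUE)) {
    stop("x + gamma must be strictly positive")
  }
  if (lambda == 0) log(shifted) else (shifted^lambda - 1) / lambda
}

y <- c(1, 2, 4, 8)
bc_power(y, lambda = 0)     # identical to log(y)
bc_power(y, lambda = 1)     # y - 1, i.e. a shift only
bc_power(y, lambda = 0.5)   # 2 * (sqrt(y) - 1)
```

With lambda = 1 the data are only shifted, which is why a lambda near 1 is usually read as "no transformation needed".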

bcnPower is: ((0.5 * (x + sqrt(x^2 + gamma^2)))^lambda - 1)/lambda if lambda is not 0; log(0.5 * (x + sqrt(x^2 + gamma^2))) if lambda is 0. This is the Box-Cox family with negatives allowed of Hawkins and Weisberg (2017). The transformed data can be interpreted in much the same way as Box-Cox-transformed data, and the result is much less biased than setting the parameter gamma to a non-zero value in the Box-Cox family.
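A base-R sketch of the bcnPower formula, together with the kind of automatic selection described above, might look like this. The names bcn_power and bc_auto are hypothetical, and the default gamma = 1 is an assumption for the example only; this page does not say how gamma is chosen.

```r
# Base-R sketch of the bcnPower formula quoted above (illustrative only).
bcn_power <- function(x, lambda, gamma) {
  s <- 0.5 * (x + sqrt(x^2 + gamma^2))   # strictly positive whenever gamma > 0
  if (lambda == 0) log(s) else (s^lambda - 1) / lambda
}

# Hypothetical dispatcher: plain Box-Cox when every value is positive,
# bcnPower otherwise. The default gamma = 1 is an assumption.
bc_auto <- function(x, lambda, gamma = 1) {
  if (all(x > 0, na.rm = TRUE)) {
    if (lambda == 0) log(x) else (x^lambda - 1) / lambda
  } else {
    bcn_power(x, lambda, gamma)
  }
}

y <- c(-3, -1, 0, 2, 5)
out <- bc_auto(y, lambda = 0.5)
all(is.finite(out))   # zero and negative inputs are handled
all(diff(out) > 0)    # the ordering of y is preserved
```

When gamma is 0 and x is positive, 0.5 * (x + sqrt(x^2)) equals x, so the formula reduces to the ordinary Box-Cox transformation.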

"modified.tukey" computes out = car::basicPower(y, lambda = lambda.in.use, gamma = NULL); if (lambda.in.use < 0) out = -1 * out.

Here basicPower is: x^lambda if lambda is not 0; log(x) if lambda is 0.

Because neither Tukey nor modified Tukey can handle zero or negative input, this function automatically switches to bcnPower in that case.
As a result, both the "modified.tukey" and "boxcox" methods here preserve the ordering of the data.
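The effect of the sign flip for negative lambda can be seen in a small base-R sketch. The helper names basic_power and modified_tukey are hypothetical and only mirror the formulas quoted above.

```r
# Base-R sketch of basicPower and the modified Tukey sign flip (illustrative only).
basic_power <- function(x, lambda) {
  if (lambda == 0) log(x) else x^lambda
}

modified_tukey <- function(x, lambda) {
  out <- basic_power(x, lambda)
  if (lambda < 0) out <- -1 * out   # flip sign so larger x still maps to larger out
  out
}

y <- c(1, 2, 4, 8)
basic_power(y, -1)      # 1, 0.5, 0.25, 0.125: ordering is reversed
modified_tukey(y, -1)   # -1, -0.5, -0.25, -0.125: ordering is preserved
```

Without the flip, a negative lambda turns the largest observation into the smallest, which changes the sign of any downstream correlation or regression slope.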


precise: use the rounded lambda, one of c(0, 0.33, -0.33, 0.5, -0.5, 1, -1, 2, -2), or the raw (calculated) lambda.
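This page does not state how the rounded lambda is chosen. One plausible rule, assumed here because it matches the convention described for car's powerTransform, is to take the first convenient value that falls inside the confidence interval for lambda and to fall back to the raw estimate otherwise. The function round_lambda and its arguments are hypothetical.

```r
# Sketch of a confidence-interval-based rounding rule (an assumption, not
# confirmed by this page). Convenient values are checked in this order.
convenient <- c(1, 0, -1, 0.5, 0.33, -0.5, -0.33, 2, -2)

round_lambda <- function(lambda.raw, ci.lower, ci.upper) {
  inside <- convenient[convenient >= ci.lower & convenient <= ci.upper]
  if (length(inside) > 0) inside[1] else lambda.raw
}

round_lambda(0.41, ci.lower = 0.20, ci.upper = 0.62)   # 0.5
round_lambda(0.10, ci.lower = -0.20, ci.upper = 0.40)  # 0, i.e. a log transform
round_lambda(0.41, ci.lower = 0.38, ci.upper = 0.44)   # 0.41, nothing convenient fits
```

Rounded values such as 0 (log) or 0.5 (square root) are easier to report and interpret than a raw estimate such as 0.41.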


returns transformed y, or original y if no transformation occurs.


Box and Cox (1964) bcPower and modified Tukey basicPower can only deal with strictly positive responses. Also consider applying z standardization to Box-Cox-transformed data.
lambda is a tuning parameter that can be optimized so that the distribution of the transformed data is as close as possible to a normal distribution. Several approaches to optimizing lambda have been proposed.
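One common approach is to maximize the Box-Cox profile log-likelihood over lambda. The base-R sketch below does this for a single sample with no covariates; it is an illustration of the idea, not the estimator used by this function, and bc_loglik is a hypothetical helper.

```r
# Profile log-likelihood of the Box-Cox model for one positive sample
# (illustrative sketch; not the estimator used by ez.boxcox).
bc_loglik <- function(lambda, y) {
  n <- length(y)
  z <- if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
  sigma2 <- mean((z - mean(z))^2)                 # ML estimate of the variance
  -n / 2 * log(sigma2) + (lambda - 1) * sum(log(y))
}

set.seed(1)
y <- exp(rnorm(1000, mean = 0, sd = 1))           # log-normal data: true lambda is 0

fit <- optimize(bc_loglik, interval = c(-2, 2), y = y, maximum = TRUE)
fit$maximum                                       # estimated lambda, expected near 0
```

The term (lambda - 1) * sum(log(y)) is the Jacobian of the transformation; without it, likelihoods computed at different lambda values would not be comparable.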
The Box-Cox transformation does not guarantee normality, although the transformed data should be less skewed and have less extreme values than before transformation.
Some research (Zwiener et al., 2014, PLOS ONE) indicates that z standardization of covariates leads to better prediction performance regardless of the underlying transformation used (e.g., raw, log, Box-Cox).
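Z standardization after transformation can be done with base R's scale(). In the sketch below, log(y) merely stands in for the output of a Box-Cox transformation with lambda = 0.

```r
# z standardization of transformed data (log(y) stands in for Box-Cox output).
y <- c(1, 2, 4, 8, 16)
transformed <- log(y)

z <- as.numeric(scale(transformed))   # subtract the mean, divide by the sample sd
mean(z)                               # approximately 0
sd(z)                                 # 1
```

scale() returns a one-column matrix with attributes, so as.numeric() is used here to get back a plain vector.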
see also boxcox

jerryzhujian9/ezmisc documentation built on March 9, 2024, 12:44 a.m.