ez.boxcox: box-cox power transformation GDoc Note

ez.boxcox    R Documentation

Description

Applies a Box-Cox power transformation to a vector or to selected columns of a data frame.

Usage

ez.boxcox(
  y,
  col = NULL,
  na.rm = FALSE,
  plot = TRUE,
  print2scr = TRUE,
  force = TRUE,
  method = c("boxcox", "modified.tukey"),
  precise = c("rounded", "raw"),
  ...
)

Arguments

y

a data frame or a vector

col

passed to ez.selcol
if y is a data frame and col is specified, process only those cols (could be multiple cols).
if y is a data frame and col is unspecified (i.e., the NULL default), process all cols.
if y is not a data frame, col is ignored.

na.rm

remove NA from y before transforming; if FALSE, NA stays as is. applicable only if y is a vector.

plot

draw the boxcox plot. applicable only when a transformation is actually performed.

print2scr

print the transformation parameters to the screen

force

TRUE = transform regardless; FALSE = transform only if p.lambda (for the rounded lambda) is less than .05.

method

"boxcox" computes out = car::bcPower(y, lambda = lambda.in.use, jacobian.adjusted = FALSE, gamma = NULL) when all values are positive, and out = car::bcnPower(y, lambda = lambda.in.use, jacobian.adjusted = FALSE, gamma = gamma) when any value is non-positive (i.e., zero or negative).

The selection between bcPower and bcnPower is done automatically by this function.

Where bcPower is: ((x + gamma)^lambda - 1)/lambda if lambda is not 0; log(x + gamma) if lambda is 0. Here gamma = NULL means gamma = 0.
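The bcPower formula above can be sketched in base R as follows. This is only an illustration of the formula, not the car implementation; box_cox_sketch is a hypothetical name introduced here.

```r
# Base-R sketch of the Box-Cox power family described above.
# box_cox_sketch is a hypothetical helper, not part of car or of this package.
box_cox_sketch <- function(x, lambda, gamma = NULL) {
  if (is.null(gamma)) gamma <- 0                 # gamma = NULL is treated as 0
  shifted <- x + gamma
  stopifnot(all(shifted > 0, na.rm = TRUE))      # only defined for positive values
  if (lambda == 0) log(shifted) else (shifted^lambda - 1) / lambda
}

y <- c(1, 2, 4, 8)
box_cox_sketch(y, lambda = 0)     # identical to log(y)
box_cox_sketch(y, lambda = 0.5)   # (sqrt(y) - 1) / 0.5
```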

bcnPower is: ((0.5 * (x + sqrt(x^2 + gamma^2)))^lambda - 1)/lambda if lambda is not 0; log(0.5 * (x + sqrt(x^2 + gamma^2))) if lambda is 0. This is the bcnPower family of Hawkins and Weisberg (2017). It lets the transformed data be interpreted in much the same way as Box-Cox transformed data, and it is much less biased than setting the parameter gamma to a non-zero value in the Box-Cox family.
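A base-R sketch of the bcnPower formula, written only to show why it accepts zero and negative input; bcn_sketch is a hypothetical name and the gamma value below is an arbitrary choice for illustration.

```r
# Base-R sketch of the bcnPower formula described above.
# bcn_sketch is a hypothetical helper, not the car implementation.
bcn_sketch <- function(x, lambda, gamma) {
  s <- 0.5 * (x + sqrt(x^2 + gamma^2))   # strictly positive whenever gamma > 0
  if (lambda == 0) log(s) else (s^lambda - 1) / lambda
}

y <- c(-3, 0, 2, 10)                     # includes a negative value and a zero
out <- bcn_sketch(y, lambda = 0.5, gamma = 1)
all(is.finite(out))                      # every value is transformed
!is.unsorted(out)                        # ordering of y is preserved
```

With gamma = 0 and positive x, the inner term reduces to x, so the formula coincides with bcPower.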

"modified.tukey" computes out = car::basicPower(y, lambda = lambda.in.use, gamma = NULL); if lambda.in.use < 0, then out = -1 * out.

Where basicPower is: x^lambda if lambda not 0; log(x) if lambda 0

Because neither the tukey nor the modified tukey transformation can handle zero or negative input, this function automatically switches to bcnPower for such input.
Both the "modified.tukey" and "boxcox" methods here keep the ordering of the data.
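The sign flip for negative lambda is what keeps the ordering under "modified.tukey". A minimal base-R sketch of that idea, where modified_tukey_sketch is a hypothetical name and not the function's internal code:

```r
# Sketch of the modified Tukey transformation described above.
# modified_tukey_sketch is a hypothetical helper for illustration only.
modified_tukey_sketch <- function(x, lambda) {
  out <- if (lambda == 0) log(x) else x^lambda   # basicPower formula
  if (lambda < 0) out <- -1 * out                # flip sign so larger x stays larger
  out
}

y <- c(1, 2, 4)
plain <- y^(-1)                                  # plain power: ordering is reversed
flipped <- modified_tukey_sketch(y, lambda = -1) # sign-flipped: ordering is kept
is.unsorted(plain)
is.unsorted(flipped)
```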

precise

use the rounded lambda, one of c(0, 0.33, -0.33, 0.5, -0.5, 1, -1, 2, -2), or the raw (calculated) lambda

Value

returns transformed y, or original y if no transformation occurs.

Note

Box and Cox (1964) bcPower and modified tukey basicPower can only deal with strictly positive responses. Also consider applying z standardization to boxcox-transformed data.
lambda is a tuning parameter that can be optimized so that the distribution of the transformed data is as close as possible to a normal distribution. There are several proposals for optimizing lambda.
Box-Cox-transformed values are not guaranteed to be normal, although the data should be less skewed and should have fewer extreme values than before transformation.
Some research (Zwiener et al., 2014, PLOS ONE) pointed out that z standardization of covariates leads to better prediction performance independent of the underlying transformation used (e.g., raw, log, boxcox).
see also boxcox
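
As a rough illustration of the notes above, the following base-R sketch picks lambda by maximizing the Box-Cox profile log-likelihood (one of several possible proposals, and not necessarily the one this function uses), transforms the data, and then z-standardizes the result. The helper name boxcox_loglik and the simulated data are assumptions made for the example.

```r
# Profile log-likelihood of the Box-Cox transformation for positive y
# (up to an additive constant). boxcox_loglik is a hypothetical helper.
boxcox_loglik <- function(lambda, y) {
  n <- length(y)
  z <- if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
  -n / 2 * log(mean((z - mean(z))^2)) + (lambda - 1) * sum(log(y))
}

set.seed(1)
y <- exp(rnorm(200))                       # simulated right-skewed, positive data

# Search lambda over an assumed interval of [-2, 2]
opt <- optimize(boxcox_loglik, interval = c(-2, 2), y = y, maximum = TRUE)
lambda_hat <- opt$maximum

# Transform with the estimated lambda, then z-standardize
transformed <- if (abs(lambda_hat) < 1e-8) log(y) else (y^lambda_hat - 1) / lambda_hat
z <- as.numeric(scale(transformed))        # mean 0, standard deviation 1
```

Because the simulated data are log-normal, the estimated lambda should land near 0, which corresponds to the log transformation.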


jerryzhujian9/zmisc documentation built on March 9, 2024, 12:49 a.m.