transfo | R Documentation |
This function uses reweighted maximum likelihood to robustly fit the
Box-Cox or Yeo-Johnson transformation to each variable in a dataset.
Note that this function first calls checkDataSet
to ensure that the variables to be transformed are not too discrete.
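For orientation, the two transformation families named above can be written out in a few lines of base R. This is a minimal sketch of the textbook Box-Cox and Yeo-Johnson formulas, not the package's internal code; the function names boxcox and yeojohnson are hypothetical helpers introduced only for illustration.

```r
# Textbook Box-Cox transformation (illustrative helper, not part of cellWise).
# Defined only for strictly positive x.
boxcox <- function(x, lambda) {
  stopifnot(all(x > 0))
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

# Textbook Yeo-Johnson transformation (illustrative helper, not part of cellWise).
# Defined for any real x; nonnegative and negative values use different branches.
yeojohnson <- function(x, lambda) {
  y <- numeric(length(x))
  pos <- x >= 0
  y[pos] <- if (lambda == 0) {
    log1p(x[pos])
  } else {
    ((1 + x[pos])^lambda - 1) / lambda
  }
  y[!pos] <- if (lambda == 2) {
    -log1p(-x[!pos])
  } else {
    -((1 - x[!pos])^(2 - lambda) - 1) / (2 - lambda)
  }
  y
}

# With lambda = 1 the Yeo-Johnson transformation leaves the data unchanged.
yeojohnson(c(-1, 0, 1), lambda = 1)
```

In both families lambda = 1 corresponds to no change of shape (Box-Cox only shifts the data by 1), which is why an estimated parameter close to 1 suggests that the variable needs little transformation.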
transfo(X, type = "YJ", robust = TRUE,
standardize = TRUE,
quant = 0.99, nbsteps = 2, checkPars = list())
X |
A data matrix of dimensions n x d. Its columns are the variables to be transformed. |
type |
The type of transformation to be fit. Should be one of:
|
robust |
if |
standardize |
whether to standardize the variables before and after the power transformation. See Details below. |
quant |
quantile for determining the weights in the
reweighting step (ignored when |
nbsteps |
number of reweighting steps (ignored when
|
checkPars |
Optional list of parameters used in the call to
|
If standardize = TRUE, the variables are standardized before and after the transformation.
For BC, each variable is divided by its median before transformation.
For YJ with robust = TRUE, the median is subtracted and the result is divided by the mad (median absolute deviation) before transformation.
For YJ with robust = FALSE, the mean is subtracted and the result is divided by the standard deviation before transformation.
For the standardization after the transformation, the classical mean and standard deviation are used if robust = FALSE.
If robust = TRUE, the mean and standard deviation are calculated robustly on a subset of inliers.
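The pre-transformation standardization rules described above can be sketched in base R as follows. The helper name prestandardize is hypothetical and is not part of the package. Note also that stats::mad applies a consistency constant of 1.4826 by default; whether transfo scales the mad in the same way is an assumption here.

```r
# Illustrative sketch of the standardization applied BEFORE the transformation.
# 'prestandardize' is a hypothetical helper, not a cellWise function.
prestandardize <- function(x, type = c("YJ", "BC"), robust = TRUE) {
  type <- match.arg(type)
  if (type == "BC") {
    # BC: divide by the median (x must be strictly positive)
    x / median(x)
  } else if (robust) {
    # YJ, robust: subtract the median, divide by the mad
    (x - median(x)) / mad(x)
  } else {
    # YJ, classical: subtract the mean, divide by the standard deviation
    (x - mean(x)) / sd(x)
  }
}

set.seed(1)
x <- exp(rnorm(200))
z <- prestandardize(x, type = "YJ", robust = TRUE)
c(median = median(z), mad = mad(z))
```

The post-transformation standardization is not sketched here, because the text does not say how the subset of inliers is chosen in the robust case.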
A list with components:
lambdahats
The estimated transformation parameter for each column of X.
Y
A matrix in which each column is the transformed version of the
corresponding column of X.
The transformed version includes pre- and post-standardization if standardize = TRUE.
muhat
The estimated location of each column of Y.
sigmahat
The estimated scale of each column of Y.
weights
The final weights from the reweighting.
ttypes
The type of transform used in each column.
objective
Value of the (reweighted) maximum likelihood objective function.
values of checkDataSet, unless coreOnly is TRUE.
J. Raymaekers and P.J. Rousseeuw
J. Raymaekers and P.J. Rousseeuw (2021). Transforming variables to central normality. Machine Learning. doi:10.1007/s10994-021-05960-5 (open access)
transfo_newdata, transfo_transformback
# find Box-Cox transformation parameter for lognormal data:
set.seed(123)
x <- exp(rnorm(1000))
transfo.out <- transfo(x, type = "BC")
# estimated parameter:
transfo.out$lambdahats
# value of the objective function:
transfo.out$objective
# the transformed variable:
transfo.out$Y
# the type of transformation used:
transfo.out$ttypes
# qqplot of the transformed variable:
qqnorm(transfo.out$Y); abline(0,1)
# For more examples, we refer to the vignette:
## Not run:
vignette("transfo_examples")
## End(Not run)