find_transformation_parameters: Set transformation parameters

find_transformation_parameters    R Documentation

Set transformation parameters

Description

find_transformation_parameters is used to find optimal parameters for univariate transformation to normality.

Usage

find_transformation_parameters(
  x,
  method = "yeo_johnson",
  robust = TRUE,
  invariant = TRUE,
  lambda = c(-4, 6),
  empirical_gof_normality_p_value = NULL,
  ...
)

Arguments

x

A vector with numeric values.

method

One of the following methods for power transformation:

  • box_cox: Transformation using the Box-Cox transformation (Box and Cox, 1964). The standard Box-Cox transformation requires that all data are strictly positive, so features that contain zero or negative values cannot be transformed with it directly. Box and Cox also define a shifted variant, which is used here to shift values to a strictly positive range when negative values are present. The Box-Cox transformation relies on a single parameter lambda, which is estimated through maximisation of the log-likelihood function corresponding to a normal distribution.

  • yeo_johnson: Transformation using the Yeo-Johnson transformation (Yeo and Johnson, 2000). Unlike the Box-Cox transformation, the Yeo-Johnson transformation allows for negative and positive values. Like the Box-Cox transformation, this transformation relies on a single parameter lambda, which is estimated through maximisation of the log-likelihood function corresponding to a normal distribution.

  • none: A fall-back method that will not transform values.
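The two transformations and the likelihood-based estimate of lambda can be sketched in base R. This is only an illustrative sketch of the textbook definitions, not the package's implementation: the helper names box_cox, yeo_johnson and profile_log_likelihood are hypothetical, and the sketch ignores the shifted Box-Cox variant as well as the robust and invariant options.

```r
# Hypothetical helper: standard Box-Cox transformation (strictly positive x).
box_cox <- function(x, lambda) {
  stopifnot(all(x > 0))
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Hypothetical helper: Yeo-Johnson transformation (any real x).
yeo_johnson <- function(x, lambda) {
  y <- numeric(length(x))
  pos <- x >= 0
  # Non-negative values: Box-Cox applied to x + 1.
  y[pos] <- if (abs(lambda) < 1e-8) {
    log1p(x[pos])
  } else {
    ((x[pos] + 1)^lambda - 1) / lambda
  }
  # Negative values: mirrored transformation with exponent 2 - lambda.
  y[!pos] <- if (abs(lambda - 2) < 1e-8) {
    -log1p(-x[!pos])
  } else {
    -((1 - x[!pos])^(2 - lambda) - 1) / (2 - lambda)
  }
  y
}

# Hypothetical helper: normal log-likelihood of the transformed data,
# profiled over mean and variance (additive constants dropped).
profile_log_likelihood <- function(lambda, x, method = "yeo_johnson") {
  n <- length(x)
  if (method == "box_cox") {
    y <- box_cox(x, lambda)
    log_jacobian <- (lambda - 1) * sum(log(x))
  } else {
    y <- yeo_johnson(x, lambda)
    log_jacobian <- (lambda - 1) * sum(sign(x) * log1p(abs(x)))
  }
  sigma2 <- mean((y - mean(y))^2)
  -n / 2 * log(sigma2) + log_jacobian
}

# Log-normal data: the maximising lambda should lie close to 0,
# because the log transformation makes these data normal.
set.seed(1)
x <- exp(stats::rnorm(1000))
fit <- stats::optimise(
  profile_log_likelihood,
  interval = c(-4, 6),
  x = x,
  method = "box_cox",
  maximum = TRUE
)
lambda_hat <- fit$maximum
```

The search interval c(-4, 6) mirrors the default of the lambda argument shown under Usage.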

robust

Flag for using a robust version of Box-Cox or Yeo-Johnson transformation, as defined by Raymaekers and Rousseeuw (2021). This version is less sensitive to the presence of outliers.

invariant

Flag for using a version of Box-Cox or Yeo-Johnson transformation that simultaneously optimises location and scale in addition to the lambda parameter.

lambda

Single lambda value, or range of lambda values that should be considered. Default: c(-4.0, 6.0). Can be NULL to force optimisation without a constraint on lambda values.
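The three accepted forms of the lambda argument might be used as in the sketch below. It assumes the package is installed under the name power.transform; the list lambda_settings and its element names are purely illustrative, and the call is skipped when the package is unavailable.

```r
# Illustrative settings for the lambda argument (names are hypothetical).
lambda_settings <- list(
  default_range = c(-4, 6),  # search lambda within a range
  fixed_value = 1,           # use a single lambda value
  unconstrained = NULL       # optimise without a constraint on lambda
)

set.seed(1)
x <- exp(stats::rnorm(1000))

transformers <- list()
if (requireNamespace("power.transform", quietly = TRUE)) {
  for (name in names(lambda_settings)) {
    # [[ ]] returns NULL for the unconstrained setting, as intended.
    transformers[[name]] <- power.transform::find_transformation_parameters(
      x = x,
      method = "yeo_johnson",
      lambda = lambda_settings[[name]]
    )
  }
}
```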

empirical_gof_normality_p_value

Significance level for the empirical goodness-of-fit test for central normality. The p-value is computed by the assess_transformation function. When this parameter is set to a numeric value instead of NULL, the transformation is rejected if the p-value of the test is below the significance level.

...

Unused parameters.

Value

A transformer object that can be used to transform values.

References

  1. Yeo, I. & Johnson, R. A. A new family of power transformations to improve normality or symmetry. Biometrika 87, 954–959 (2000).

  2. Box, G. E. P. & Cox, D. R. An analysis of transformations. J. R. Stat. Soc. Series B Stat. Methodol. 26, 211–252 (1964).

  3. Raymaekers, J. & Rousseeuw, P. J. Transforming variables to central normality. Mach. Learn. (2021).

Examples

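# Draw 1000 log-normally distributed (strictly positive, right-skewed) values.
x <- exp(stats::rnorm(1000))

# Find Box-Cox transformation parameters for x.
transformer <- find_transformation_parameters(
  x = x,
  method = "box_cox"
)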

power.transform documentation built on April 12, 2025, 5:08 p.m.