umx_yj_wide_twin_data: Yeo-Johnson transform wide twin data (Non-destructive)

umx_yj_wide_twin_data    R Documentation

Yeo-Johnson transform wide twin data (Non-destructive)

Description

umx_yj_wide_twin_data applies the Yeo-Johnson transformation to wide twin data. For each variable it "stacks" the scores across twins (T1 and T2) and estimates a single lambda from the combined values by maximum likelihood. Because one lambda is used, the transformation is identical for both twins, so their scores stay on a common scale and the twin covariance structure is preserved.
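The stacking idea can be sketched in base R. This is a minimal illustration, not the code umx runs internally: the helper names yj_transform and yj_loglik, the simulated CAQ columns, the search interval c(-2, 4), and the names given to the new columns are all assumptions made for the example.

```r
# Yeo-Johnson transform of a numeric vector for a given lambda (hypothetical helper)
yj_transform <- function(y, lambda, eps = 1e-8) {
  out <- rep(NA_real_, length(y))
  pos <- !is.na(y) & y >= 0
  neg <- !is.na(y) & y < 0
  if (abs(lambda) > eps) {
    out[pos] <- ((y[pos] + 1)^lambda - 1) / lambda
  } else {
    out[pos] <- log1p(y[pos])                      # limit as lambda -> 0
  }
  if (abs(lambda - 2) > eps) {
    out[neg] <- -((1 - y[neg])^(2 - lambda) - 1) / (2 - lambda)
  } else {
    out[neg] <- -log1p(-y[neg])                    # limit as lambda -> 2
  }
  out
}

# Profile log-likelihood of lambda (up to a constant), assuming the
# transformed values are normally distributed
yj_loglik <- function(lambda, y) {
  y <- y[!is.na(y)]
  z <- yj_transform(y, lambda)
  n <- length(y)
  sigma2 <- mean((z - mean(z))^2)                  # ML variance estimate
  -n / 2 * log(sigma2) + (lambda - 1) * sum(sign(y) * log1p(abs(y)))
}

# Simulated, right-skewed wide twin data (illustrative only)
set.seed(1)
n <- 500
common <- rnorm(n)                                 # shared familial component
df <- data.frame(
  CAQ_T1 = exp(0.7 * common + 0.7 * rnorm(n)) - 1,
  CAQ_T2 = exp(0.7 * common + 0.7 * rnorm(n)) - 1
)

# "Stack" both twins into one vector and estimate a single lambda
stacked <- c(df$CAQ_T1, df$CAQ_T2)
fit <- optimize(yj_loglik, interval = c(-2, 4), y = stacked, maximum = TRUE)
lambda_hat <- fit$maximum

# Apply the same lambda to each twin; the new column names are an assumption
df$CAQ_T1_yj <- yj_transform(df$CAQ_T1, lambda_hat)
df$CAQ_T2_yj <- yj_transform(df$CAQ_T2, lambda_hat)
```

Estimating lambda separately for each twin would give two slightly different transformations, so equal raw scores could map to different transformed scores depending on twin order. Pooling the twins avoids that.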

Usage

umx_yj_wide_twin_data(
  data,
  varsToTransform,
  sep = "_T",
  twins = 1:2,
  suffix = "_yj",
  verbose = TRUE
)

Arguments

data

A wide dataframe (one row per twin pair, with a separate column for each twin's score)

varsToTransform

The base names of the variables to transform, without the twin suffix (e.g. "CAQ" for columns "CAQ_T1" and "CAQ_T2")

sep

The separator between the base name and the twin suffix (default "_T")

twins

The twin suffixes that follow sep in the column names (default 1:2)

suffix

The suffix for the new transformed columns (default "_yj")

verbose

Whether to print parameters and plot distributions (default TRUE)

Details

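The Yeo-Johnson transformation is a power transform that handles zero and negative values natively. It is often preferable to log(x+1) because lambda is estimated from the data by maximum likelihood: the chosen power is the one that makes the transformed scores as close to normal as possible, which typically reduces skewness, rather than a transformation fixed in advance.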
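For reference, the transformation and the profile log-likelihood that is maximised over lambda, as given by Yeo and Johnson (2000), can be written as follows. The symbols y, lambda, n, and the ML variance estimate of the transformed values are notation chosen here for illustration.

```latex
\psi(\lambda, y) =
\begin{cases}
  \dfrac{(y + 1)^{\lambda} - 1}{\lambda}            & y \ge 0,\ \lambda \ne 0 \\[1ex]
  \log(y + 1)                                       & y \ge 0,\ \lambda = 0 \\[1ex]
  -\dfrac{(1 - y)^{2 - \lambda} - 1}{2 - \lambda}   & y < 0,\ \lambda \ne 2 \\[1ex]
  -\log(1 - y)                                      & y < 0,\ \lambda = 2
\end{cases}

\ell(\lambda) = -\frac{n}{2} \log \hat{\sigma}^{2}(\lambda)
  + (\lambda - 1) \sum_{i=1}^{n} \operatorname{sgn}(y_i) \log\bigl(|y_i| + 1\bigr),
\qquad
\hat{\sigma}^{2}(\lambda) = \frac{1}{n} \sum_{i=1}^{n}
  \bigl(\psi(\lambda, y_i) - \bar{\psi}(\lambda)\bigr)^{2}
```

With lambda = 1 the transformation is the identity. With lambda = 0 it reduces to log(y + 1) for non-negative y, which is why log(x+1) can be seen as one fixed member of this family.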

When verbose = TRUE, the function reports the lambda value and provides a diagnostic plot comparing the raw and transformed distributions.
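A comparable diagnostic can be produced by hand. The sketch below is an assumption about what such a check might look like, not the plot umx draws: the compact helper yj, the simulated scores, and the fixed lambda = 0.2 (standing in for a reported estimate) are all hypothetical.

```r
# Compact Yeo-Johnson helper; assumes lambda is neither 0 nor 2
yj <- function(y, lambda) {
  stopifnot(lambda != 0, lambda != 2)
  pos_part <- ((pmax(y, 0) + 1)^lambda - 1) / lambda
  neg_part <- -((1 - pmin(y, 0))^(2 - lambda) - 1) / (2 - lambda)
  ifelse(y >= 0, pos_part, neg_part)
}

# Sample skewness (third standardised moment)
skewness <- function(x) mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5

set.seed(42)
raw <- rgamma(1000, shape = 1.5) - 0.5   # right-skewed, includes negative values
lambda <- 0.2                            # hypothetical value, for illustration
transformed <- yj(raw, lambda)

# Numerical summary: skewness before and after
cat(sprintf("lambda = %.2f | skew raw = %.2f | skew transformed = %.2f\n",
            lambda, skewness(raw), skewness(transformed)))

# Side-by-side histograms of raw and transformed scores
old_par <- par(mfrow = c(1, 2))
hist(raw, main = "Raw", xlab = "score")
hist(transformed, main = "Yeo-Johnson", xlab = "transformed score")
par(old_par)
```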

Value

  • The input dataframe with the transformed variables added as new columns (named using suffix); the original columns are left unchanged

References

  • Yeo, I. K., & Johnson, R. A. (2000). A new family of power transformations to improve normality or symmetry. Biometrika, 87(4), 954-959.

  • Cragg, J. G. (1971). Some Statistical Models for Limited Dependent Variables with Application to the Demand for Durable Goods. Econometrica, 39(5), 829-844.

See Also

Other Twin Data functions: umx, umx_long2wide(), umx_make_TwinData(), umx_make_twin_data_nice(), umx_residualize(), umx_scale_wide_twin_data(), umx_wide2longTwinData()

Examples

# df = umx_yj_wide_twin_data(data = df, varsToTransform = c("CAQ"), sep = "_T")

umx documentation built on May 18, 2026, 5:07 p.m.