data_transformation: Transforms Dependent Variables
In emdi: Estimating and Mapping Disaggregated Indicators

data_transformation

R Documentation

Transforms Dependent Variables

Description

The function data_transformation transforms the dependent variable named in the formula object fixed within the given sample data set, and returns the sample data set with the transformed dependent variable. Five options are available: no transformation, natural log, Box-Cox, Dual and Log-Shift.

Usage

data_transformation(fixed, smp_data, transformation, lambda)

Arguments

`fixed`	a two-sided linear formula object describing the fixed-effects part of the nested error linear regression model with the dependent variable on the left of a ~ operator and the explanatory variables on the right, separated by + operators. The argument corresponds to the argument `fixed` in function `lme`.
`smp_data`	a data frame that needs to comprise all variables named in `fixed`. If transformed data is further used to fit a nested error linear regression model, `smp_data` also needs to comprise the variable named in `smp_domains` (see `ebp`).
`transformation`	a character string. Five transformation methods for the dependent variable are available: (i) no transformation ("no"); (ii) natural log transformation ("log"); (iii) Box-Cox transformation ("box.cox"); (iv) Dual transformation ("dual"); (v) Log-Shift transformation ("log.shift").
`lambda`	a scalar transformation parameter, used by the transformations that have one. For no transformation and the natural log transformation, `lambda` can be set to NULL.

Details

For the natural log, Box-Cox and Dual transformation, the dependent variable is shifted such that all values are greater than zero since the transformations are not applicable for values equal to or smaller than zero. The shift is calculated as follows:

shift = |min(y)| + 1,   if min(y) <= 0
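
The shift rule can be sketched in a few lines of base R. This is a minimal illustration, not emdi's internal code: `compute_shift` is a hypothetical helper, and returning a shift of 0 when all values are already positive is an assumption, since the rule above only covers the case min(y) <= 0.

```r
# Hypothetical helper (not part of emdi): shift rule described above.
compute_shift <- function(y) {
  min_y <- min(y)
  if (min_y <= 0) {
    abs(min_y) + 1   # shift = |min(y)| + 1
  } else {
    0                # assumption: no shift needed when all values are > 0
  }
}

y <- c(-3, 0, 2.5, 10)
shift <- compute_shift(y)   # |-3| + 1
y_shifted <- y + shift      # all values are now strictly positive
```

After the shift, the smallest value of the dependent variable is exactly 1, so the logarithm and power transformations are defined for every observation.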

The function data_transformation works as a wrapper: it selects and calls the transformation function that matches the transformation argument, for example no_transform, log_transform or box_cox.
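
The wrapper idea can be illustrated with a small dispatch function. The sketch below is an assumption-laden stand-in, not the package's implementation: all `sketch_*` names are hypothetical, the Box-Cox branch uses the textbook formula (y^lambda - 1) / lambda, the shift step is left out, and only three of the five transformations are covered.

```r
# Illustrative stand-ins only; emdi's internal functions may differ.
sketch_no_transform <- function(y) y
sketch_log <- function(y) log(y)
sketch_box_cox <- function(y, lambda) {
  # Textbook Box-Cox; falls back to log(y) when lambda is (numerically) zero
  if (abs(lambda) < 1e-12) log(y) else (y^lambda - 1) / lambda
}

# Wrapper: picks the transformation from a character string
sketch_transform <- function(y, transformation, lambda = NULL) {
  switch(transformation,
    "no" = sketch_no_transform(y),
    "log" = sketch_log(y),
    "box.cox" = sketch_box_cox(y, lambda),
    stop("transformation not covered by this sketch: ", transformation)
  )
}

y <- c(1, 4, 9)
bc <- sketch_transform(y, "box.cox", lambda = 0.5)
```

Keeping the selection logic in one place means callers only pass a string and, where needed, `lambda`, while each transformation stays a small separate function.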

Value

A named list with two elements: transformed_data, a data frame containing the data set with the transformed dependent variable, and shift, the shift parameter if one was applied. With no transformation, the original data frame is returned and shift is NULL.
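
A short sketch of how the returned list might be inspected, assuming the emdi package is installed and that the element names are transformed_data and shift as described above. The shortened formula is chosen only for brevity; the code is skipped if the package is not available.

```r
# Runs only if emdi is installed
if (requireNamespace("emdi", quietly = TRUE)) {
  data("eusilcA_smp", package = "emdi")

  # Box-Cox transformation with lambda = 0.7
  res <- emdi::data_transformation(eqIncome ~ gender + eqsize,
                                   eusilcA_smp, "box.cox", 0.7)

  names(res)                            # elements of the returned list
  head(res$transformed_data$eqIncome)   # transformed dependent variable
  res$shift                             # shift parameter, if one was applied
}
```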

Examples

# Load the package and the sample data
library("emdi")
data("eusilcA_smp")

# Transform dependent variable in sample data with Box-Cox transformation
transform_data <- data_transformation(eqIncome ~ gender + eqsize + cash +
  self_empl + unempl_ben + age_ben + surv_ben + sick_ben + dis_ben + rent +
  fam_allow + house_allow + cap_inv + tax_adj, eusilcA_smp, "box.cox", 0.7)

emdi documentation built on June 22, 2024, 9:46 a.m.