suggest_transformation: Plot Skewness and Kurtosis and suggest transformations

View source: R/code.R

Description

suggest_transformation evaluates the normality of the data by calculating the skew and kurtosis of each variable. It advises whether a data transformation is warranted and shows how well each of three transformation options (Box-Cox, Yeo-Johnson, or PCA) reduces non-normality relative to the others and to the original, untransformed data set.
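
The skew and kurtosis check described above can be sketched in base R. This is an illustration only, not the package's source: the moment-based estimators, the use of excess kurtosis (0 for a normal distribution), and the helper name shape_summary are all assumptions. The cutoff of 2 follows the guidance printed by the function.

```r
# Moment-based sample skewness (assumed estimator; the package may use another).
skewness <- function(v) {
  v <- v[!is.na(v)]
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

# Excess kurtosis, which is 0 for a normal distribution (assumed convention).
excess_kurtosis <- function(v) {
  v <- v[!is.na(v)]
  d <- v - mean(v)
  mean(d^4) / mean(d^2)^2 - 3
}

# Hypothetical helper: one row per numeric column, flagged when either
# statistic falls outside [-threshold, threshold].
shape_summary <- function(df, threshold = 2) {
  num <- df[vapply(df, is.numeric, logical(1))]
  out <- data.frame(
    variable = names(num),
    skew = unname(vapply(num, skewness, numeric(1))),
    kurtosis = unname(vapply(num, excess_kurtosis, numeric(1)))
  )
  out$consider_transform <-
    abs(out$skew) > threshold | abs(out$kurtosis) > threshold
  out
}

# Simulated data: one roughly normal column and one heavily right-skewed column.
set.seed(1)
demo <- data.frame(symmetric = rnorm(500), right_skewed = rexp(500)^2)
print(shape_summary(demo))
```

In this sketch a column is flagged when either statistic exceeds the cutoff in absolute value, mirroring the "> 2 or < -2" rule of thumb.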

Usage

suggest_transformation(x)

Arguments

x

A dataframe of input variables.

Value

Four plots depicting the skew and kurtosis of each variable: (1) the original, untransformed data, (2) the data transformed with Box-Cox (tagged "tf1"), (3) the data transformed with Yeo-Johnson (tagged "tf2"), and (4) the data transformed to principal components (tagged "tf3"). In addition, the three transformed data sets are generated and can be passed to other functions using these tags.
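
For readers unfamiliar with the three transformations, the following base R sketch shows what each one computes. It is not the package's implementation: box_cox, yeo_johnson, and the tf1_like, tf2_like, and tf3_like objects are hypothetical names, and the lambda values are fixed by hand here, whereas in practice lambda is usually estimated from the data.

```r
# Box-Cox power transform; defined only for strictly positive values.
box_cox <- function(v, lambda) {
  stopifnot(all(v > 0))
  if (lambda == 0) log(v) else (v^lambda - 1) / lambda
}

# Yeo-Johnson transform; also accepts zero and negative values.
yeo_johnson <- function(v, lambda) {
  out <- numeric(length(v))
  pos <- v >= 0
  if (lambda == 0) {
    out[pos] <- log1p(v[pos])
  } else {
    out[pos] <- ((v[pos] + 1)^lambda - 1) / lambda
  }
  if (lambda == 2) {
    out[!pos] <- -log1p(-v[!pos])
  } else {
    out[!pos] <- -((1 - v[!pos])^(2 - lambda) - 1) / (2 - lambda)
  }
  out
}

# Simulated, strictly positive, right-skewed data.
set.seed(42)
demo <- data.frame(a = rlnorm(200), b = rexp(200))

# Box-Cox with lambda = 0 is the natural log.
tf1_like <- as.data.frame(lapply(demo, box_cox, lambda = 0))

# Yeo-Johnson still works after shifting the data so it contains negatives.
shifted <- demo - 1
tf2_like <- as.data.frame(lapply(shifted, yeo_johnson, lambda = 0.5))

# PCA: centre and scale, then project onto the principal components.
tf3_like <- as.data.frame(prcomp(demo, center = TRUE, scale. = TRUE)$x)

str(tf1_like)
str(tf2_like)
str(tf3_like)
```

Box-Cox rejects non-positive input, which is why Yeo-Johnson is the usual fallback for variables that contain zeros or negative values.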

Examples

## Not run: 
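library(mlbench)
data(PimaIndiansDiabetes)

# Draw a random 500-row training sample; the remaining rows form the test set
index <- sample(seq_len(nrow(PimaIndiansDiabetes)), 500)
trainingSet <- PimaIndiansDiabetes[index, ]
testSet <- PimaIndiansDiabetes[-index, ]

# Column 9 is the outcome (diabetes); columns 1-8 are the predictors
x <- trainingSet[, -9]
y <- trainingSet[, 9]
x_test <- testSet[, -9]
y_test <- testSet[, 9]

# Plot skew and kurtosis for the predictors and print transformation guidance
suggest_transformation(x)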

# vignette("modeval") #check a vignette for further details

## End(Not run)

Example output

 Consider transforming data if skew or kurtosis of any variable is > 2 or < -2 

 BoxCox      : The distribution of an attribute can be shifted to reduce the skew and make it more Gaussian. 

 Yeo-Johnson : Like the Box-Cox transform, but it supports raw values that are equal to zero and negative. 

 PCA         : Transform the data to the principal components. 

modeval documentation built on May 29, 2017, 10:54 a.m.