View source: R/optLogTransform.R
This function finds the optimal normalizing transformation for each variable and returns the transformed data. The output is a list containing five objects:
a string vector listing the names of each variable,
a string vector listing the function used to transform each variable,
a numeric vector giving the skew of each transformed variable,
a numeric vector giving the optimal transform value for each variable,
and a matrix of the transformed data.
optLogTransform(mydata, type = "log", skew_thresh = 1,
                n_trans_val = 50, scaled = TRUE, retain_domain = FALSE,
                hist_raw_folder = NA, hist_trans_folder = NA, skew_folder = NA)
mydata
The dataset to transform, given as a vector or a matrix. If given a matrix, the function transforms each column separately. Naming the columns is recommended, particularly if you are exporting plots.
type
The type of transformation: "log" for logarithmic or "power" for power.
skew_thresh
The skew threshold for transformation. If the skew of a variable is less than skew_thresh, the variable is considered normal and is not transformed.
n_trans_val
The number of grid points, each representing a different transformation strength, to test when searching for the most normal distribution. Higher values give a finer search and can improve the normalization, but they can also significantly increase computation time.
scaled
If set to TRUE, the transformed data will have zero mean and unit variance.
retain_domain
Set to TRUE if you would like the transformed data to have the same domain as the original dataset (not recommended).
hist_raw_folder
The name of the folder where you would like to save a histogram showing the distribution of the raw data. If you do not wish to save these plots, set to NA.
hist_trans_folder
The name of the folder where you would like to save a histogram showing the distribution of the transformed data. If you do not wish to save these plots, set to NA.
skew_folder
The name of the folder where you would like to save a plot showing the optimal skew with respect to the transformation variable. If you do not wish to save these plots, set to NA.
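The skew_thresh, n_trans_val, and scaled arguments describe a thresholded grid search followed by optional standardization. The block below is a minimal base R sketch of one way such a search could work for a power transform. It is not the package's source: the helpers sample_skew and grid_power_transform, the exponent range, and the shift to positive values are all assumptions made for illustration.

```r
# Sample skewness (third standardized moment).
# Hypothetical helper, not a function exported by optLog.
sample_skew <- function(x) {
  m <- mean(x)
  s <- sqrt(mean((x - m)^2))
  mean((x - m)^3) / s^3
}

# Hypothetical grid search: test n_trans_val exponents and keep the one
# whose transformed values have the smallest absolute skew.
grid_power_transform <- function(x, skew_thresh = 1, n_trans_val = 50,
                                 scaled = TRUE) {
  best_p <- 1  # an exponent of 1 leaves the data unchanged
  out <- x
  # Only transform when the raw skew reaches the threshold.
  if (abs(sample_skew(x)) >= skew_thresh) {
    # Assumption: shift so every value is >= 1 before taking powers.
    shifted <- x - min(x) + 1
    # Assumption: the exponent range 0.05 to 3 is arbitrary.
    exponents <- seq(0.05, 3, length.out = n_trans_val)
    skews <- vapply(exponents,
                    function(p) abs(sample_skew(shifted^p)),
                    numeric(1))
    best_p <- exponents[which.min(skews)]
    out <- shifted^best_p
  }
  # Optionally standardize to zero mean and unit variance.
  if (scaled) out <- as.numeric(scale(out))
  list(data = out, exponent = best_p, skew = sample_skew(out))
}

set.seed(1)
x <- rexp(500)  # right-skewed sample
res <- grid_power_transform(x)
```

A larger n_trans_val makes the grid of exponents finer at the cost of one skew evaluation per grid point, which is the trade-off described for that argument.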
library(optLog)
# First generate a random normal dataset (seeded so the example is reproducible).
set.seed(42)
mydata <- rnorm(100, mean = 0, sd = 1)
hist(mydata)
# Add skew to the dataset: column 1 is left-skewed, column 2 is right-skewed,
# and column 3 is the original normal data.
mydata_skew <- cbind(1 - (1 - mydata)^2, (1 - mydata)^2, mydata)
colnames(mydata_skew) <- c("Variable 1", "Variable 2", "Variable 3")
for (i in seq_len(ncol(mydata_skew))) {
  hist(mydata_skew[, i])
}
# Use optLogTransform to remove the skew.
mydata_transformed <- optLogTransform(mydata_skew, type = "power", scaled = FALSE)
for (i in seq_len(ncol(mydata_transformed$data))) {
  hist(mydata_transformed$data[, i])
}