ScrubR: ScrubR

View source: R/functions.R

Description

This function 'scrubs' the data: it checks for and fixes missing data, applies transformations and scaling, and removes or imputes outliers.

Usage

ScrubR(DT, predictors, rowWiseMissingPercentage, SDs, transformTable,
  scalingMethod, imputationMethod)

Arguments

DT

This is a data.table object

predictors

These are the clean predictors provided manually or from the output of ScanR

rowWiseMissingPercentage

This is the allowed threshold for the percentage of missing data per row

SDs

This is the allowed threshold, in standard deviations away from the mean

transformTable

This is the table received from ScanR that specifies which features need to be transformed and how

scalingMethod

This is the method used to scale the data (for example, 'robustZscore')

imputationMethod

This is the method used to treat missing data (for example, 'CWD')
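
The two threshold arguments, rowWiseMissingPercentage and SDs, may be easier to picture with a small example. The sketch below uses base R only and is not taken from the ScrubR source: the toy data, the helper is_outlier, and the exact comparison rules are assumptions made for illustration, and ScrubR may apply its thresholds differently.

```r
# Hypothetical toy data; the column names are illustrative only.
df <- data.frame(
  a = c(1, 2, NA, 4, 5),
  b = c(NA, 2, NA, 4, 5),
  c = c(1, 2, NA, 4, 5)
)

# Percentage of missing values in each row.
row_missing_pct <- rowMeans(is.na(df)) * 100

# Keep the rows whose missing percentage does not exceed the threshold
# (assumed rule: a row is dropped only when it is above the threshold).
row_threshold <- 50
kept <- df[row_missing_pct <= row_threshold, , drop = FALSE]

# Hypothetical helper: flag values lying more than `sds` standard
# deviations away from the mean of the vector.
is_outlier <- function(x, sds) {
  abs(x - mean(x, na.rm = TRUE)) > sds * sd(x, na.rm = TRUE)
}

x <- c(10, 11, 9, 10, 12, 8, 10, 11, 9, 100)
outlier_positions <- which(is_outlier(x, sds = 2))
```

With these toy values only the fully missing third row is dropped, and only the last element of x is flagged. Setting the row threshold to 100, as in the example further down, would keep every row under this assumed rule.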

Value

A data.table containing the preprocessed data

Examples

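# Load the example data and run the package's loading and scanning steps
DT <- data.table::data.table(plyr::baseball)
DT <- LoadR(DT)
rs <- ScanR(DT, correlationCutoff = 0.99, SamplingPercentage = 15,
            percentUniqueCutoff = 5)

# Rename the columns of DT to the predictor names reported by the scan
if (length(names(DT)) == length(unique(names(DT)))) {
  data.table::setnames(DT, names(DT), rs$Predictors)
} else {
  names(DT) <- rs$Predictors
}

# Keep the predictors whose status is 'Healthy' or 'Uniform'
predictors <- rs[Status == 'Healthy' | Status == 'Uniform', Predictors]

# Select the features whose absolute skewness exceeds the boundary
transformTable <- rs[abs(Skewness) > abs(selectedTransformationBoundary),
                     c('Predictors', 'Skewness'), with = FALSE]

# Scrub the data using the scan results
results <- ScrubR(DT, predictors, rowWiseMissingPercentage = 100, SDs = 8,
                  transformTable = transformTable,
                  scalingMethod = 'robustZscore', imputationMethod = 'CWD')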
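
The example passes scalingMethod='robustZscore' without saying what that scaling computes. A common definition of a robust z-score centres on the median and divides by the median absolute deviation (MAD). The base R sketch below illustrates that definition; the helper robust_z is hypothetical, and whether ScrubR uses exactly this formula is an assumption.

```r
# Hypothetical helper: robust z-score using the median and the MAD.
robust_z <- function(x) {
  med <- median(x, na.rm = TRUE)
  # stats::mad() multiplies by 1.4826 by default, which makes it
  # comparable to the standard deviation for normally distributed data.
  (x - med) / mad(x, center = med, na.rm = TRUE)
}

x <- c(1, 2, 3, 4, 100)
z <- robust_z(x)
```

Because the median and the MAD are barely affected by the extreme value 100, the first four scores stay small while the last one stands out clearly.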

womta/PurifyR documentation built on May 21, 2019, 11:11 a.m.