Description Usage Arguments Value Examples
This function 'scrubs' the data: it checks and fixes missing data, applies transformations and scaling, and performs outlier removal or imputation.
ScrubR(DT, predictors, rowWiseMissingPercentage, SDs, transformTable,
       scalingMethod, imputationMethod)
DT | A data.table object.
predictors | The clean predictors, provided manually or taken from the output of ScanR.
rowWiseMissingPercentage | The allowed threshold for the percentage of missing data per row.
SDs | The allowed threshold, in standard deviations away from the mean.
transformTable | The table returned by ScanR that indicates which features need to be transformed and how.
scalingMethod | The method used to scale the data.
imputationMethod | The method used to treat missing data.
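The thresholds described in the arguments above can be illustrated with a minimal base R sketch. This is not ScrubR's implementation: the helper names (`row_missing_pct`, `drop_sparse_rows`, `is_outlier`, `robust_z`) are hypothetical, and the median/MAD definition of a robust z-score is an assumption about what a method such as 'robustZscore' typically means.

```r
# Hypothetical helpers for illustration only; NOT the package's code.

# Percentage of missing values in each row of a data.frame.
row_missing_pct <- function(df) {
  100 * rowMeans(is.na(df))
}

# Keep rows whose missing percentage does not exceed the threshold.
drop_sparse_rows <- function(df, max_pct) {
  df[row_missing_pct(df) <= max_pct, , drop = FALSE]
}

# Flag values lying more than `sds` standard deviations from the mean.
is_outlier <- function(x, sds) {
  abs(x - mean(x, na.rm = TRUE)) > sds * sd(x, na.rm = TRUE)
}

# Robust z-score (assumed definition): centre on the median and
# scale by the median absolute deviation instead of mean and SD.
robust_z <- function(x) {
  (x - median(x, na.rm = TRUE)) / mad(x, na.rm = TRUE)
}

toy <- data.frame(a = c(1, 2, NA, 4), b = c(NA, 5, NA, 7))
kept <- drop_sparse_rows(toy, max_pct = 50)  # drops the all-NA row
```

Under this reading, a rowWiseMissingPercentage of 100 would keep every row, and a large SDs value such as 8 would flag only extreme values.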
A data.table containing all the preprocessed data.
# Load the example data and scan it (function names follow the usage and argument text)
DT = data.table::data.table(plyr::baseball)
DT = LoadR(DT)
rs = ScanR(DT, correlationCutoff = 0.99, SamplingPercentage = 15, percentUniqueCutoff = 5)

# Rename the columns to the predictor names reported by the scan
if (length(names(DT)) == length(unique(names(DT)))) {
  data.table::setnames(DT, names(DT), rs$Predictors)
} else {
  names(DT) = rs$Predictors
}

# Keep the predictors flagged as usable
predictors = rs[Status == 'Healthy' | Status == 'Uniform', Predictors]

# selectedTransformationBoundary is not defined in this example; it must
# exist as a column of rs or as a variable in the calling scope
transformTable = rs[abs(Skewness) > abs(selectedTransformationBoundary),
                    c('Predictors', 'Skewness'), with = FALSE]

results = ScrubR(DT, predictors, rowWiseMissingPercentage = 100, SDs = 8,
                 transformTable = transformTable, scalingMethod = 'robustZscore',
                 imputationMethod = 'CWD')