cleanse.split_df | R Documentation |
Diagnoses the similarity between the train set and the test set contained in a "split_df" object, and cleanses the "split_df" object.
## S3 method for class 'split_df'
cleanse(.data, add_character = FALSE, uniq_thres = 0.9, missing = FALSE, ...)
.data |
an object of class "split_df", usually the result of a call to split_by(). |
add_character |
logical. Whether to include character (text) variables when comparing categorical data. The default is FALSE, which excludes character variables. |
uniq_thres |
numeric. Threshold for removing variables: a variable is removed when its ratio of unique values (number of unique values / number of observations) is greater than this value. The default is 0.9. |
missing |
logical. Whether to remove variables that contain missing values. The default is FALSE. |
... |
further arguments passed to or from other methods. |
Removes the variables flagged by the diagnosis performed with the compare_diag() function.
An object of class "split_df".
library(alookr)
library(dplyr)

# Credit Card Default Data
head(ISLR::Default)

# Generate data for the example
sb <- ISLR::Default %>%
  split_by(default)

# Cleanse the split_df object using the default settings
sb %>%
  cleanse()
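The ratio that uniq_thres is compared against (number of unique values / number of observations) can be illustrated with a minimal base R sketch. The helper unique_ratio() and the toy data frame are hypothetical and are not part of the package. The sketch only shows the arithmetic; how cleanse() treats missing values or which variable types it applies the threshold to is not specified here.

```r
# Hypothetical helper (not part of the package): for each column,
# compute number of unique values / number of observations.
unique_ratio <- function(df) {
  vapply(df, function(x) length(unique(x)) / length(x), numeric(1))
}

# Toy data (assumption for illustration): `id` is unique per row,
# `grp` takes only two distinct values.
toy <- data.frame(
  id  = as.character(1:10),
  grp = rep(c("a", "b"), each = 5),
  stringsAsFactors = FALSE
)

ratios <- unique_ratio(toy)

# Columns whose ratio exceeds the threshold would be candidates for removal.
# Here `id` has ratio 1.0 and `grp` has ratio 0.2.
flagged <- names(ratios)[ratios > 0.9]
flagged
```

With a threshold of 0.9, only `id` is flagged in this toy data, since an identifier-like column has nearly as many unique values as observations.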