treatment_corr | R Documentation |
treatment_corr() diagnoses pairs of highly correlated variables or removes one of them.
treatment_corr(.data, corr_thres = 0.8, treat = TRUE, verbose = TRUE)
.data |
a data.frame or a |
corr_thres |
numeric. Threshold for detecting variables: a pair is flagged when its correlation is greater than the threshold. |
treat |
logical. Set whether to remove variables.
verbose |
logical. Set whether to echo information to the console at runtime. |
The Pearson correlation coefficient is computed for continuous variables and the Spearman correlation coefficient for categorical variables.
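The split between Pearson and Spearman can be illustrated in base R. This is only a sketch under stated assumptions, not the function's actual implementation: the helper high_corr_pairs() is hypothetical, the comparison uses the absolute value of the correlation, and factors are converted to their integer level codes before the Spearman coefficient is computed. None of these details are confirmed by this page.

```r
# Hypothetical helper (not part of the package): list column pairs whose
# absolute correlation exceeds `thres`. Using the absolute value is an assumption.
high_corr_pairs <- function(df, method = "pearson", thres = 0.8) {
  m <- abs(cor(df, method = method))
  # Look only at the upper triangle so each pair is reported once.
  idx <- which(upper.tri(m) & m > thres, arr.ind = TRUE)
  data.frame(
    var1 = rownames(m)[idx[, "row"]],
    var2 = colnames(m)[idx[, "col"]],
    corr = m[idx],
    stringsAsFactors = FALSE
  )
}

# Continuous variables: Pearson correlation.
num <- data.frame(a = 1:50, b = 2 * (1:50) + 1, c = rep(c(1, -1), times = 25))
num_pairs <- high_corr_pairs(num, method = "pearson")
print(num_pairs)

# Categorical variables: Spearman correlation on integer level codes
# (the conversion to codes is an assumption made for this sketch).
fac <- data.frame(
  f1 = factor(rep(letters[1:5], times = 10)),
  f2 = factor(rep(LETTERS[1:5], times = 10)),
  f3 = factor(rep(c("p", "q"), each = 25))
)
codes <- data.frame(lapply(fac, as.integer))
fac_pairs <- high_corr_pairs(codes, method = "spearman")
print(fac_pairs)
```

In this toy data only the pairs (a, b) and (f1, f2) exceed the default threshold of 0.8.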
An object of class data.frame or train_df. The return value has the same type as the .data argument, but variables may be excluded because of their correlation with other variables.
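What "same type, fewer variables" means can be sketched in base R. The helper drop_correlated() below is hypothetical and handles numeric columns only. Which member of a correlated pair gets removed is not stated on this page, so dropping the later column is an assumption made purely for illustration.

```r
# Hypothetical helper (not the package's code): for every numeric pair whose
# absolute Pearson correlation exceeds `thres`, drop the later column.
drop_correlated <- function(df, thres = 0.8) {
  num_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  if (length(num_cols) < 2) return(df)
  m <- abs(cor(df[num_cols], method = "pearson"))
  idx <- which(upper.tri(m) & m > thres, arr.ind = TRUE)
  to_drop <- unique(colnames(m)[idx[, "col"]])
  # Subset columns only, so the class of the input is preserved.
  df[, setdiff(names(df), to_drop), drop = FALSE]
}

df <- data.frame(
  a = 1:50,
  b = 2 * (1:50) + 1,               # perfectly correlated with a
  g = factor(rep(c("x", "y"), times = 25))
)
out <- drop_correlated(df)

names(out)               # columns that survived
class(out)               # same class as the input
setdiff(names(df), names(out))  # columns that were excluded
```

Comparing names() before and after the call, as in the last line, is a simple way to see which variables were excluded.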
# numerical variable
x1 <- 1:100
set.seed(12L)
x2 <- sample(1:3, size = 100, replace = TRUE) * x1 + rnorm(1)
set.seed(1234L)
x3 <- sample(1:2, size = 100, replace = TRUE) * x1 + rnorm(1)

# categorical variable
x4 <- factor(rep(letters[1:20], times = 5))
set.seed(100L)
x5 <- factor(rep(letters[1:20 + sample(1:6, size = 20, replace = TRUE)], times = 5))
set.seed(200L)
x6 <- factor(rep(letters[1:20 + sample(1:3, size = 20, replace = TRUE)], times = 5))
set.seed(300L)
x7 <- factor(sample(letters[1:5], size = 100, replace = TRUE))

exam <- data.frame(x1, x2, x3, x4, x5, x6, x7)
str(exam)
head(exam)

# default case
treatment_corr(exam)

# not removing variables
treatment_corr(exam, treat = FALSE)

# set a threshold to detect variables whose correlation is greater than 0.9
treatment_corr(exam, corr_thres = 0.9, treat = FALSE)

# not verbose mode
treatment_corr(exam, verbose = FALSE)