Description Usage Arguments Value Author(s) Examples
The function returns a list of variables that can be dropped because of high correlation with another variable, based on Cramer's V and information value (IV). If two variables V1 and V2 have a Cramer's V above a user-defined threshold, the function recommends dropping the one with the lower IV. A variable that has already been dropped is not used to drop any other variable.
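The rule described above can be illustrated with a small base-R sketch. This is not the package's implementation of cv_filter: the helper name sketch_cv_filter and the toy tables are hypothetical, and the processing order (highest Cramer's V first), the tie-breaking rule, and the choice to build the retained list only from variables present in the CV table are assumptions made for illustration.

```r
# Hypothetical helper, NOT the package's cv_filter(): a minimal sketch of the
# "drop the lower-IV variable of each highly associated pair" rule.
sketch_cv_filter <- function(cv_tab, iv_tab, threshold) {
  # Named lookup: variable name -> IV
  iv <- setNames(iv_tab$iv, as.character(iv_tab$Variable_name))

  # Assumption: examine the most strongly associated pairs first
  cv_tab <- cv_tab[order(-cv_tab$cv_value), ]

  dropped <- character(0)
  for (i in seq_len(nrow(cv_tab))) {
    v1 <- as.character(cv_tab$var_1[i])
    v2 <- as.character(cv_tab$var_2[i])

    # Only pairs above the threshold are candidates
    if (cv_tab$cv_value[i] <= threshold) next
    # A variable that was already dropped cannot cause another drop
    if (v1 %in% dropped || v2 %in% dropped) next

    # Drop the variable with the lower IV (assumption: on a tie, drop var_2)
    loser <- if (iv[[v1]] < iv[[v2]]) v1 else v2
    dropped <- c(dropped, loser)
  }

  # Assumption: only variables that appear in the CV table are reported
  all_vars <- unique(c(as.character(cv_tab$var_1), as.character(cv_tab$var_2)))
  list(
    retain_var_list  = setdiff(all_vars, dropped),
    dropped_var_list = dropped,
    threshold        = threshold
  )
}

# Toy inputs using the documented column names (values are made up)
cv_tab <- data.frame(
  var_1    = c("A", "A", "B"),
  var_2    = c("B", "C", "C"),
  cv_value = c(0.8, 0.6, 0.3),
  stringsAsFactors = FALSE
)
iv_tab <- data.frame(
  Variable_name = c("A", "B", "C"),
  iv            = c(0.10, 0.30, 0.20),
  stringsAsFactors = FALSE
)

res <- sketch_cv_filter(cv_tab, iv_tab, threshold = 0.5)
res$dropped_var_list
res$retain_var_list
```

In this toy case the pair (A, B) exceeds the threshold and A has the lower IV, so A is dropped. The pair (A, C) also exceeds the threshold, but because A is already dropped it does not cause C to be dropped.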
cv_table | data frame of class cv_table with three columns: var_1, var_2, cv_value
iv_table | data frame of class iv_table with two columns: Variable_name, iv
threshold | Cramer's V value above which one of the two variables will be recommended to be dropped
An object of class "cv_filter" is a list containing the following components:
retain_var_list | list of variables remaining after the CV filter
dropped_var_list | list of variables that can be dropped based on the CV filter
dropped_var_tab | data frame of Cramer's V values for the dropped variables
threshold | threshold CV value supplied as the input parameter
Arya Poddar <aryapoddar290990@gmail.com>
data <- iris
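# Use the R 3.5.0 random number settings so the sampled target is reproducible
suppressWarnings(RNGversion('3.5.0'))
set.seed(11)
# Add a random binary target column
data$Y <- sample(0:1, size = nrow(data), replace = TRUE)
# Cramer's V table for the selected variables
cv_tab_list <- cv_table(data, c("Species", "Sepal.Length"))
cv_tab <- cv_tab_list$cv_val_tab
# IV table for the numeric and categorical predictors
x <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
iv_table_list <- iv_table(base = data, target = "Y", num_var_name = x, cat_var_name = "Species")
iv_tab <- iv_table_list$iv_table
# Apply the CV filter with a threshold of 0.5
cv_filter_list <- cv_filter(cv_table = cv_tab, iv_table = iv_tab, threshold = 0.5)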
cv_filter_list$retain_var_list
cv_filter_list$dropped_var_list
cv_filter_list$dropped_var_tab
cv_filter_list$threshold
[1] "Species"
[1] "Sepal.Length"
   dropped_var   var_2  cv_value dropped_var_iv      iv_2
1 Sepal.Length Species 0.7217263      0.1078267 0.1208137
[1] 0.5