cv_filter: Variable reduction based on Cramer's V filter



Description

The function returns a list of variables that can be dropped because they are highly correlated with another variable, based on Cramer's V and information value (IV). If two variables V1 and V2 have a Cramer's V above a user-defined threshold, the function recommends dropping the one with the lower IV. A variable that has already been dropped is not used to drop any further variables.
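The rule described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's implementation: `sketch_cv_filter` is a hypothetical helper, pairs are assumed to be processed in row order, and ties in IV are assumed to drop `var_2`. The input values are taken from the example output on this page.

```r
# Hypothetical helper, NOT part of scorecardModelUtils; it only illustrates the rule.
sketch_cv_filter <- function(cv_tab, iv_tab, threshold) {
  # Named vector so IV can be looked up by variable name
  iv <- setNames(iv_tab$iv, as.character(iv_tab$Variable_name))
  dropped <- character(0)
  for (i in seq_len(nrow(cv_tab))) {   # assumption: pairs are visited in row order
    v1 <- as.character(cv_tab$var_1[i])
    v2 <- as.character(cv_tab$var_2[i])
    # A variable that is already dropped cannot cause another drop
    if (v1 %in% dropped || v2 %in% dropped) next
    if (cv_tab$cv_value[i] > threshold) {
      # Drop the member of the pair with the lower IV (assumption: ties drop var_2)
      loser <- if (iv[[v1]] < iv[[v2]]) v1 else v2
      dropped <- c(dropped, loser)
    }
  }
  all_vars <- unique(c(as.character(cv_tab$var_1), as.character(cv_tab$var_2)))
  list(retain = setdiff(all_vars, dropped), dropped = dropped)
}

# Inputs shaped like the documented cv_table and iv_table arguments
cv_tab <- data.frame(var_1 = "Sepal.Length", var_2 = "Species",
                     cv_value = 0.7217263, stringsAsFactors = FALSE)
iv_tab <- data.frame(Variable_name = c("Sepal.Length", "Species"),
                     iv = c(0.1078267, 0.1208137), stringsAsFactors = FALSE)

res <- sketch_cv_filter(cv_tab, iv_tab, threshold = 0.5)
res$dropped  # "Sepal.Length"
res$retain   # "Species"
```

Here the pair exceeds the 0.5 threshold and Sepal.Length has the lower IV, so it is the one recommended for dropping.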

Usage

cv_filter(cv_table, iv_table, threshold)

Arguments

cv_table

dataframe of class cv_table with three columns - var_1, var_2, cv_value

iv_table

dataframe of class iv_table with two columns - Variable_name, iv

threshold

Cramer's V value above which one of the two variables in a pair is recommended to be dropped

Value

An object of class "cv_filter" is a list containing the following components:

retain_var_list

list of variables remaining post CV filter

dropped_var_list

list of variables that can be dropped based on CV filter

dropped_var_tab

CV correlation value for dropped variables as a dataframe

threshold

threshold CV value used as input parameter
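A typical next step is to remove the recommended columns from the modelling data. The sketch below assumes `dropped_var_list` is a character vector of column names, as the example output on this page suggests. The `cv_filter_list` object is built by hand as a stand-in so the snippet runs without the package installed.

```r
# Stand-in for the object returned by cv_filter(); built by hand for illustration.
cv_filter_list <- list(
  retain_var_list  = "Species",
  dropped_var_list = "Sepal.Length",
  threshold        = 0.5
)

data <- iris

# Remove the recommended columns; every other column is left untouched.
keep_cols <- !(names(data) %in% cv_filter_list$dropped_var_list)
reduced <- data[, keep_cols, drop = FALSE]

names(reduced)
```

Note that `retain_var_list` only covers variables that appeared in the Cramer's V table, so subsetting by exclusion (as above) keeps columns that were never part of the comparison.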

Author(s)

Arya Poddar <aryapoddar290990@gmail.com>

Examples

library(scorecardModelUtils)

# Use iris with a randomly generated binary target Y
data <- iris
suppressWarnings(RNGversion('3.5.0'))
set.seed(11)
data$Y <- sample(0:1, size = nrow(data), replace = TRUE)

# Cramer's V between the selected variables
cv_tab_list <- cv_table(data, c("Species", "Sepal.Length"))
cv_tab <- cv_tab_list$cv_val_tab

# IV of each predictor against the target Y
x <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
iv_table_list <- iv_table(base = data, target = "Y", num_var_name = x, cat_var_name = "Species")
iv_tab <- iv_table_list$iv_table

# Apply the filter and inspect each component of the result
cv_filter_list <- cv_filter(cv_table = cv_tab, iv_table = iv_tab, threshold = 0.5)
cv_filter_list$retain_var_list
cv_filter_list$dropped_var_list
cv_filter_list$dropped_var_tab
cv_filter_list$threshold

Example output

[1] "Species"
[1] "Sepal.Length"
   dropped_var   var_2  cv_value dropped_var_iv      iv_2
1 Sepal.Length Species 0.7217263      0.1078267 0.1208137
[1] 0.5

scorecardModelUtils documentation built on May 2, 2019, 9:59 a.m.