iv_sort: Compute and sort variables in a dataset by information value


Description

This function computes the information value (IV) of each variable in the supplied dataset. Variables must be pre-binned: any variable other than the response that is not of class "factor" will cause an error. The response should be numeric and binary (0/1).
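As a rough illustration of what is being computed, the sketch below calculates the information value of a single pre-binned factor in base R. It assumes the common convention IV = sum((p1 - p0) * log(p1 / p0)), where p1 and p0 are the shares of 1s and 0s falling in each bin. The helper name iv_one and the toy data are hypothetical, and the package's own implementation may differ in detail.

```r
# Hypothetical helper (not part of the package): IV of one pre-binned factor
# against a numeric 0/1 response, under the convention stated above.
iv_one <- function(x, y) {
  stopifnot(is.factor(x), is.numeric(y), all(y %in% c(0, 1)))
  counts <- table(x, y)                      # bins in rows, response in columns
  p0 <- counts[, "0"] / sum(counts[, "0"])   # share of 0s in each bin
  p1 <- counts[, "1"] / sum(counts[, "1"])   # share of 1s in each bin
  sum((p1 - p0) * log(p1 / p0))
}

# Toy data, made up for illustration
toy <- data.frame(
  age_bin = factor(c("low", "low", "low", "low", "high", "high", "high", "high")),
  y       = c(0, 0, 0, 1, 0, 1, 1, 1)
)

iv_one(toy$age_bin, toy$y)   # log(3), about 1.0986
```

Passing a character or numeric column for x trips the is.factor check, which mirrors the requirement that variables be pre-binned factors.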

Usage

iv_sort(dat, response, var_grouping = NULL)

Arguments

dat

Dataset, a data.frame or dplyr::tbl_df containing pre-binned variables and binary response

response

string giving the name of the binary (0/1) response variable in the dataset

var_grouping

optional table giving the grouping structure of the variables. If provided, variables will be sorted by IV within each group. Useful for selecting variables after performing some clustering procedure. Format: a tbl with 2 columns: var, giving the names of the variables, and group, giving a number or string that identifies each variable's group
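A grouping table in the format described above might be built as in the following sketch. The variable names, group labels and IV values are all made up, and the descending sort within groups is an assumption about the intended ordering rather than something stated here. The sorting is done by hand in base R only to show the idea.

```r
# Hypothetical variable names and cluster labels
var_grouping <- data.frame(
  var   = c("age_bin", "income_bin", "tenure_bin", "region_bin"),
  group = c(1, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Hypothetical IV values, made up to illustrate the ordering only
iv_tbl <- data.frame(
  var = c("age_bin", "income_bin", "tenure_bin", "region_bin"),
  iv  = c(0.12, 0.35, 0.08, 0.21),
  stringsAsFactors = FALSE
)

# Order by group, then by descending IV within each group (assumed direction)
merged <- merge(var_grouping, iv_tbl, by = "var")
sorted <- merged[order(merged$group, -merged$iv), ]
sorted$var
```

Since the argument is documented as a tbl, a plain data.frame like this one may need converting first, for example with tibble::as_tibble().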

Value

a tbl_df with one row per variable and two columns: var, giving each variable name, and iv, giving its information value. The function will auto-merge pure bins and return the information value obtained using the final merged bins.
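To see why pure bins need merging, consider the sketch below. It assumes the usual definition IV = sum((p1 - p0) * log(p1 / p0)): a bin containing only one response class puts a zero inside the logarithm, so the sum is infinite. The counts, the helper iv_from_counts and the choice to fold the pure bin into its neighbour are all hypothetical, since the merging rule the function actually applies is not described here.

```r
# Toy counts per bin (made up): bin "c" contains no 1s, so it is pure
counts <- matrix(c(30, 10,
                   20, 20,
                   15,  0),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(c("a", "b", "c"), c("0", "1")))

# Hypothetical helper: IV from a bins-by-response count matrix
iv_from_counts <- function(m) {
  p0 <- m[, "0"] / sum(m[, "0"])
  p1 <- m[, "1"] / sum(m[, "1"])
  sum((p1 - p0) * log(p1 / p0))
}

iv_from_counts(counts)   # Inf, because log(0) appears for the pure bin

# One possible fix (an assumption): fold the pure bin "c" into its neighbour "b"
merged <- rbind(a = counts["a", ], b = counts["b", ] + counts["c", ])
iv_from_counts(merged)   # finite
```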


awstringer/modellingTools documentation built on May 11, 2019, 4:11 p.m.