optimal_bin: Optimally discretize variables in your training and test...


Description

This function performs supervised discretization of all variables in the supplied training, and optionally test, datasets. It combines the binning functionality of modellingTools::simple_bin with the optimal bin calculations performed in smbinning::smbinning. Note that no filtering is done on the resulting binning structure; there may be pure bins, non-monotonic Weights of Evidence, etc. Handling these is left to the user; the package provides a tool-set for dealing with any such concerns.
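Because the result is not filtered, it can help to check a binned variable by hand. The following is a minimal base-R sketch, not the package's own code: the helper names `woe_table` and `is_monotonic`, the toy data, and the sign convention (log of the share of 1s over the share of 0s) are all assumptions made for illustration.

```r
# Hypothetical binned variable and 0/1 response (made-up data).
bin <- factor(rep(c("low", "mid", "high"), each = 4),
              levels = c("low", "mid", "high"))
y <- c(0, 0, 0, 1,  0, 1, 1, 1,  0, 0, 1, 1)

# Per-bin Weight of Evidence and Information Value contribution.
# Assumed convention: WoE = log(share of 1s in bin / share of 0s in bin).
woe_table <- function(bin, y) {
  tab <- table(bin, y)          # rows: bins, columns: "0" and "1"
  n0  <- as.vector(tab[, "0"])
  n1  <- as.vector(tab[, "1"])
  p0  <- n0 / sum(n0)           # distribution of 0s across bins
  p1  <- n1 / sum(n1)           # distribution of 1s across bins
  woe <- log(p1 / p0)
  data.frame(
    bin  = rownames(tab),
    n0   = n0,
    n1   = n1,
    woe  = woe,
    iv   = (p1 - p0) * woe,     # summing this column gives the variable's IV
    pure = n0 == 0 | n1 == 0    # a pure bin contains only one class
  )
}

# TRUE if WoE never changes direction across the ordered bins.
is_monotonic <- function(woe) {
  d <- diff(woe)
  all(d >= 0) || all(d <= 0)
}

res <- woe_table(bin, y)
print(res)
sum(res$iv)             # information value of the variable
any(res$pure)           # are there pure bins?
is_monotonic(res$woe)   # is WoE monotonic across bins?
```

In this toy data the WoE rises from "low" to "mid" and then falls for "high", so the monotonicity check fails; that is the kind of pattern the user is expected to detect and handle after binning.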

Usage

optimal_bin(train, response, exclude_vars = NULL, include_vars = NULL)

Arguments

train

training set

response

a string naming the response variable; must be 0/1 and coercible to factor

exclude_vars

variables to exclude (e.g. the target, or the row ID)

include_vars

if you only want certain variables binned, you may specify them directly instead of excluding all other variables
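As a rough sketch of how these arguments fit together, the calls below follow the signature shown under Usage. The data frame and its column names (`id`, `age`, `income`, `default`) are made up for illustration, and the wrapper functions `bin_all_but_id` and `bin_selected` are hypothetical; each call is wrapped in a function so that the data setup runs even where modellingTools is not installed.

```r
# Hypothetical training data: all column names are assumptions.
set.seed(1)
train <- data.frame(
  id      = 1:200,
  age     = round(runif(200, 18, 80)),
  income  = round(rlnorm(200, meanlog = 10, sdlog = 0.5)),
  default = rbinom(200, size = 1, prob = 0.3)   # 0/1 response
)

# Option 1: exclude variables that should not be binned, such as the row ID.
bin_all_but_id <- function(train) {
  modellingTools::optimal_bin(train, response = "default",
                              exclude_vars = "id")
}

# Option 2: name the variables to bin directly.
bin_selected <- function(train) {
  modellingTools::optimal_bin(train, response = "default",
                              include_vars = c("age", "income"))
}
```

With the package installed, `bin_selected(train)` would run the second variant; which of the two options is more convenient depends mainly on how many columns should be left untouched.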

Value

a list containing the following elements:

iv: a dataframe containing the variables and their information values, sorted in descending order

train: a tbl_df containing the same variables as train, with the appropriate ones binned (per exclude_vars or include_vars)

test: if test is NULL, then NULL; else a tbl_df containing the same variables as test, binned in the same manner as train.
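To illustrate what binning a test set "in the same manner" as the training set means, here is a small base-R sketch. It is not the package's implementation: the cut points are plain training-set quartiles standing in for the optimal cuts that smbinning would compute, and all variable names are hypothetical.

```r
set.seed(42)
train_x <- rnorm(100, mean = 50, sd = 10)   # made-up training variable
test_x  <- rnorm(30,  mean = 50, sd = 10)   # made-up test variable

# Stand-in cut points derived from the TRAINING data only
# (quartiles here; an assumption, not smbinning's optimal cuts).
cuts   <- quantile(train_x, probs = c(0.25, 0.5, 0.75))
breaks <- c(-Inf, unname(cuts), Inf)        # open-ended outer bins

# The same breaks are applied to both sets, so the bin labels match.
train_bin <- cut(train_x, breaks = breaks)
test_bin  <- cut(test_x,  breaks = breaks)

table(train_bin)
table(test_bin)
```

Using open-ended outer bins means test values outside the training range still land in a bin rather than becoming NA, and both factors share identical levels.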


awstringer/modellingTools documentation built on May 11, 2019, 4:11 p.m.