getBins: Bins variables to be used in logistic regression


View source: R/logiBin.R

Description

This function computes bins for continuous and categorical variables, optionally using parallel processing. The splits are computed with the partykit package, which uses conditional inference trees; refer to the partykit documentation for details. NA values are placed in a separate bin, which can be merged with another bin using the naCombine function. Categorical variables are supported only if they have at most 10 distinct values.

Usage

getBins(df, y, xVars, minProp = 0.03, minCr = 0.9, nCores = 1)

Arguments

df

- A data frame

y

- The name of the dependent variable

xVars

- A vector of the names of the variables to be binned

minProp

- The minimum proportion of observations that must be exceeded in order to implement a split. Default value is 0.03

minCr

- The value of the test statistic that must be exceeded in order to implement a split. Increasing this value decreases the number of splits. Refer to the partykit package documentation for more details. Default value is 0.9

nCores

- The number of cores used for parallel processing. The default value is 1
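To make minProp and minCr more concrete, the following is a minimal sketch of how a single numeric variable could be split with a conditional inference tree in partykit. It is not the source of getBins: the toy data, the column names, and in particular the mapping of minCr to mincriterion and of minProp to minprob in ctree_control are assumptions made for illustration.

```r
# Sketch only: splitting one numeric variable with partykit::ctree.
# Assumption: minCr ~ mincriterion and minProp ~ minprob; getBins may differ.
library(partykit)

set.seed(1)
n <- 1000
toy <- data.frame(score = runif(n, 300, 850))  # hypothetical predictor
# Hypothetical outcome: lower scores default more often
toy$bad_flag <- factor(rbinom(n, 1, plogis((550 - toy$score) / 60)))

fit <- ctree(
  bad_flag ~ score,
  data = toy,
  control = ctree_control(mincriterion = 0.9, minprob = 0.03)
)

# Each terminal node of the tree corresponds to one candidate bin
node <- predict(fit, type = "node")
table(node, toy$bad_flag)
```

Raising mincriterion in this sketch makes the tree more reluctant to split, which is consistent with the description of minCr above.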

Value

Returns a list containing three elements.

The first is a data frame called varSummary, which contains one row per variable with the following information: the IV value, the entropy, the p value from the ctree function in the partykit package, a flag indicating whether the bad rate increases or decreases with the variable value, a flag indicating whether a monotonic trend is present, the number of bins that flip (i.e. do not follow the monotonic trend), the number of bins of the variable, and a flag indicating whether the variable includes pure nodes (nodes that do not contain any defaults).

The second is a data frame called bin, which contains details of all the bins of the variables.

The third is a data frame called err, which lists the variables that could not be split and the reason in each case.
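As an illustration of the summary quantities described above, the sketch below computes an information value (IV) and a monotonicity check from bin-level counts in base R. The counts are made up, and the formula is the commonly used weight-of-evidence definition; whether logiBin uses exactly this convention (for example, the sign of the logarithm) is an assumption.

```r
# Hypothetical bin-level counts for one variable (not produced by logiBin)
bins <- data.frame(
  bin  = c("<= 500", "(500, 650]", "> 650"),
  good = c(100, 300, 600),
  bad  = c(60, 30, 10)
)

# Share of all goods and of all bads that falls in each bin
distGood <- bins$good / sum(bins$good)
distBad  <- bins$bad / sum(bins$bad)

# Weight of evidence per bin and the resulting information value
woe <- log(distGood / distBad)
iv  <- sum((distGood - distBad) * woe)

# Bad rate per bin and a simple check for a monotonic trend across bins
bins$badRate <- bins$bad / (bins$good + bins$bad)
steps <- diff(bins$badRate)
isMonotonic <- all(steps <= 0) || all(steps >= 0)

print(iv)
print(isMonotonic)
```

Under this definition a bin with no defaults has an infinite weight of evidence, which is one reason a flag for pure nodes is useful.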

Examples

b1 <- getBins(loanData, "bad_flag", c('age', 'score', 'balance'))
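A possible follow-up to the call above is to inspect the three elements of the returned list. This sketch assumes that logiBin is installed, that loanData is available once the package is attached, and that the list elements are named varSummary, bin and err as described in the Value section.

```r
library(logiBin)

b1 <- getBins(loanData, "bad_flag", c("age", "score", "balance"))

# Top-level structure of the result: three data frames are expected
str(b1, max.level = 1)

# Variable-level summary (IV, monotonicity flags, number of bins, ...)
head(b1$varSummary)

# Bin-level details for every variable that could be split
head(b1$bin)

# Variables that could not be split, with the reason for each
b1$err
```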

logiBin documentation built on May 2, 2019, 2:01 p.m.