sc.binning: Numeric binning for Credit Scoring


Description

This function provides a simple numeric binning solution for building credit scorecards. It is designed to choose an optimal binning using recursive partitioning. Only numeric or integer variables are binned; factor and character variables are ignored.
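To make the idea of binning by recursive partitioning concrete, here is a minimal sketch that uses the rpart package on a single numeric predictor. It is an illustration only, not the easysc implementation: the toy data, the column names AGE and BAD, and the control settings are all assumptions.

```r
library(rpart)  # recommended package shipped with standard R distributions

set.seed(42)
# Hypothetical toy data: the bad rate falls as age increases
n   <- 2000
age <- round(runif(n, 18, 70))
bad <- rbinom(n, 1, plogis(2 - 0.08 * age))
toy <- data.frame(AGE = age, BAD = factor(bad))

# Fit a classification tree on one predictor. minbucket enforces a minimum
# number of observations per leaf (here 5% of the sample); competitor and
# surrogate splits are switched off so fit$splits holds primary splits only.
fit <- rpart(BAD ~ AGE, data = toy, method = "class",
             control = rpart.control(minbucket = ceiling(0.05 * n),
                                     cp = 0.001,
                                     maxcompete = 0, maxsurrogate = 0))

# The split points chosen by the tree become the bin edges
cuts <- if (is.null(fit$splits)) numeric(0) else sort(unique(fit$splits[, "index"]))
toy$AGE_BIN <- cut(toy$AGE, breaks = c(-Inf, cuts, Inf))
table(toy$AGE_BIN)
```

With a single predictor, each leaf of the tree is an interval of AGE, so the leaves map directly onto bins.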

Usage

sc.binning(data, target, n = 10, p = 3, thres = 0.5,
  freqCut = 95/5, uniqueCut = 10, best = TRUE, parallel = FALSE)

Arguments

data

A data frame containing the target variable as well as the predictor variables.

target

Target variable name.

n

Number of bootstrap iterations. Default 10 times.

p

The minimum percentage of observations per bin. Default 3%.

thres

Threshold on the difference in the target between bins. Default 0.5%.

freqCut

Cutoff used with the nearZeroVar function: the ratio of the most common value to the second most common value. Default 95/5.

uniqueCut

Cutoff used with the nearZeroVar function: the percentage of distinct values out of the total number of samples. Default 10%.

best

A logical scalar. If TRUE, different methods are tried and the one that maximizes IV (information value) is used. Default TRUE.

parallel

A logical scalar. Use parallel backend. Default FALSE.
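The freqCut and uniqueCut arguments above refer to two statistics used for near-zero-variance screening. The sketch below computes both in base R so the cutoffs are easier to interpret. The helper nzv_metrics is hypothetical and is not part of easysc or caret; how easysc acts on a flagged variable is not specified here.

```r
# Hypothetical helper, written for illustration only
nzv_metrics <- function(x) {
  counts <- sort(table(x), decreasing = TRUE)
  # Ratio of the most common value's count to the second most common's
  freq_ratio <- if (length(counts) > 1) counts[[1]] / counts[[2]] else Inf
  # Distinct values as a percentage of the number of samples
  pct_unique <- 100 * length(counts) / length(x)
  c(freqRatio = freq_ratio, percentUnique = pct_unique)
}

x <- c(rep(0, 97), 1, 2, 3)   # 100 values, 97 of them zero
m <- nzv_metrics(x)
m

# Compare against the default cutoffs freqCut = 95/5 and uniqueCut = 10
flagged <- m[["freqRatio"]] > 95/5 && m[["percentUnique"]] <= 10
flagged
```

A variable is treated as near-zero-variance only when both conditions hold: one value dominates and there are few distinct values relative to the sample size.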

Value

The output is a cut plan, returned as a list, which can be applied to the original data frame via the predict function. The user can also modify the cut plan via the update function.
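As a rough picture of what applying a cut plan means, the sketch below bins columns of a data frame with base R's cut. The named list plan is a hypothetical stand-in; the actual structure of the object returned by sc.binning is not documented here and may differ.

```r
# Hypothetical stand-in for a cut plan: interior break points per variable
plan <- list(AGE = c(20, 30, 40), INCOME = c(1000, 5000))

df <- data.frame(AGE    = c(18, 25, 35, 60),
                 INCOME = c(500, 1200, 7000, 4000),
                 BAD    = c(1, 0, 0, 1))

# Replace each planned variable with its bin; other columns are kept as-is
binned <- df
for (v in names(plan)) {
  binned[[v]] <- cut(df[[v]], breaks = c(-Inf, plan[[v]], Inf))
}
binned
```

Padding the break points with -Inf and Inf ensures every value falls into some bin, including values outside the range seen when the plan was built.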

Examples

## Not run: 
# Load library
library(easysc)

# Generate a cut plan that maximizes IV using 500 bootstrap resamples
cut.plan <- sc.binning(data = df, target = BAD, n = 500, p = 5, best = TRUE, parallel = TRUE)
# Update the cut plan
update(cut.plan, AGE = c(20, 30, 40))
# Apply to the data frame
predict(cut.plan, df, keepTarget = TRUE)

## End(Not run)
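The example above asks for a cut plan that maximizes IV. For readers unfamiliar with the measure, here is a minimal sketch of how information value is commonly computed for an already binned variable, using one common weight-of-evidence convention. The helper information_value is hypothetical and is not the easysc implementation; the data are made up.

```r
# Hypothetical helper: IV of a binned variable against a 0/1 target
information_value <- function(bin, bad) {
  tab <- table(bin, bad)                      # rows: bins; columns: "0", "1"
  good_share <- tab[, "0"] / sum(tab[, "0"])  # share of all goods in each bin
  bad_share  <- tab[, "1"] / sum(tab[, "1"])  # share of all bads in each bin
  woe <- log(good_share / bad_share)          # weight of evidence per bin
  # Note: a bin with no goods or no bads gives an infinite WoE
  sum((good_share - bad_share) * woe)
}

# Made-up data: three bins of 100 observations with 40, 20 and 10 bads
bin <- rep(c("low", "mid", "high"), times = c(100, 100, 100))
bad <- c(rep(c(1, 0), c(40, 60)),
         rep(c(1, 0), c(20, 80)),
         rep(c(1, 0), c(10, 90)))
iv <- information_value(bin, bad)
iv
```

Every term in the sum is non-negative, so IV is zero when all bins have the same mix of goods and bads and grows as the bins separate them more strongly.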

lmtleminh/easysc documentation built on July 5, 2019, 11:48 a.m.