Sequentially reduce the number of bins (from a list of bins) based on similarity in terms of proportions of non-default/default status between adjacent bins.
reduce_bins(
  list_of_bins = NULL,
  min_required_bins = NULL,
  confidence_level = NULL,
  test_type = "chisq.test"
)
list_of_bins | A list of bins.
min_required_bins | An integer (minimum two). The minimum number of bins in the returned list.
confidence_level | A double between 0 and 1 representing the confidence level passed on to the homogeneity test.
test_type | The type of homogeneity test (default "chisq.test").
Similarity, or homogeneity, is assessed by performing a test of independence. The list of bins is reduced by repeatedly merging the most similar pair of adjacent bins. The function terminates when the minimum number of required bins is reached or when all adjacent bins are statistically different (heterogeneous) at the given level of confidence.
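The merging loop described above can be sketched in a few lines of base R. This is only an illustration, not the package's implementation: the bin representation (a list with hypothetical `good` and `bad` counts), the helper names `merge_pair`, `pair_p_value` and `reduce_sketch`, and the `alpha` threshold compared against the p-value are all assumptions made for the example.

```r
# Hypothetical bin representation: a list with counts of non-defaults
# ("good") and defaults ("bad"). This is NOT the package's bin class.
merge_pair <- function(a, b) {
  list(good = a$good + b$good, bad = a$bad + b$bad)
}

# p-value of a chi-squared test of independence on the 2x2 table
# formed by two adjacent bins (rows = bins, columns = good/bad).
pair_p_value <- function(a, b) {
  tab <- matrix(c(a$good, a$bad, b$good, b$bad), nrow = 2, byrow = TRUE)
  suppressWarnings(stats::chisq.test(tab)$p.value)
}

# Assumed stopping rule: stop when the largest p-value among adjacent
# pairs falls below `alpha`, or when `min_bins` bins remain.
reduce_sketch <- function(bins, min_bins = 2, alpha = 0.05) {
  while (length(bins) > min_bins) {
    p <- vapply(
      seq_len(length(bins) - 1),
      function(i) pair_p_value(bins[[i]], bins[[i + 1]]),
      numeric(1)
    )
    p[is.na(p)] <- 1          # undefined test: treat the pair as indistinguishable
    i <- which.max(p)         # most similar adjacent pair
    if (p[i] < alpha) break   # every adjacent pair differs significantly
    bins[[i]] <- merge_pair(bins[[i]], bins[[i + 1]])
    bins[[i + 1]] <- NULL     # drop the bin that was merged away
  }
  bins
}

toy_bins <- list(
  list(good = 90, bad = 10),
  list(good = 88, bad = 12),
  list(good = 50, bad = 50),
  list(good = 48, bad = 52)
)
reduced <- reduce_sketch(toy_bins, min_bins = 2, alpha = 0.05)
length(reduced)
```

In this toy input the two low-default bins and the two high-default bins are merged with each other, while the boundary between the two groups is kept because the test rejects homogeneity there.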
The returned list of bins is not guaranteed to exhibit a monotonic development of default rates, but it is likely that the sequential reduction of bins will mitigate the problem. Should monotonicity be required, see merge_list_of_bins on how to impose a manual binning approach as a final step.
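As a rough illustration of what monotonic development means here, the check below tests whether a vector of per-bin default rates, ordered by score, only moves in one direction. The helper `rates_are_monotonic` is hypothetical and is not the package's is_monotonic; it assumes the default rates have already been extracted into a plain numeric vector.

```r
# Hypothetical helper: TRUE if the default rates are entirely
# non-decreasing or entirely non-increasing across consecutive bins.
rates_are_monotonic <- function(default_rates) {
  d <- diff(default_rates)
  all(d >= 0) || all(d <= 0)
}

rates_are_monotonic(c(0.30, 0.22, 0.15, 0.08))  # steadily falling rates
rates_are_monotonic(c(0.30, 0.15, 0.22, 0.08))  # one reversal between bins
```

A reversal like the one in the second call is the kind of case where a final manual merge of the offending adjacent bins would be considered.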
A list of bins. Each list component in the returned list is a bin (of class bin).
See create_initial_bins on how to create the initial bins, merge_list_of_bins for manual binning, and autobin for automatic binning.
# example of interactive binning
# create initial bins
bins <- create_initial_bins(bin_data, 30, "score", "default")
length(bins)
is_monotonic(bins)
# reduce bins by performing repeated homogeneity tests
new_bins <- reduce_bins(bins, min_required_bins = 7, confidence_level = 0.01)
length(new_bins)
is_monotonic(new_bins)
# plot initial and reduced bins
bins_df <- dplyr::bind_rows(bins)
plot(x = bins_df$mid_score, y = log(bins_df$odds), type = "p",
     col = "lightblue", cex = 1.5, pch = 20, ylab = "log(odds)",
     xlab = "score")
new_bins_df <- dplyr::bind_rows(new_bins)
points(x = new_bins_df$mid_score, y = log(new_bins_df$odds),
       col = "darkblue", cex = 1.5, pch = 20)
legend(x = "topright", legend = c("Initial bins", "Reduced bins"),
       col = c("lightblue", "darkblue"), pch = 20, pt.cex = 1.5, bty = "n")