cum.bin: Monotonic binning based on maximum cumulative target rate...
In monobin: Monotonic Binning for Credit Rating Models

cum.bin

R Documentation

Monotonic binning based on maximum cumulative target rate (MAPA)

Description

cum.bin implements monotonic binning based on maximum cumulative target rate. This algorithm is known as MAPA (Monotone Adjacent Pooling Algorithm).
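The pooling idea behind MAPA can be illustrated with a short, self-contained sketch. The helper `mapa_sketch` below is hypothetical and is not the code used inside `cum.bin`: it works on unique values of `x` rather than on `g` starting groups, it ignores special cases, and the tie-breaking rule (taking the last position that attains the extreme cumulative rate) is an assumption made for the illustration.

```r
# Hypothetical helper (not part of monobin): a bare-bones MAPA pass.
# Returns the upper bounds of the bins; bin i covers (cuts[i - 1], cuts[i]].
mapa_sketch <- function(x, y, trend = c("d", "i")) {
  trend <- match.arg(trend)
  ux  <- sort(unique(x))                 # candidate cut points
  idx <- match(x, ux)                    # position of each x among ux
  n   <- tabulate(idx, nbins = length(ux))
  s   <- as.numeric(tapply(y, idx, sum)) # sum of y per unique x
  cuts  <- numeric(0)
  start <- 1
  while (start <= length(ux)) {
    rng      <- start:length(ux)
    # cumulative target rate over the values not yet assigned to a bin
    cum_rate <- cumsum(s[rng]) / cumsum(n[rng])
    # decreasing trend: close the bin at the maximum cumulative rate,
    # increasing trend: close it at the minimum
    target <- if (trend == "d") max(cum_rate) else min(cum_rate)
    # assumption: on ties, extend the bin to the last tied position
    end   <- rng[max(which(cum_rate == target))]
    cuts  <- c(cuts, ux[end])
    start <- end + 1
  }
  cuts
}

x <- 1:6
y <- c(1, 0, 1, 0, 0, 0)
cuts <- mapa_sketch(x, y, trend = "d")
# assign each x to its bin and inspect the target rate per bin
bin <- findInterval(x, cuts, left.open = TRUE) + 1
print(cuts)
print(tapply(y, bin, mean))
```

Each pass closes one bin and restarts the cumulative rate on the remaining values, which is what makes the per-bin target rates monotonic.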

Usage

cum.bin(
  x,
  y,
  sc = c(NA, NaN, Inf, -Inf),
  sc.method = "together",
  g = 15,
  y.type = NA,
  force.trend = NA
)

Arguments

`x`	Numeric vector to be binned.
`y`	Numeric target vector (binary or continuous).
`sc`	Numeric vector of special case elements. Default values are `c(NA, NaN, Inf, -Inf)`. It is recommended to always keep the default values and add new ones if needed; if any of these values occur in `x` but are not listed in `sc`, the function reports an error.
`sc.method`	How special cases are treated: all together in one bin or in separate bins. Possible values are `"together"` and `"separately"`.
`g`	Number of starting groups. Default is 15.
`y.type`	Type of `y`. Possible options are `"bina"` (binary) and `"cont"` (continuous). If the default value (`NA`) is passed, the algorithm identifies whether `y` is a 0/1 or a continuous variable.
`force.trend`	Whether to force the expected trend. Possible values: `"i"` for an increasing trend (`y` increases as `x` increases), `"d"` for a decreasing trend (`y` decreases as `x` increases). Default value is `NA`, in which case the trend is identified from the sign of the Spearman correlation coefficient between `x` and `y` on complete cases.
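The automatic choices described for `y.type` and `force.trend` can be approximated in base R. The helpers `guess_y_type` and `guess_trend` below are hypothetical names used only for illustration; they are not exported by monobin, and details such as how a zero correlation is resolved are assumptions.

```r
# Hypothetical helpers (not part of monobin) mirroring the defaults
# described for y.type = NA and force.trend = NA.

# Treat y as binary when every non-missing value is 0 or 1.
guess_y_type <- function(y) {
  yc <- y[!is.na(y)]
  if (all(yc %in% c(0, 1))) "bina" else "cont"
}

# Pick the trend from the sign of the Spearman correlation,
# computed only on cases that are not special cases.
guess_trend <- function(x, y, sc = c(NA, NaN, Inf, -Inf)) {
  keep <- !(x %in% sc) & !is.na(y)
  rho  <- cor(x[keep], y[keep], method = "spearman")
  # assumption: a correlation of exactly zero is treated as increasing
  if (rho >= 0) "i" else "d"
}

x <- c(1, 2, 3, 4, NA, Inf)
y <- c(0, 0, 1, 1, 1, 0)
print(guess_y_type(y))
print(guess_trend(x, y))
```

Passing `force.trend = "i"` or `"d"` explicitly skips this kind of detection, which is useful when the expected direction is known from business logic rather than from the sample.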

Value

cum.bin returns a list of two objects. The first, the data frame summary.tbl, is a summary table of the final binning; the second, x.trans, is a vector of discretized values. If x or y has a single unique value in complete cases (cases that are not special cases), the function returns a data frame with info instead.
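A minimal sketch of unpacking the returned list, assuming the monobin package and its `gcd` data set are installed. Elements are accessed by position, as in the examples of this page; the local names `summary.tbl` and `x.trans` simply follow the description above.

```r
# Runs only when monobin is available; otherwise the block is skipped.
if (requireNamespace("monobin", quietly = TRUE)) {
  data("gcd", package = "monobin")
  res <- monobin::cum.bin(x = gcd$amount, y = gcd$qual)
  # top-level structure of the result: a list of two objects
  str(res, max.level = 1)
  summary.tbl <- res[[1]]  # summary table of the final binning
  x.trans     <- res[[2]]  # vector of discretized values
  # number of observations per bin
  print(table(x.trans, useNA = "ifany"))
}
```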

Examples

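suppressMessages(library(monobin))
# dplyr supplies %>%, group_by() and summarise() used below
suppressMessages(library(dplyr))
data(gcd)
# bin amount against the target qual using the default settings
amount.bin <- cum.bin(x = gcd$amount, y = gcd$qual)
# summary table of the final binning
amount.bin[[1]]
# attach the discretized values and check the average target per bin
gcd$amount.bin <- amount.bin[[2]]
gcd %>% group_by(amount.bin) %>% summarise(n = n(), y.avg = mean(qual))
# increase default number of groups (g = 20)
amount.bin.1 <- cum.bin(x = gcd$amount, y = gcd$qual, g = 20)
amount.bin.1[[1]]
# force trend to decreasing
cum.bin(x = gcd$amount, y = gcd$qual, g = 20, force.trend = "d")[[1]]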

monobin documentation built on July 21, 2022, 5:11 p.m.