View source: R/05_STS_BINNING.R
sts.bin | R Documentation |
sts.bin implements an extension of the three-stage monotonic binning procedure (iso.bin) with a final step of iterative merging of adjacent bins based on a statistical test.
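The final merging step can be illustrated with a minimal base R sketch for a binary target. This is not the monobin implementation: merge_adjacent_bins is a hypothetical helper, the starting bins are assumed to be already ordered, and the rule used here (merge the adjacent pair with the largest two-sided two-proportion p-value until all remaining p-values fall below p.val) is an assumption made for illustration.

```r
# Hypothetical helper (not part of monobin): iteratively merge adjacent bins
# whose event rates are not significantly different.
merge_adjacent_bins <- function(bin, y, p.val = 0.05) {
  bin <- as.integer(factor(bin))  # relabel bins as 1..k in sorted order
  repeat {
    k <- max(bin)
    if (k < 2) break
    # two-sided two-proportion test for each adjacent pair (i, i + 1)
    pv <- sapply(seq_len(k - 1), function(i) {
      a <- y[bin == i]
      b <- y[bin == i + 1]
      suppressWarnings(
        prop.test(x = c(sum(a), sum(b)), n = c(length(a), length(b)),
                  correct = FALSE)$p.value
      )
    })
    pv[is.na(pv)] <- 1            # undefined test: treat the pair as not different
    worst <- which.max(pv)        # least significant adjacent pair
    if (pv[worst] < p.val) break  # every adjacent pair differs: stop
    # merge bin worst + 1 into bin worst and shift later labels down by one
    bin[bin > worst] <- bin[bin > worst] - 1L
  }
  bin
}

# Toy data: four starting bins with event rates 0.10, 0.12, 0.40 and 0.42
y <- c(rep(c(1, 0), c(10, 90)), rep(c(1, 0), c(12, 88)),
       rep(c(1, 0), c(40, 60)), rep(c(1, 0), c(42, 58)))
bin <- rep(1:4, each = 100)
merged <- merge_adjacent_bins(bin, y)
table(merged)  # bins 1-2 and bins 3-4 are merged, leaving two bins
```

The documented example further down uses a one-sided alternative in prop.test; the two-sided test is used in this sketch only to keep it independent of the trend direction.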
sts.bin(
  x,
  y,
  sc = c(NA, NaN, Inf, -Inf),
  sc.method = "together",
  y.type = NA,
  min.pct.obs = 0.05,
  min.avg.rate = 0.01,
  p.val = 0.05,
  force.trend = NA
)
x |
Numeric vector to be binned. |
y |
Numeric target vector (binary or continuous). |
sc |
Numeric vector with special case elements. Default values are c(NA, NaN, Inf, -Inf). |
sc.method |
Defines how special cases are treated: all together or in separate bins.
Possible values are |
y.type |
Type of |
min.pct.obs |
Minimum percentage of observations per bin. Default is 0.05, or a minimum of 30 observations. |
min.avg.rate |
Minimum |
p.val |
Threshold for the p-value of the statistical test. Default is 0.05. For a binary target, a test of two proportions is applied; for a continuous target, a two-sample independent t-test is applied. |
force.trend |
If the expected trend should be forced. Possible values: |
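To make the role of p.val concrete for a continuous target, the sketch below compares two adjacent bins with base R's t.test. The values y.bin1 and y.bin2 are made-up toy data, keep.separate is a hypothetical name, and the use of the default Welch variant of the t-test is an assumption, since the documentation only states that a two-sample independent t-test is applied.

```r
# Toy target values for two adjacent bins (illustrative data, not from gcd)
y.bin1 <- c(1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 1.1)
y.bin2 <- c(1.9, 2.1, 2.0, 2.2, 1.8, 2.0, 2.3, 1.7)

p.val <- 0.05

# Two-sample independent t-test (Welch variant is the t.test default)
tt <- t.test(x = y.bin1, y = y.bin2)

# Assumed decision rule: keep the bins separate only when the difference
# in means is significant at the p.val threshold; otherwise merge them
keep.separate <- tt$p.value < p.val
keep.separate
```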
The command sts.bin generates a list of two objects. The first object, the data frame summary.tbl, presents a summary table of the final binning, while the second, x.trans, is a vector of discretized values.
If x or y has a single unique value among the complete cases (cases other than special cases), it returns a data frame with info.
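The single-unique-value condition can be checked by hand before calling the function. The following is a rough sketch that assumes complete cases are simply the values of x not listed in sc; the names complete and single.value are illustrative and not part of the package.

```r
sc <- c(NA, NaN, Inf, -Inf)     # default special case values
x  <- c(5, NA, 5, Inf, 5, NaN)  # toy input vector

# Complete cases: everything that is not a special case
# (%in% matches NA with NA and NaN with NaN)
complete <- x[!x %in% sc]

# TRUE when only one distinct value remains among the complete cases
single.value <- length(unique(complete)) == 1
single.value
```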
iso.bin for the three-stage monotonic binning procedure.
suppressMessages(library(monobin))
data(gcd)
#binary target
maturity.bin <- sts.bin(x = gcd$maturity, y = gcd$qual)
maturity.bin[[1]]
tapply(gcd$qual, maturity.bin[[2]], function(x) c(length(x), sum(x), mean(x)))
prop.test(x = c(sum(gcd$qual[maturity.bin[[2]]%in%"01 (-Inf,8)"]),
                sum(gcd$qual[maturity.bin[[2]]%in%"02 [8,16)"])),
          n = c(length(gcd$qual[maturity.bin[[2]]%in%"01 (-Inf,8)"]),
                length(gcd$qual[maturity.bin[[2]]%in%"02 [8,16)"])),
          alternative = "less",
          correct = FALSE)$p.value
#continuous target
age.bin <- sts.bin(x = gcd$age, y = gcd$qual, y.type = "cont")
age.bin[[1]]
t.test(x = gcd$qual[age.bin[[2]]%in%"01 (-Inf,26)"],
       y = gcd$qual[age.bin[[2]]%in%"02 [26,35)"],
       alternative = "greater")$p.value