Visualize the binning for a survival/logistic model based on model coefficients.
The numerical independent variable (x) is first divided into small buckets, each with an approximately equal number of records. A univariate regression model is then built using the bucketed x and the dependent variable (y). Buckets with similar coefficients are then combined: the KNN algorithm is used to bin the small buckets into bigger groups, taking into account both the order and the coefficients of the buckets.
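The first two steps described above (equal-frequency bucketing, then a univariate regression on the bucketed variable) can be illustrated with base R alone. This is only a rough sketch under stated assumptions, not the package's implementation: the data are synthetic, the variable names (`n.buckets`, `bucket`, `coefs`) are made up for illustration, and deriving the bucket count as `1 / min.bucket` is an assumption about how `min.bucket` might be used. The KNN grouping step is not shown.

```r
# Sketch only: synthetic data, illustrative names, not the package's code.
set.seed(1)
n <- 500
x <- runif(n, 40, 80)                      # synthetic numeric predictor
y <- rbinom(n, 1, plogis(-4 + 0.06 * x))   # synthetic binary outcome

min.bucket <- 0.1                          # assumed: each bucket holds ~10% of records
n.buckets <- floor(1 / min.bucket)

# Step 1: cut x at its quantiles so the buckets have roughly equal counts.
# unique() guards against duplicate breaks when x has many ties.
breaks <- unique(quantile(x, probs = seq(0, 1, length.out = n.buckets + 1)))
bucket <- cut(x, breaks = breaks, include.lowest = TRUE)

# Step 2: univariate logistic regression on the bucketed predictor.
fit <- glm(y ~ bucket, family = binomial)

# One coefficient per bucket; the reference (first) bucket is fixed at 0.
coefs <- c(0, coef(fit)[-1])
names(coefs) <- levels(bucket)

print(table(bucket))
print(round(coefs, 2))
```

Neighbouring buckets whose coefficients are close are the natural candidates for merging into one group, which is the part the function handles with KNN.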
formula | The formula for a logistic model (y ~ x) or a survival model (Surv(time, status) ~ x).
data | The data frame used for binning.
n.group | The number of binning groups.
min.bucket | The minimum proportion of the population in each bucket (a value between 0 and 1).
Displays a ggplot showing the regression coefficients and the binned groups.
data <- rpart::stagec
bin.knn(pgstat ~ age, data = data, n.group = 4, min.bucket = .1)

# Can be combined with the manipulate::manipulate function to change the
# binning interactively
library(manipulate)
manipulate(bin.knn(pgstat ~ age, data = data, n.group, min.bucket),
           n.group = slider(1, 10, step = 1, initial = 5, label = 'Number of groups'),
           min.bucket = slider(0.01, 1, step = 0.01, initial = 0.05,
                               label = 'Minimum Population Size (%)'))