binning_rgr: Binning by recursive information gain ratio maximization

binning_rgr R Documentation

Binning by recursive information gain ratio maximization

Description

binning_rgr() finds intervals for a numerical variable using recursive information gain ratio maximization.

Usage

binning_rgr(.data, y, x, min_perc_bins = 0.1, max_n_bins = 5, ordered = TRUE)

Arguments

.data

a data frame.

y

character. name of binary response variable. The variable must be a character or a factor.

x

character. name of continuous characteristic variable. It must have at least 5 distinct values, and Inf is not allowed.

min_perc_bins

numeric. minimum percentage of rows for each split or segment (controls the sample size), 0.1 (or 10 percent) as default.

max_n_bins

integer. maximum number of bins or segments to split the input variable, 5 bins as default.

ordered

logical. whether to build an ordered factor or not.
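The effect of an ordered versus an unordered factor can be illustrated with base R alone. The sketch below uses base cut() with made-up values and hypothetical break points; it is only an illustration of the concept and does not claim to show how binning_rgr() builds its result internally.

```r
# Made-up numeric values and hypothetical break points (not from dlookr)
x <- c(0.7, 1.0, 1.2, 1.9, 2.4, 3.5)
breaks <- c(-Inf, 1.1, 2.0, Inf)

# Ordered factor: the bins carry a rank, so comparisons are meaningful
binned <- cut(x, breaks = breaks, ordered_result = TRUE)
is.ordered(binned)        # TRUE
binned[1] < binned[6]     # TRUE: the first bin ranks below the last bin

# Unordered factor: same levels, but no ranking between them
plain <- cut(x, breaks = breaks, ordered_result = FALSE)
is.ordered(plain)         # FALSE
```

An ordered factor is usually the natural choice for bins of a continuous variable, since modeling functions can then treat the bins as ranked categories.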

Details

This function is useful when developing a model that predicts y.
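To make the idea of information gain ratio concrete, the following is a minimal sketch of how the ratio could be computed for a single binary split. The helper functions entropy() and gain_ratio() and the toy data are hypothetical and written only for illustration; they are not part of dlookr and do not reproduce the recursive search, the min_perc_bins constraint, or the max_n_bins limit.

```r
# Shannon entropy (in bits) of a categorical vector
entropy <- function(v) {
  p <- table(v) / length(v)
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Gain ratio of splitting numeric x at cut point `cp` with respect to labels y
# (hypothetical helper, not a dlookr function)
gain_ratio <- function(x, y, cp) {
  left <- x <= cp
  w <- c(mean(left), mean(!left))      # share of rows on each side
  if (any(w == 0)) return(0)           # one side is empty: no real split
  cond <- w[1] * entropy(y[left]) + w[2] * entropy(y[!left])
  info_gain <- entropy(y) - cond       # reduction in entropy of y
  split_info <- -sum(w * log2(w))      # entropy of the split itself
  info_gain / split_info
}

# Toy data, made up for illustration
x <- c(0.8, 0.9, 1.0, 1.1, 1.3, 1.8, 2.1, 2.5)
y <- c("No", "No", "No", "No", "Yes", "No", "Yes", "Yes")

# Candidate cut points: midpoints between consecutive distinct values
xs <- sort(unique(x))
cuts <- (head(xs, -1) + tail(xs, -1)) / 2

# Evaluate every candidate and keep the one with the largest gain ratio
ratios <- sapply(cuts, function(cp) gain_ratio(x, y, cp))
best_cut <- cuts[which.max(ratios)]
```

A recursive procedure would then repeat the same search inside each of the two resulting segments, stopping when a segment becomes too small or the maximum number of bins is reached.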

Value

an object of "infogain_bins" class. Attributes of the "infogain_bins" class are as follows.

  • class : "infogain_bins".

  • type : binning type, "infogain".

  • breaks : numeric. the number of intervals into which x is to be cut.

  • levels : character. levels of binned value.

  • raw : numeric. raw data, x argument value.

  • target : integer. binary response variable.

  • x_var : character. name of x variable.

  • y_var : character. name of y variable.

See Also

binning, binning_by, plot.infogain_bins.

Examples


library(dlookr)
library(dplyr)

# binning by recursive information gain ratio maximization, passing variable names as character strings
bin <- binning_rgr(heartfailure, "death_event", "creatinine")

# binning by recursive information gain ratio maximization, passing unquoted variable names
bin <- binning_rgr(heartfailure, death_event, creatinine)
bin

# summarize the infogain_bins object
summary(bin)

# visualize all information for the infogain_bins object
plot(bin)

# visualize WoE information for the infogain_bins object
plot(bin, type = "cross")

# visualize without typographic elements
plot(bin, type = "cross", typographic = FALSE)

# extract binned results
extract(bin) %>% 
  head(20)



dlookr documentation built on May 29, 2024, 2 a.m.