Introduction to rbin
In rbin: Tools for Binning Data

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(rbin)

Introduction

Binning is the process of transforming numerical or continuous data into categorical data. It is a common data pre-processing step in the model building process.

rbin has the following features:

manual binning using a shiny app
equal length binning
winsorized binning
quantile binning
combining levels of categorical data
creating dummy variables based on the binning method
calculating weight of evidence (WOE), entropy and information value (IV)
summary information about the binning pre-processing
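
Since weight of evidence and information value appear throughout the output of the binning functions, a small base R sketch of how they can be computed may help. It uses made-up toy data and assumes one common convention, WOE = ln(share of non-events / share of events); the exact convention and column names used inside rbin may differ.

```r
# Toy data (hypothetical): a binned predictor and a binary response
# where 1 marks an event and 0 a non-event.
bin      <- c("a", "a", "a", "a", "b", "b", "b", "b", "b", "b")
response <- c(1, 0, 0, 0, 1, 1, 1, 0, 0, 0)

# Count events and non-events within each bin.
events     <- tapply(response == 1, bin, sum)
non_events <- tapply(response == 0, bin, sum)

# Share of all events / all non-events that falls in each bin.
dist_event     <- events / sum(events)
dist_non_event <- non_events / sum(non_events)

# Assumed convention: WOE = ln(non-event share / event share).
woe <- log(dist_non_event / dist_event)

# Information value is the sum of the per-bin contributions.
iv <- sum((dist_non_event - dist_event) * woe)

print(woe)
print(iv)
```

A bin with a positive WOE under this convention holds relatively more non-events than events; the sign flips if the ratio is defined the other way round.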

Manual Binning

For manual binning, you need to specify the cut points for the bins. rbin uses left-closed, right-open intervals ([0,1) = {x | 0 ≤ x < 1}) when creating bins. The number of cut points is one less than the number of bins you want: to create 10 bins, you specify only 9 cut points, as shown in the example below. The accompanying RStudio addin, rbinAddin(), can be used to bin the data iteratively and to enforce a monotonic increasing or decreasing trend.

After finalizing the bins, you can use rbin_create() to create the dummy variables.
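
The left-closed, right-open convention and the "one fewer cut point than bins" rule can be illustrated with base R alone. The sketch below is not how rbin assigns bins internally; it only mirrors the interval rule described above, and the `ages` vector is made up for illustration.

```r
# Nine cut points define ten bins.
cut_points <- c(29, 31, 34, 36, 39, 42, 46, 51, 56)

# Hypothetical ages, including values that sit exactly on a cut point.
ages <- c(18, 29, 30, 56, 90)

# findInterval() counts how many cut points are <= each value, so a value
# equal to a cut point lands in the bin that starts at that cut point
# (left closed, right open).
bin_id <- findInterval(ages, cut_points) + 1
n_bins <- length(cut_points) + 1

print(data.frame(age = ages, bin = bin_id))
```

Here an age of 29 falls into the second bin, [29, 31), rather than the first, and everything at or above 56 falls into the tenth bin.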

Bins

bins <- rbin_manual(mbank, y, age, c(29, 31, 34, 36, 39, 42, 46, 51, 56))
bins

Plot

# plot
plot(bins)

Dummy Variables

bins <- rbin_manual(mbank, y, age, c(29, 31, 34, 36, 39, 42, 46, 51, 56))
rbin_create(mbank, age, bins)
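
To show what dummy variables derived from bins look like, here is a minimal base R sketch on made-up data. It is only an illustration of the idea; the columns produced by rbin_create() may be named differently or omit a reference bin.

```r
# Hypothetical cut points and ages.
cut_points <- c(30, 40, 50)
ages <- c(25, 30, 45, 62)

# Assign each age to a left-closed, right-open bin.
binned <- cut(ages, breaks = c(-Inf, cut_points, Inf), right = FALSE)

# One indicator column per bin; "- 1" drops the intercept so that
# every bin gets its own column.
dummies <- model.matrix(~ binned - 1)

print(dummies)
```

Each row contains a single 1 marking the bin that the observation belongs to.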

Factor Binning

You can collapse or combine levels of a factor/categorical variable using rbin_factor_combine() and then use rbin_factor() to look at weight of evidence, entropy and information value. After finalizing the bins, you can use rbin_factor_create() to create the dummy variables. You can use the RStudio addin, rbinFactorAddin() to interactively combine the levels and create dummy variables after finalizing the bins.

Combine Levels

# store the levels to combine in a vector
upper <- c("secondary", "tertiary")
out <- rbin_factor_combine(mbank, education, upper, "upper")
table(out$education)

# equivalently, pass the levels directly
out <- rbin_factor_combine(mbank, education, c("secondary", "tertiary"), "upper")
table(out$education)
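
For readers who want to see what combining levels amounts to, the following base R sketch merges two levels of a toy factor into one. The `education` vector here is made up and is not the mbank data; rbin_factor_combine() may handle details such as level ordering differently.

```r
# Hypothetical factor with four levels.
education <- factor(c("primary", "secondary", "tertiary", "secondary", "unknown"))
upper <- c("secondary", "tertiary")

# Renaming several levels to the same label merges them into one level.
combined <- education
levels(combined)[levels(combined) %in% upper] <- "upper"

print(table(combined))
```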

Bins

bins <- rbin_factor(mbank, y, education)
bins

Plot

# plot
plot(bins)

Create Bins

upper <- c("secondary", "tertiary")
out <- rbin_factor_combine(mbank, education, upper, "upper")
rbin_factor_create(out, education)

Quantile Binning

Quantile binning uses quantiles as cut points so that each bin holds roughly the same number of observations.

bins <- rbin_quantiles(mbank, y, age, 10)
bins

Plot

# plot
plot(bins)
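
The idea behind quantile binning can be sketched in base R: take the interior quantiles as cut points and check that the bins end up with similar counts. This uses simulated data and the default `quantile()` algorithm, which is an assumption; rbin_quantiles() may compute its cut points differently.

```r
set.seed(1)
x <- rnorm(1000, mean = 40, sd = 10)  # simulated predictor
n_bins <- 10

# Interior quantiles (10%, 20%, ..., 90%) serve as the cut points.
probs <- seq(0, 1, length.out = n_bins + 1)
cut_points <- quantile(x, probs = probs[-c(1, n_bins + 1)], names = FALSE)

# Assign bins with left-closed, right-open intervals and count members.
bin_id <- findInterval(x, cut_points) + 1
counts <- table(bin_id)

print(round(cut_points, 2))
print(counts)
```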

Equal Length Binning

Equal length binning creates bins of equal width. It differs from equal frequency binning, which creates bins that each contain roughly the same number of observations.

bins <- rbin_equal_length(mbank, y, age, 10)
bins

Plot

# plot
plot(bins)
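
The difference between equal width and equal frequency is easy to see on a small example. The sketch below uses made-up values and base R only; it illustrates the idea and is not the internal implementation of rbin_equal_length().

```r
# Hypothetical skewed values.
x <- c(18, 22, 25, 31, 40, 47, 53, 60, 78, 98)
n_bins <- 4

# Equally spaced breaks between the minimum and the maximum.
breaks <- seq(min(x), max(x), length.out = n_bins + 1)
widths <- diff(breaks)

# Left-closed bins; include.lowest = TRUE closes the last bin on the right
# so that the maximum is not dropped.
counts <- table(cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE))

print(widths)
print(counts)
```

Every bin spans the same range of values, yet the number of observations per bin varies.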

Winsorized Binning

Winsorized binning is similar to equal length binning except that both tails are cut off to obtain a smooth binning result. This technique is often used to reduce the influence of outliers during the data pre-processing stage. For Winsorized binning, the Winsorized statistics are computed first. After the Winsorized minimum and maximum have been found, the split points are calculated the same way as in equal length binning.

bins <- rbin_winsorize(mbank, y, age, 10, winsor_rate = 0.05)
bins

Plot

# plot
plot(bins)
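
The two steps described for Winsorized binning (cap the tails, then split the capped range into equal widths) can be sketched in base R. The data are simulated, and using `quantile()` for the Winsorized limits is an assumption made for illustration; rbin_winsorize() may derive its limits differently.

```r
set.seed(42)
# Simulated predictor with two extreme values appended.
x <- c(rnorm(500, mean = 40, sd = 10), 150, 200)
winsor_rate <- 0.05
n_bins <- 10

# Step 1: Winsorized minimum and maximum (assumed here to be the
# 5th and 95th percentiles), and the capped data.
limits <- quantile(x, probs = c(winsor_rate, 1 - winsor_rate), names = FALSE)
x_w <- pmin(pmax(x, limits[1]), limits[2])

# Step 2: equal-width breaks across the Winsorized range.
breaks <- seq(limits[1], limits[2], length.out = n_bins + 1)
cut_points <- breaks[-c(1, n_bins + 1)]

print(round(limits, 2))
print(round(cut_points, 2))
```

Because the extreme values no longer stretch the range, the bins stay narrow enough to separate the bulk of the data.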

rbin documentation built on July 8, 2020, 7:31 p.m.
