bin_cols: Bin Cols

View source: R/make_bins.R

Description

Make bins in a tidy fashion. Adds a column to your data frame containing the integer codes of the specified bins of a given column. Specifying multiple columns is intended only for supervised binning, so that multiple columns can be binned simultaneously and optimally with respect to a target variable.

Usage

bin_cols(
  .data,
  col,
  n_bins = 10,
  bin_type = "frequency",
  ...,
  target = NULL,
  pretty_labels = FALSE,
  seed = 1,
  method = "mdlp"
)

Arguments

.data

a data frame

col

a column, vector of columns, or tidyselect expression

n_bins

number of bins

bin_type

method to make bins

...

parameters passed to the selected binning method

target

unquoted column for supervised binning

pretty_labels

logical. If TRUE, returns the interval label rather than the integer rank

seed

seed for stochastic binning (xgboost)

method

method used for mdlp binning
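The difference that pretty_labels controls (interval label versus integer rank) can be illustrated with a small base R sketch. It uses only cut() from base R and is not the tidybins implementation; the vector x and the break points are made up for illustration, and the label format that bin_cols actually produces may differ.

```r
# Base R illustration only; bin_cols may format its labels differently.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# Bins expressed as interval labels (a factor with levels "(0,5]" and "(5,10]")
labels <- cut(x, breaks = c(0, 5, 10))

# The same bins expressed as integer codes (1 for the first bin, 2 for the second)
codes <- as.integer(labels)

print(data.frame(x, labels, codes))
```

Integer codes are convenient for modelling, while interval labels are easier to read in summaries and plots.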

Details

Description of the arguments for bin_type
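As a rough illustration of how two of the bin_type values used on this page ("frequency" and "width") could differ conceptually, the sketch below bins a skewed vector both ways in base R. This is an assumption about what those names mean in general, not the package's own code; the vector x and the choice of three bins are hypothetical.

```r
# Conceptual sketch in base R; not how tidybins computes its bins.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 100)

# Equal-width: split the range of x into 3 intervals of equal length.
# The outlier (100) leaves the middle interval empty.
width_bin <- as.integer(cut(x, breaks = 3))

# Equal-frequency: cut at quantiles so each bin holds about the same number of rows.
freq_breaks <- quantile(x, probs = seq(0, 1, length.out = 4))
freq_bin <- as.integer(cut(x, breaks = freq_breaks, include.lowest = TRUE))

print(table(width_bin))
print(table(freq_bin))
```

With skewed data, equal-width bins can end up nearly empty, which is one reason a frequency-based default is often preferred.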

Value

a data frame

Examples

library(tidybins)
library(dplyr)  # provides the %>% pipe

iris %>%
  bin_cols(Sepal.Width, n_bins = 5, pretty_labels = TRUE) %>%
  bin_cols(Petal.Width, n_bins = 3, bin_type = c("width", "kmeans")) %>%
  bin_cols(Sepal.Width, bin_type = "xgboost", target = Species, seed = 1) -> iris1

# Binned columns are named: original name + method abbreviation + number of bins created.
# The actual number of bins can be less than n_bins if the column lacks enough variance.
iris1 %>%
  print(width = Inf)

iris1 %>%
  bin_summary() %>%
  print(width = Inf)

tidybins documentation built on Oct. 14, 2021, 5:22 p.m.