manip_bin_numerics: bin numerical columns

View source: R/manip.R

manip_bin_numerics    R Documentation

bin numerical columns

Description

Centers, scales and Yeo-Johnson transforms the numeric variables in a dataframe, then bins each of them into n bins of equal range. Outliers, as identified by boxplot statistics, are capped, meaning they are set to the minimum or maximum of the boxplot statistics.
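
The capping and binning steps described above can be approximated in base R. The following is a minimal sketch and not the package's implementation: the helper name cap_and_bin is hypothetical, the Yeo-Johnson step is left out, and the order of the steps is an assumption.

```r
# Hypothetical helper, not part of easyalluvial: caps outliers, standardises,
# then bins into equal-range intervals. The Yeo-Johnson step is omitted.
cap_and_bin <- function(v, bins = 5,
                        bin_labels = c("LL", "ML", "M", "MH", "HH")) {
  # boxplot.stats()$stats holds: lower whisker, lower hinge, median,
  # upper hinge, upper whisker
  whiskers <- boxplot.stats(v)$stats[c(1, 5)]

  # values beyond the whiskers are set to the whisker values
  capped <- pmin(pmax(v, whiskers[1]), whiskers[2])

  # center to mean 0 and scale to standard deviation 1
  standardised <- as.numeric(scale(capped, center = TRUE, scale = TRUE))

  # a single number passed as breaks makes cut() split the range
  # into that many intervals of equal width
  cut(standardised, breaks = bins, labels = bin_labels)
}

v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)
binned <- cap_and_bin(v)
table(binned)
```

In this sketch the value 100 lies beyond the upper whisker, so it is capped before binning and lands in the highest bin instead of stretching the bin range.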

Usage

manip_bin_numerics(
  x,
  bins = 5,
  bin_labels = c("LL", "ML", "M", "MH", "HH"),
  center = T,
  scale = T,
  transform = T,
  round_numeric = T,
  digits = 2,
  NA_label = "NA"
)

Arguments

x

dataframe with numeric variables, or numeric vector

bins

number of bins for numerical variables, passed to cut() as the breaks parameter, Default: 5

bin_labels

labels for the bins from low to high, Default: c("LL", "ML", "M", "MH", "HH"). Can also be one of c('mean', 'median', 'min_max', 'cuts'), in which case the corresponding summary function supplies the labels.
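
To illustrate what labels supplied by a summary function could look like, the sketch below labels each equal-range bin with the rounded mean of its members, using base R only. The exact label format that manip_bin_numerics produces is not shown in this documentation, so the output here is an assumption made for illustration.

```r
# Illustrative only: label each equal-range bin with the rounded mean
# of the values that fall into it.
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# labels = FALSE returns the integer bin index instead of a factor
bin_id <- cut(v, breaks = 5, labels = FALSE)

# one mean per occupied bin, rounded to two digits
bin_means <- round(tapply(v, bin_id, mean), digits = 2)

# look up each value's bin mean by bin name and keep the low-to-high order
labelled <- factor(bin_means[as.character(bin_id)],
                   levels = unname(bin_means))

levels(labelled)
```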

center

logical, center the numeric variables, Default: T

scale

logical, scale the numeric variables, Default: T

transform

logical, apply the Yeo-Johnson transformation, Default: T

round_numeric

logical, rounds numeric results if bin_labels is supplied with a supported summary function name, Default: T

digits

integer, number of digits to round to, Default: 2

NA_label

character, label used for missing data, Default: 'NA'
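
One way to give missing values their own label after binning is sketched below in base R. This is an assumption about the general technique, not the package's code, and the variable na_label is a hypothetical stand-in for the NA_label argument.

```r
# Illustrative only: turn missing values into an explicit, labelled factor level.
na_label <- "NA"  # hypothetical stand-in for the NA_label argument

v <- c(1, 5, NA, 10)
binned <- cut(v, breaks = 2, labels = c("low", "high"))

# addNA() adds NA as a factor level, so missing values become a category
with_na <- addNA(binned)

# rename the NA level to the chosen label
levels(with_na)[is.na(levels(with_na))] <- na_label

table(with_na)
```

With an explicit level, missing data shows up as its own category in summaries and plots instead of being dropped.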

Value

dataframe

Examples

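library(easyalluvial)  # provides manip_bin_numerics() and the mtcars2 dataset

# original data, default bin labels, then labels taken from the bin means
summary(mtcars2)
summary(manip_bin_numerics(mtcars2))
summary(manip_bin_numerics(mtcars2, bin_labels = 'mean'))
# cut intervals as labels, with centering, scaling and transformation disabled
summary(manip_bin_numerics(mtcars2, bin_labels = 'cuts',
                           scale = FALSE, center = FALSE, transform = FALSE))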

erblast/easyalluvial documentation built on Dec. 11, 2023, 7:28 p.m.