get_numeric_bins: get_numeric_bins
In cjodice10/eda: Exploratory Data Analysis - Variable Groupings and Transformations

View source: R/get_numeric_bins.R

get_numeric_bins

Description

Groups numeric variables into bins and reports each variable's information value against a dependent variable.

Usage

get_numeric_bins(
  run_id,
  df,
  dv,
  dv.type,
  dv.denominator = NULL,
  var.list,
  nbins = 20,
  min.Pct = 0.02,
  binning.Type = "Bucketing",
  monotonic = TRUE,
  tracking = TRUE,
  path_2_save = getwd()
)
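
A minimal sketch of how the signature above might be called on simulated data. The data frame, its column names, and the assumption that the package installs under the name "eda" are illustrative and not taken from the package documentation. var.list is passed as a character vector here, which is also an assumption. The call is guarded so the snippet still runs when the package is not installed.

```r
# Simulated data; all column names here are illustrative.
set.seed(42)
n <- 1000
sim_df <- data.frame(
  age    = round(runif(n, 18, 80)),
  income = round(rlnorm(n, meanlog = 10.5, sdlog = 0.6))
)
# Binary dependent variable whose event rate rises with age.
sim_df$target <- rbinom(n, size = 1, prob = plogis(-3 + 0.04 * sim_df$age))

# Assumption: the GitHub package cjodice10/eda installs under the name "eda".
if (requireNamespace("eda", quietly = TRUE)) {
  res <- eda::get_numeric_bins(
    run_id       = "MyRun1",
    df           = sim_df,
    dv           = "target",
    dv.type      = "Binary",
    var.list     = c("age", "income"),  # assumption: a character vector is accepted
    nbins        = 20,
    min.Pct      = 0.02,
    binning.Type = "Bucketing",
    monotonic    = TRUE,
    tracking     = TRUE,
    path_2_save  = tempdir()            # keep output files out of the working directory
  )
  str(res, max.level = 1)  # inspect the returned list of dataframes
}
```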

Arguments

`run_id`	An identifier that will be used when naming output tables to the specified path (path_2_save parameter). Example: 'MyRun1'
`df`	A dataframe you are wanting to analyze
`dv`	The name of the dependent variable (dv). Example: 'target'
`dv.type`	Can take one of two inputs: c('Binary','Frequency'). In both cases the dependent variable should be numeric. If 'Frequency' is the input and the dependent variable is a rate, dv should be the numerator; the denominator is specified as a separate parameter (dv.denominator)
`dv.denominator`	The denominator of your dependent variable. In many cases, this can be considered the exposure
`var.list`	A list of numeric variables to analyze and create bins for
`nbins`	Maximum number of bins to initially split the variable into. Default is 20
`min.Pct`	The minimum percent of records a final bin should have. The input should be in the interval (0,1). Generally applies only to bins that are not NA. Default is 0.02 (or 2 percent)
`binning.Type`	The type of binning to use when splitting the variable. One of two can be selected: c("Bucketing","Quantiles"). "Bucketing" uses the cut() function where breaks=nbins. "Quantiles" uses the cut() function where breaks=c(-Inf, unique(quantile(tmpDF[,i], probs=seq(0,1,by=1/nbins), include.lowest=TRUE, na.rm=TRUE)))
`monotonic`	Logical TRUE/FALSE input. If TRUE, it will force the bins to be monotonic based on the event rate. Default is TRUE
`tracking`	Logical TRUE/FALSE input. If set to TRUE, the user will be able to see what variable the function is analyzing. Default is TRUE
`path_2_save`	A path to a folder to save a log file. Default is the current working directory, getwd()
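
The two binning.Type options described above can be illustrated with base R alone. This is a sketch of the two cut() strategies on a made-up vector x, not the package's internal code: a single number passed as breaks yields equal-width intervals, while quantile-based breaks yield bins with roughly equal record counts.

```r
set.seed(1)
x <- c(rexp(500, rate = 0.1), NA)  # right-skewed values plus one missing value
nbins <- 5

# "Bucketing": a single number as breaks gives nbins equal-width intervals.
bucket_bins <- cut(x, breaks = nbins)

# "Quantiles": breaks come from the empirical quantiles, so bins hold
# roughly equal record counts. unique() drops repeated break points, and
# -Inf adds a leading interval that captures the minimum value.
q_breaks <- c(-Inf, unique(quantile(x, probs = seq(0, 1, by = 1 / nbins), na.rm = TRUE)))
quantile_bins <- cut(x, breaks = q_breaks)

# Compare how records are distributed under each strategy.
table(bucket_bins, useNA = "ifany")
table(quantile_bins, useNA = "ifany")
```

On skewed data like this, equal-width buckets tend to put most records in the first few intervals, which is where a minimum bin size such as min.Pct becomes relevant.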

Value

A list of dataframes. The first is 'Numeric_eda', an aggregated dataframe showing the groups created along with other key information. The second is 'numeric_iv', a dataframe listing each variable processed and its information value. The last is 'numeric_logics', a dataframe with the information needed to transform the variables in your dataframe. This table is the input to apply_numeric_logic(logic_df=numeric_logics)
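
For readers unfamiliar with the terms used above, the following sketch computes information value for a binary dependent variable using the commonly cited weight-of-evidence formula, and checks whether bin event rates are monotonic. The per-bin counts are made up, and the package may compute these quantities differently.

```r
# Hypothetical per-bin counts for one binned variable.
bins <- data.frame(
  bin        = c("(-Inf,30]", "(30,50]", "(50, Inf]"),
  events     = c(20, 50, 90),
  non_events = c(380, 350, 210)
)

# Share of all events and of all non-events that fall in each bin.
pct_event     <- bins$events / sum(bins$events)
pct_non_event <- bins$non_events / sum(bins$non_events)

# Weight of evidence per bin, then information value for the variable.
bins$woe <- log(pct_event / pct_non_event)
iv <- sum((pct_event - pct_non_event) * bins$woe)

# Event rate per bin; a monotonic grouping has rates that only rise or only fall.
bins$event_rate <- bins$events / (bins$events + bins$non_events)
is_monotonic <- all(diff(bins$event_rate) >= 0) || all(diff(bins$event_rate) <= 0)

print(bins)
print(iv)
print(is_monotonic)
```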

cjodice10/eda documentation built on Feb. 7, 2023, 3:26 p.m.
