eRic: Eric's R functions developed while a summer analytics intern at Enova


Plots statistics for y against each level or bin of x. The plotted statistic can be a proportion, log-odds, or weight-of-evidence (WoE) value. Bins can be created from raw factor levels, quantile breakpoints, uniform breakpoints, or recursive partitioning; additional arguments may be passed to rpart.control() to fine-tune the partitioning. Two plots are printed to the current graphics device: the chosen ymetric for each bin of x, and the total volume in each bin. In addition, two measures of the overall strength of the predictive relationship, Information Value (IV) and the chi-square statistic (ChiSq), are calculated and returned.
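To make the three statistics and the two strength measures concrete, here is a minimal base-R sketch on hypothetical toy data. It is not the package's implementation: the WoE sign convention (share of events over share of non-events) and the use of an uncorrected Pearson chi-square test are assumptions, and plotYXbin() may define them differently.

```r
# Hypothetical toy data: a binary response and a bin label for each row.
y   <- c(1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1)
bin <- factor(c("a", "a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "c"))

events    <- tapply(y, bin, sum)        # count of y == 1 in each bin
nonevents <- tapply(1 - y, bin, sum)    # count of y == 0 in each bin

proportion <- events / (events + nonevents)   # P(y = 1 | bin)
logodds    <- log(events / nonevents)         # log(p / (1 - p)) per bin

# Assumed WoE convention: log of the bin's share of all events over its
# share of all non-events. Some references use the opposite sign.
pct_event    <- events / sum(events)
pct_nonevent <- nonevents / sum(nonevents)
woe <- log(pct_event / pct_nonevent)

# Information Value: sum over bins of (difference in shares) * WoE.
iv <- sum((pct_event - pct_nonevent) * woe)

# Pearson chi-square statistic for independence of bin and response.
# Warnings about small expected counts are suppressed for the toy data.
chi2 <- suppressWarnings(chisq.test(table(bin, y), correct = FALSE))$statistic
```

A bin whose event rate matches the overall rate contributes a WoE of zero and adds nothing to IV, which is why IV can be read as a summary of how far the bins depart from the overall rate.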

plotYXbin(y, x, ymetric = "proportion", xsplit = "quantile", nbins = 10, nabin = TRUE, yticks = 6, ...)

`y`	(numeric) binary response vector
`x`	(numeric or factor) predictor vector
`ymetric`	(character) statistic to calculate for `y`: `c('proportion', 'logodds', 'woe')`
`xsplit`	(character) method used to bin `x`: `c('quantile', 'uniform', 'rpart')`
`nbins`	(numeric) number of bins to create from `x`
`nabin`	(logical) whether to include an additional bin for missing `x` values
`yticks`	(numeric) number of tick marks to display on the y-axis of plots
`...`	(args) additional arguments to pass to `rpart.control()`

If xsplit='rpart', bins are created by recursive partitioning for both numeric and factor variables, and the nbins argument is ignored. Pass additional control parameters (e.g. cp, minbucket) in the function call to control partitioning behavior. If the rpart control settings produce zero bins or more than 20 bins, the function throws an error. If x is a factor variable, the x-axis labels on the returned plots correspond to the index positions of the levels of x in each bin, not to the factor labels themselves. Using recursive partitioning with more than 50 factor levels is generally not a good idea. If x is a numeric variable, the x-axis labels are the range cutpoints of each bin created by recursive partitioning.
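The idea of tree-based binning can be sketched with rpart directly. This is an illustration under assumptions, not the function's internals: the simulated data, the method = "class" setting, and the bin-count check written here are all hypothetical, although rpart(), rpart.control(), and the where and splits components of a fitted tree are part of the rpart package.

```r
library(rpart)

# Hypothetical data: the event rate jumps from 10% to 90% at x = 0.5.
set.seed(42)
n   <- 1000
x   <- runif(n)
y   <- rbinom(n, size = 1, prob = ifelse(x > 0.5, 0.9, 0.1))
dat <- data.frame(y = factor(y), x = x)

# Fit a classification tree; cp and minbucket play the role of the
# extra arguments forwarded to rpart.control().
fit <- rpart(y ~ x, data = dat, method = "class",
             control = rpart.control(cp = 0.01, minbucket = 50))

# Each terminal node (leaf) acts as one bin; fit$where maps rows to leaves.
nbins_found <- length(unique(fit$where))

# Mimic the documented guard (the exact check in the package is assumed).
if (is.null(fit$splits) || nbins_found > 20) {
  stop("rpart produced zero or more than 20 bins; adjust cp/minbucket")
}

# With a single numeric predictor, the split thresholds sit in the
# "index" column of fit$splits and can serve as bin cutpoints.
cuts <- sort(unique(fit$splits[, "index"]))
bins <- cut(x, breaks = c(-Inf, cuts, Inf))
table(bins)
```

Lowering cp or minbucket generally yields more, smaller bins, so these two parameters are the main levers for staying under the 20-bin limit.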

If xsplit is 'uniform' or 'quantile' and x is a factor variable, its levels are used directly as bins and the nbins argument is ignored. If x is a numeric variable, bins are calculated by dividing the range of x into buckets of either equal width (uniform) or equal count (quantile). If quantile breakpoints are not unique, adjacent identical bins are combined.
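The difference between the two breakpoint schemes, and the effect of tied quantiles, can be seen in a short base-R sketch. The data and variable names are hypothetical, and collapsing duplicates with unique() is one plausible way to combine identical bins, not necessarily what the package does.

```r
# Hypothetical skewed predictor with many ties at zero.
set.seed(1)
x     <- c(rep(0, 40), rexp(60))
nbins <- 5

# Uniform: nbins equal-width buckets spanning the range of x.
u_breaks <- seq(min(x), max(x), length.out = nbins + 1)
u_bins   <- cut(x, breaks = u_breaks, include.lowest = TRUE)

# Quantile: roughly equal-count buckets. Heavy ties make some
# breakpoints identical, so duplicates are collapsed before cutting.
q_breaks <- quantile(x, probs = seq(0, 1, length.out = nbins + 1),
                     names = FALSE)
q_breaks <- unique(q_breaks)
q_bins   <- cut(x, breaks = q_breaks, include.lowest = TRUE)

table(u_bins)   # counts differ widely across equal-width bins
table(q_bins)   # fewer than nbins bins remain after merging ties
```

Because 40% of the values are tied at zero, the 0% and 20% quantiles coincide, so the quantile scheme ends up with one bin fewer than requested.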

If any bins have either zero volume or zero variance in y, log-odds and WoE cannot be calculated for them. Such bins are excluded from both the displayed plots and the calculation of Information Value for the variable. This problem can typically be solved by using quantile binning and/or reducing the number of bins.
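A small sketch shows why such bins break the calculation: with no events, no non-events, or no rows at all, the log-odds is infinite or undefined. The data are hypothetical, and filtering with is.finite() is an assumed stand-in for whatever exclusion rule the package applies.

```r
# Hypothetical bins: "a" is all events, "c" is all non-events,
# and "d" is a declared level with no observations.
y   <- c(1, 1, 1, 0, 1, 0, 0, 0, 0)
bin <- factor(c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
              levels = c("a", "b", "c", "d"))

tab       <- table(bin, y)    # rows: bins, columns: y = 0 and y = 1
events    <- tab[, "1"]
nonevents <- tab[, "0"]

# log(3/0) = Inf, log(0/3) = -Inf, log(0/0) = NaN
logodds <- log(events / nonevents)

# Keep only bins where the statistic is a finite number.
usable <- is.finite(logodds)
logodds[usable]
```

Only bin "b", which contains both outcomes, survives the filter; merging sparse bins or requesting fewer bins makes this situation less likely.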

A list containing the following elements:

iv - Information Value
chi2 - ChiSq Statistic
yPlot - ggplot object of ymetric vs. bins
vPlot - ggplot object of bin sizes or volume

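library(eRic)  # provides plotYXbin()

# binary response: is the diamond priced above the mean?
data(diamonds, package = 'ggplot2')
y  <- as.numeric(diamonds$price > mean(diamonds$price))
x1 <- diamonds$carat    # numeric predictor
x2 <- diamonds$clarity  # factor predictor
x3 <- diamonds$y        # numeric predictor

# defaults: proportion of y = 1 within 10 quantile bins of x1
res <- plotYXbin(y, x1)
# 8 quantile bins, no extra bin for missing x values
res <- plotYXbin(y, x1, nbins = 8, nabin = FALSE)
# factor predictor: levels are used as bins; plot weight-of-evidence
res <- plotYXbin(y, x2, ymetric = 'woe')
# recursive partitioning; cp and minbucket are passed to rpart.control()
res <- plotYXbin(y, x3, ymetric = 'proportion', xsplit = 'rpart', cp = 1e-4, minbucket = 100)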

etlundquist/eRic documentation built on May 16, 2019, 9:07 a.m.

plotYXbin: Response (y) Statistics Within Levels/Bins of a Predictor (x)