Statistics for y are plotted with respect to each level or bin of x. Plotted 
statistics can be proportions, log-odds, or weight-of-evidence values. Bins can be created 
using raw factor levels, quantile breakpoints, uniform breakpoints, or recursive partitioning. 
Additional arguments may be passed to rpart.control() to fine-tune recursive partitioning. 
Plots showing ymetric for each bin of x, as well as the total volume in each bin, 
are printed to the current graphics device. In addition, two measures of the overall 
strength of the predictive relationship (Information Value and ChiSq) are calculated and returned.
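For orientation, the sketch below computes the three plotted statistics and Information Value by hand for a toy binned predictor. It assumes the common definitions, WoE = log(share of events / share of non-events) and IV = sum over bins of (share difference * WoE). The package's exact formulas and sign convention are not shown in this documentation, and all variable names here are illustrative.

```r
# Toy data (hypothetical): binary response and an already-binned predictor
y   <- c(1, 0, 1, 1,  0, 0, 1, 0,  0, 1, 1, 0)
bin <- factor(rep(c("a", "b", "c"), each = 4))

# Events (y == 1) and non-events (y == 0) per bin
events    <- tapply(y, bin, sum)
nonevents <- tapply(1 - y, bin, sum)

# The three candidate metrics
proportion <- events / (events + nonevents)
logodds    <- log(events / nonevents)

# Weight of evidence: log ratio of each bin's share of events to its
# share of non-events (assumed convention; some sources flip the sign)
p_event    <- events / sum(events)
p_nonevent <- nonevents / sum(nonevents)
woe        <- log(p_event / p_nonevent)

# Information Value sums the WoE-weighted share differences over bins
iv <- sum((p_event - p_nonevent) * woe)
```

Under these assumptions, a bin with the same share of events and non-events (bin "c" here) contributes nothing to IV.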
| y | (numeric) binary response vector | 
| x | (numeric) numeric or factor predictor vector | 
| ymetric | (character) statistic to calculate for y | 
| xsplit | (character) method used to bin x | 
| nbins | (numeric) number of bins to create from x | 
| nabin | (logical) whether to include an additional bin for missing values of x | 
| yticks | (numeric) number of tick marks to display on the y-axis of plots | 
| ... | (args) additional arguments to pass to rpart.control() | 
If xsplit='rpart', bins are created by recursive partitioning for both 
numeric and factor variables, and the nbins argument is ignored. Pass additional 
control parameters (e.g. cp, minbucket) in the function call to control partitioning 
behavior. If the rpart control settings produce zero bins or more than 20 bins, 
the function throws an error. If x is a factor variable, the x-axis labels on 
the returned plots correspond to the index positions of the levels of x in each bin 
(not the factor labels themselves). Recursive partitioning is generally not a good idea 
for factors with more than 50 levels. If x is a numeric variable, the x-axis labels 
are the range cutpoints for each bin created via recursive partitioning.
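The idea of tree-based binning can be sketched with rpart directly. This is not the function's internal code: it assumes each terminal node of a classification tree is treated as one bin, and it reads "zero bins" as a tree with no splits. The simulated data and variable names are hypothetical.

```r
library(rpart)

# Simulated data (hypothetical): binary y whose event rate rises with x
set.seed(42)
x   <- runif(2000)
y   <- rbinom(2000, 1, plogis(4 * x - 2))
dat <- data.frame(x = x, y = factor(y))

# cp and minbucket are the kind of settings forwarded to rpart.control()
fit <- rpart(y ~ x, data = dat, method = "class",
             control = rpart.control(cp = 1e-4, minbucket = 100))

# fit$where gives the terminal node of each observation; use it as the bin id
leaf   <- fit$where
n_bins <- length(unique(leaf))

# Guard in the spirit of the documented error (assumed reading:
# a root-only tree counts as "zero bins")
if (n_bins < 2 || n_bins > 20) stop("unusable number of bins from rpart")

# Event rate and volume per bin
bin_rate   <- tapply(y, leaf, mean)
bin_volume <- table(leaf)
```

Raising cp or minbucket yields fewer, larger bins; lowering them yields more, smaller ones.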
If xsplit is 'uniform' or 'quantile' and x is a factor variable, its levels are used 
directly as bins and the nbins argument is ignored. If x is a numeric variable, 
bins are calculated by dividing the range of x into buckets of either equal width (uniform) 
or equal count (quantile). If quantile breakpoints are not unique, adjacent identical 
bins are combined. 
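The difference between the two numeric binning methods, and the effect of duplicate quantile breakpoints, can be illustrated with base R. This is a minimal sketch using cut() and quantile(), not the function's own implementation; the data and names are illustrative.

```r
# Skewed toy predictor with many ties at the low end
x     <- c(1, 1, 1, 1, 2, 3, 4, 5, 6, 10)
nbins <- 4

# Uniform: split the range of x into nbins equal-width intervals
u_breaks <- seq(min(x), max(x), length.out = nbins + 1)
u_bins   <- cut(x, breaks = u_breaks, include.lowest = TRUE)

# Quantile: aim for equal counts; ties can produce repeated breakpoints,
# so unique() merges the identical adjacent bins
q_breaks <- unique(quantile(x, probs = seq(0, 1, length.out = nbins + 1)))
q_bins   <- cut(x, breaks = q_breaks, include.lowest = TRUE)

table(u_bins)  # volumes concentrate in the first interval
table(q_bins)  # fewer than nbins bins because two breakpoints coincided
```

Here the 0% and 25% quantiles are both 1, so the quantile method ends up with three bins instead of the four requested.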
If any bins have either zero volume or zero variance, log-odds and weight-of-evidence cannot be calculated for them. Such bins are excluded from both the displayed plots and the calculation of information value for the variable. This problem can typically be solved by using quantile binning and/or reducing the number of bins.
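To see why such bins are a problem, note that log-odds is log(events / non-events), which is infinite or undefined whenever either count is zero. The sketch below flags those bins before computing the metric; the counts and names are made up for illustration, and the filter is an assumed equivalent of the exclusion described above.

```r
# Per-bin counts (hypothetical): b and d have zero variance, e has zero volume
events    <- c(a = 3, b = 0, c = 2, d = 5, e = 0)
nonevents <- c(a = 1, b = 4, c = 2, d = 0, e = 0)

# A bin is usable only if it contains both events and non-events
keep <- events > 0 & nonevents > 0

# Log-odds on the usable bins only; the others would give -Inf, Inf, or NaN
logodds <- log(events[keep] / nonevents[keep])
dropped <- names(events)[!keep]
```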
A list containing the following elements:
iv - Information Value
chi2 - ChiSq Statistic
yPlot - ggplot object of ymetric vs. bins
vPlot - ggplot object of bin sizes or volume
data(diamonds, package = 'ggplot2')
y  <- as.numeric(diamonds$price > mean(diamonds$price))  # 1 if price is above the mean
x1 <- diamonds$carat    # numeric predictor
x2 <- diamonds$clarity  # factor predictor
x3 <- diamonds$y        # numeric predictor
res <- plotYXbin(y, x1)
res <- plotYXbin(y, x1, nbins = 8, nabin = FALSE)
res <- plotYXbin(y, x2, ymetric = 'woe')
res <- plotYXbin(y, x3, ymetric = 'proportion', xsplit = 'rpart', cp = 1e-4, minbucket = 100)