Useful for assessing the relationship between two variables of interest. In some cases, especially when
outliers or many observations are involved, a scatterplot can be difficult to read. This function bins one of the continuous
variables that would be used in a scatterplot and calculates the mean (or another function) of the other continuous variable
within each bin. It uses an equal-depth binning algorithm to compute these bins on the by variable (byv).
It can also be used to assess the average of a binary target variable or prediction over the levels of a continuous or categorical variable.
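The binning idea described above can be sketched in base R. This is only an illustration, not the package's implementation: the helper name avg_by_bin, its fun argument, and the choice to drop incomplete pairs are assumptions made for this example.

```r
# Hypothetical helper illustrating equal-depth binning; not the package's code.
avg_by_bin <- function(indv, byv, nbins = 5, fun = mean) {
  # Assumption: drop pairs where either value is missing.
  keep <- !is.na(indv) & !is.na(byv)
  indv <- indv[keep]
  byv <- byv[keep]

  # Equal-depth bins: the cut points are quantiles of byv, so each bin
  # holds roughly the same number of observations.
  breaks <- unique(quantile(byv, probs = seq(0, 1, length.out = nbins + 1)))
  bin <- cut(byv, breaks = breaks, include.lowest = TRUE)

  # One row per bin: label, observation count, and summary of indv.
  data.frame(
    bin = levels(bin),
    n = as.vector(table(bin)),
    avg = as.vector(tapply(indv, bin, fun))
  )
}

res <- avg_by_bin(mtcars$mpg, mtcars$drat, nbins = 4)
print(res)
```

Passing a 0/1 vector as the first argument, for example avg_by_bin(mtcars$am, mtcars$wt), gives the proportion of ones in each bin, which corresponds to the binary-target use mentioned above. Heavily tied values can produce duplicate quantiles, so the sketch may return fewer bins than requested.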
indv | vector. This is the variable whose mean will be calculated over the categories of the binned variable.
byv | vector. This is the variable to be binned.
nbins | numeric. Number of bins to create when discretizing the binned variable.
data | logical. TRUE returns the aggregated data.table. FALSE returns nothing. TRUE is default.
plotNbin | logical. TRUE plots the count of observations in each bin on top of each bar. TRUE is default.
... | additional barplot arguments.
Prints a barplot unless data == TRUE, in which case the aggregated data.table is returned.
## Average mpg within 8 equal-depth bins of drat
plotAvgBy(mtcars[, 'mpg'], mtcars[, 'drat'], nbins = 8)
## Suppress the per-bin observation counts
plotAvgBy(mtcars[, 'mpg'], mtcars[, 'drat'], nbins = 5, plotNbin = FALSE)
## Request the aggregated data.table
plotAvgBy(mtcars[, 'mpg'], mtcars[, 'drat'], nbins = 5, plotNbin = FALSE, data = TRUE)
## Example with missing data
df <- mtcars
df$mpg[sample(nrow(df), 5)] <- NA
df$drat[sample(nrow(df), 5)] <- NA
plotAvgBy(df[, 'mpg'], df[, 'drat'], nbins = 5, plotNbin = FALSE, data = TRUE)