binned_residuals: Binned residuals for binomial logistic regression

Description

Check model quality of binomial logistic regression models.

Usage

binned_residuals(model, term = NULL, n_bins = NULL, ...)

Arguments

model

A glm-object with binomial-family.

term

Name of an independent variable from model. If not NULL, average residuals for the categories of term are plotted; otherwise, average residuals for the estimated probabilities of the response are plotted.

n_bins

Numeric, the number of bins into which the data are divided. If n_bins = NULL, the square root of the number of observations is used.

...

Further arguments, such as size (for point size) or color (for point colors).

Details

Binned residual plots are achieved by “dividing the data into categories (bins) based on their fitted values, and then plotting the average residual versus the average fitted value for each bin.” (Gelman and Hill 2007, p. 97). If the model were true, one would expect about 95% of the residuals to fall inside the error bounds.
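
The binning described in the quotation above can be sketched in base R. This is a minimal illustration, not the package's implementation. It assumes that the default number of bins is the rounded square root of the number of observations, that bins hold roughly equal numbers of observations ordered by fitted value, and that the error bounds are plus or minus two standard errors estimated from the residuals in each bin. The names fitted_p, resid_y, bin_id, and binned are hypothetical.

```r
# Same model as in the Examples section
model <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial")

fitted_p <- fitted(model)          # estimated probabilities
resid_y  <- mtcars$vs - fitted_p   # raw (response) residuals

# Assumption: default bin count is the rounded square root of n
n_bins <- round(sqrt(length(fitted_p)))

# Assumption: roughly equal-sized bins, ordered by fitted value
bin_id <- cut(rank(fitted_p, ties.method = "first"),
              breaks = n_bins, labels = FALSE)

# Average fitted value and average residual per bin
binned <- do.call(rbind, lapply(split(seq_along(fitted_p), bin_id), function(i) {
  data.frame(
    xbar = mean(fitted_p[i]),                     # average fitted value
    ybar = mean(resid_y[i]),                      # average residual
    n    = length(i),                             # observations in the bin
    se   = 2 * sd(resid_y[i]) / sqrt(length(i))   # assumed +/- 2 SE bound
  )
}))

# Flag bins whose average residual lies inside the bounds
binned$inside <- abs(binned$ybar) <= binned$se

# Share of bins inside the bounds, to compare against the expected 95%
mean(binned$inside)
```

The exact bin edges and bounds may differ from what binned_residuals() computes, so the numbers from this sketch need not match the function's output.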

If term is not NULL, one can compare the residuals in relation to a specific model predictor. This may be helpful to check if a term would fit better when transformed, e.g. a rising and falling pattern of residuals along the x-axis is a signal to consider taking the logarithm of the predictor (cf. Gelman and Hill 2007, pp. 97-98).
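
Binning by a predictor rather than by fitted values can be sketched the same way in base R. This is an illustration under assumptions, not the package's code: the choice of wt as the predictor, the equal-sized bins, and the names x, bin_id, and by_term are all hypothetical.

```r
# Same model as in the Examples section
model <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial")

resid_y <- mtcars$vs - fitted(model)   # raw (response) residuals
x       <- mtcars$wt                   # predictor to inspect

# Assumption: rounded square root of n bins, roughly equal in size,
# ordered by the predictor instead of the fitted values
n_bins <- round(sqrt(length(x)))
bin_id <- cut(rank(x, ties.method = "first"),
              breaks = n_bins, labels = FALSE)

# Average predictor value and average residual per bin
by_term <- data.frame(
  xbar = as.numeric(tapply(x, bin_id, mean)),
  ybar = as.numeric(tapply(resid_y, bin_id, mean))
)

# A systematic rise and fall of ybar along xbar would suggest
# trying a transformation of the predictor
by_term
```

With the package itself, the corresponding call would presumably pass the predictor name through the term argument, e.g. binned_residuals(model, term = "wt").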

Value

A data frame representing the data that is mapped in the accompanying plot. In case all residuals are inside the error bounds, points are black. If some of the residuals are outside the error bounds (indicated by the grey-shaded area), blue points indicate residuals that are OK, while red points indicate model under- or over-fitting for the relevant range of estimated probabilities.

Note

Since binned_residuals() returns a data frame, the default action for the result is printing. However, the print()-method for binned_residuals() actually creates a plot. For further modifications of the plot, use print() and add ggplot-layers to the return values, e.g. print(binned_residuals(model)) + see::scale_color_pizza().

References

Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge; New York: Cambridge University Press.

Examples

if (require("see")) {
  # creating a model
  model <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial")

  # this will automatically plot the results
  (result <- binned_residuals(model))

  # if you assign results to an object, you can also look at the dataframe
  as.data.frame(result)
}

Example output

Loading required package: see
Warning: Probably bad model fit. Only about 50% of the residuals are inside the error bounds.
        xbar        ybar n       x.lo       x.hi         se group
1 0.03786483 -0.03786483 5 0.01744776 0.06917366 0.01937882    no
2 0.09514191 -0.09514191 5 0.07087498 0.15160143 0.02873921    no
3 0.25910531  0.07422802 6 0.17159955 0.35374001 0.43367801   yes
4 0.47954643 -0.07954643 5 0.38363314 0.54063600 0.50744089   yes
5 0.71108931  0.28891069 5 0.57299903 0.89141359 0.11199575    no
6 0.97119262 -0.13785929 6 0.91147360 0.99815623 0.30981245   yes

performance documentation built on Oct. 1, 2021, 5:08 p.m.