The ggdiag package is a attempt to unify the diagnostic analysis in R. For most models, diagnostic graphics are hard to customize or to obtain. In some cases, the expression for the residuals needed are not even implemented.

Our main goal is to create a simple grammar to plot diagnostic graphics that can be used for any kind of statistical model. The strategy is based in to main functions: the diag_data() function and the plot.model_class() functions.

The diag_data() function receive a fitted model and returns a tibble with several diagnostic measures, such as residuals, cook's distance, leverage etc, depending on the class of the model. It uses broom::augment() function to create most of the measures.

The plot.model_class() functions are S3 methods for the generic function plot(). They receive a object of class diag_data and return a specified diagnostic graphic. The graphics are made using the ggplot2 package, and one can combine them with the plotly::ggplolty() function to identify points.

So far, only the lm and glm classes are available.

Installing

Diagnostic for linear models

The examples below show how build diagnostic graphics is easy using the diag_data() and plot() functions. After fitting the linear model, the major job is to choose the graphic type. For lm class, the options currently available are: "residuals" (default), "leverage" and "cook".

library(ggilberto)

fit_lm <- lm(mpg ~ qsec, data = mtcars)
diag_lm <- diag_data(fit_lm)
plot(diag_lm)
plot(diag_lm, graphic = "leverage")
plot(diag_lm, graphic = "cook")

Diagnostic for generalized linear models

The main difference for the generalized linear models is the "link" option to the graphic argument. It produces a link function plot for assessing whether the link function is adequate.

 fit_glm <- glm(mpg ~ qsec, family = Gamma(link = "log"), data = mtcars)
 diag_glm <- diag_data(fit_glm)

 plot(diag_glm, graphic = "link")


curso-r/ggilberto documentation built on May 20, 2019, 2:39 p.m.