plot_diagnostics: Diagnostic plots for a fitted GAM, GAMM or threshold-GAM(M)


View source: R/plot_diagnostics.R

Description

plot_diagnostics takes a list of models of class 'gam', 'gamm' or 'thresh_gam', or a mix of those, and produces diagnostic information on the fitting procedure and its results. The function returns a tibble with 6 list-columns containing individual plots (ggplot2 objects) and one list-column containing a plot that shows all diagnostic plots together.

Usage

plot_diagnostics(model_list)

Arguments

model_list

A list with models of class gam(m) and/or thresh_gam, e.g. the list-column model from the model_gam output tibble.

Details

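The function can deal with any model of the classes 'gam', 'gamm' or 'thresh_gam' as long as the input is a flat list, i.e. every element of the list is itself a model and not a list of models. A nested list, such as several entries of the thresh_models list-column (see Examples), has to be flattened first, e.g. with purrr::flatten().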
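What "flat" means can be illustrated with base R alone. The sketch below uses dummy objects that only carry a class attribute instead of real fitted models. The helper is_flat_model_list() is hypothetical and not part of INDperform, and unlist(recursive = FALSE) is used here as a base-R substitute for purrr::flatten().

```r
# Dummy stand-ins for fitted models; only the class attribute matters here
fake_model <- function(cls) structure(list(id = 1), class = cls)

# Two list entries, each holding one or more threshold models (nested)
nested <- list(
  list(fake_model("thresh_gam"), fake_model("thresh_gam")),
  list(fake_model("thresh_gam"))
)

# Hypothetical helper: TRUE if every element is itself a model object
is_flat_model_list <- function(x) {
  is.list(x) && all(vapply(x, inherits, logical(1),
    what = c("gam", "gamm", "thresh_gam")))
}

nested_ok <- is_flat_model_list(nested)

# Remove exactly one level of the hierarchy
flat <- unlist(nested, recursive = FALSE)
flat_ok <- is_flat_model_list(flat)
```

Only one level should be removed: flattening a list that is already flat would break the model objects apart into their components.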

Value

The function returns a tibble, which is a trimmed-down version of a data.frame, including the following elements:

ind

Indicator names.

press

Pressure names.

cooks_dist

A list-column of ggplot2 objects that show the Cook's distance of all observations, a leave-one-out deletion diagnostic that measures the influence of each observation. Data points with a large Cook's distance (> 1) are considered to merit closer examination in the analysis.
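As a rough illustration of the > 1 rule of thumb, the sketch below applies stats::cooks.distance() to a plain linear model used as a stand-in for a GAM. The simulated data and the injected extreme value are assumptions made only for this example, and the code is not taken from plot_diagnostics itself.

```r
set.seed(1)
x <- 1:20
y <- 2 + 0.5 * x + rnorm(20, sd = 0.5)

# Inject an extreme value at the end of the series (a high-leverage position)
y[20] <- 40

# Linear model as a simple stand-in for a fitted GAM
fit <- lm(y ~ x)

# Cook's distance per observation; flag those above the threshold of 1
cd <- cooks.distance(fit)
influential <- which(cd > 1)
influential
```

Observations flagged this way are candidates for closer inspection, not automatically for exclusion.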

acf_plot

A list-column of ggplot2 objects that show the autocorrelation function for the residuals. NAs in the time series due to real missing values, test data extraction or exclusion of outliers are explicitly considered.
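One way to keep NAs in place when estimating the autocorrelation is the na.action argument of stats::acf(). The sketch below shows that approach on a simulated residual series. Whether plot_diagnostics uses exactly this mechanism internally is an assumption, and the series and NA positions are made up for illustration.

```r
set.seed(42)

# Simulated autocorrelated "residuals" (AR(1) process), purely illustrative
res <- as.numeric(arima.sim(list(ar = 0.6), n = 60))

# Missing values, e.g. from excluded outliers or extracted test data
res[c(10, 25, 26)] <- NA

# na.pass keeps the NAs at their time positions, so lags stay aligned;
# the default na.action (na.fail) would stop with an error instead
acf_na <- acf(res, na.action = na.pass, plot = FALSE)
head(acf_na$acf[, 1, 1])
```

Dropping the NAs before calling acf() would instead shift later observations forward in time and distort the lag structure.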

pacf_plot

A list-column of ggplot2 objects that show the partial autocorrelation function for the residuals. NAs are explicitly considered.

resid_plot

A list-column of ggplot2 objects that show residuals vs. fitted values.

qq_plot

A list-column of ggplot2 objects that show the quantile-quantile plot for normality.

gcvv_plot

A list-column of ggplot2 objects that show, for a threshold-GAM, the development of the generalized cross-validation (GCV) value at different threshold levels of the modifying pressure variable. The GCV value of the final chosen threshold should be distinctly lower than that of all other potential thresholds, i.e. the line should show a sharp, narrow trough at this threshold. If this is not the case, e.g. the trough is very wide with similar GCV values for nearby thresholds, the threshold-GAM is not optimal and should not be favored over a GAM despite the better LOOCV (leave-one-out cross-validation) value.
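The difference between a sharp and a wide trough can be made concrete with a small numeric check. The sketch below is purely illustrative: the GCV values are invented, and both the helper n_near_min() and its 5% tolerance are hypothetical choices, not a criterion used by INDperform, where the assessment is made visually from the plot.

```r
# Invented GCV values at seven candidate thresholds
gcv_sharp <- c(0.52, 0.51, 0.50, 0.38, 0.49, 0.51, 0.52)  # one distinct minimum
gcv_wide  <- c(0.52, 0.42, 0.40, 0.39, 0.40, 0.41, 0.52)  # broad trough

# Hypothetical helper: number of thresholds whose GCV lies within a
# relative tolerance `tol` of the minimum GCV
n_near_min <- function(gcv, tol = 0.05) {
  sum(gcv <= min(gcv) * (1 + tol))
}

n_sharp <- n_near_min(gcv_sharp)
n_wide  <- n_near_min(gcv_wide)
```

A count of one corresponds to a single, clearly separated minimum; a larger count indicates several thresholds with nearly the same GCV value, i.e. the wide trough described above.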

all_plots

A list-column of ggplot2 objects that show all five (six if threshold-GAM) plots together. For this plot, drawing canvases from the cowplot package were added on top of ggplot2.

See Also

cooks.distance, acf, pacf, qqnorm, and flatten for removing a level hierarchy from a list

Other IND~pressure modeling functions: find_id(), ind_init(), model_gamm(), model_gam(), plot_model(), scoring(), select_model(), test_interaction()

Examples

## Not run: 
# Using some models of the Baltic Sea demo data:
# Apply function to a list of various model types
model_list <- c(all_results_ex$thresh_models[[5]],
  model_gam_ex$model[39], all_results_ex$model[76])
plots <- plot_diagnostics(model_list)
plots$cooks_dist[[1]]
plots$acf_plot[[2]]
plots$pacf_plot[[3]]
plots$resid_plot[[1]]
plots$qq_plot[[1]]
plots$gcvv_plot[[1]] # for threshold models
plots$all_plots[[1]] # shows all 5-6 plots

# Make sure that thresh_models do not have a nested list structure:
model_list <- all_results_ex$thresh_models[5:6] %>% purrr::flatten(.)
plots <- plot_diagnostics(model_list)

## End(Not run)

saskiaotto/INDperform documentation built on Oct. 27, 2021, 10:33 p.m.