model_gam applies Generalized Additive Models (GAMs) to each IND~pressure
combination created in ind_init and returns a tibble with
IND~pressure-specific GAM outputs.
init_tbl: The output tibble of the ind_init function.
k: Choice of knots (for the smoothing function s).
family: A description of the error distribution and link to be used in the GAM. This needs to be defined as a family function (see also the documentation of family in the stats package).
excl_outlier: A list of values identified as outliers in specific IND~pressure GAMs, which should be excluded in this modeling step. The output tibble of this function includes the variable 'pres_outlier', a list-column containing all indices of values with Cook's distance > 1 (see below). The function can be re-run, then excluding all outliers provided in pres_outlier.
To evaluate the IND's sensitivity and robustness, time series of the IND are
modeled as a smoothing function of one single pressure variable (using a subset
of the data as training dataset, e.g. excluding the last years of the annual time series).
The GAMs are built using the default settings in the gam function (and
the smooth term function s). However, the user can adjust
the distribution and link by modifying the family argument, as well as the
maximum level of non-linearity by setting the number of knots:
gam(ind ~ s(press, k = k), family = family, data = training_data)
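The call above can be tried outside the package with mgcv directly. The following is a minimal sketch on simulated data, not the internal code of model_gam; the data frame, the 5-year hold-out and the object names (dat, training_data, test_data, out) are assumptions made for illustration.

```r
library(mgcv)

# Simulated data (hypothetical): 30 years, one indicator and one pressure
set.seed(1)
dat <- data.frame(year = 1981:2010, press = runif(30, 0, 10))
dat$ind <- sin(dat$press / 2) + rnorm(30, sd = 0.2)

# Assumption: hold out the last 5 years as test data, fit on the rest
training_data <- dat[dat$year <= 2005, ]
test_data <- dat[dat$year > 2005, ]

k <- 5
fit <- gam(ind ~ s(press, k = k), family = gaussian(), data = training_data)

# Summary statistics of the kind reported per IND~pressure model
fit_sum <- summary(fit)
out <- data.frame(
  aic = AIC(fit),
  edf = fit_sum$edf,          # estimated degrees of freedom of s(press)
  p_val = fit_sum$s.pv,       # p-value of the smooth term
  r_sq = fit_sum$r.sq,        # adjusted R-squared
  expl_dev = fit_sum$dev.expl # proportion of null deviance explained
)
out
```

Raising k allows a more flexible smooth; the estimated edf stays below k because one degree of freedom is absorbed by the identifiability constraint on the smooth.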
In the presence of significant temporal auto-correlation, GAMs should be extended to
Generalized Additive Mixed Models (GAMMs) by including auto-regressive error structures
to correct for the auto-correlation (Pinheiro and Bates, 2000). This is implemented in
the function model_gamm.
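For orientation, a GAMM with an auto-regressive error structure can be fitted with mgcv::gamm and an nlme correlation structure. This is a hedged sketch on simulated data with an assumed AR(1) structure; it does not reproduce what model_gamm does internally, and the variable names are hypothetical.

```r
library(mgcv)
library(nlme)  # provides corAR1()

# Simulated data (hypothetical): linear pressure effect plus AR(1) noise
set.seed(2)
n <- 40
dat <- data.frame(time = seq_len(n), press = runif(n, 0, 10))
noise <- as.numeric(arima.sim(list(ar = 0.6), n = n, sd = 0.3))
dat$ind <- 0.3 * dat$press + noise

# Same smooth term as in the GAM, plus an AR(1) correlation over time
fit <- gamm(ind ~ s(press, k = 5), data = dat,
            correlation = corAR1(form = ~ time))

summary(fit$gam)  # smooth term statistics, as for a plain GAM
fit$lme           # mixed-model part, including the estimated AR(1) parameter
```

gamm returns a list with a gam component for the smooth terms and an lme component for the error structure, so the summary statistics are read from fit$gam.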
The returned tibble contains various model outputs needed for scoring the sensitivity and robustness subcriteria:
p_val to identify whether an IND responds to a specific pressure
r_sq for the strength of the IND response
edf for the non-linearity of the IND response
nrmse for the robustness of the established IND~pressure relationship
The robustness of the modeled pressure relationship based on the training data is evaluated by measuring how well the model prediction matches the test dataset, e.g. the last years. This is quantified by computing the absolute value of the normalized root mean square error (NRMSE) on the test dataset. The normalization to the mean of the observed test data allows for comparisons and a general scoring of the model robustness across INDs with different scales or units.
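The NRMSE computation can be sketched in a few lines of base R. The helper nrmse_sketch below is hypothetical and not the package's own function; because this page mentions normalization both by the mean and by the standard deviation of the observed test data, the sketch takes the normalization as an argument, and the example vectors are made up.

```r
# Hypothetical helper: absolute RMSE between predictions and observations,
# normalized by either the sd or the mean of the observations
nrmse_sketch <- function(pred, obs, method = c("sd", "mean")) {
  method <- match.arg(method)
  rmse <- sqrt(mean((pred - obs)^2, na.rm = TRUE))
  denom <- switch(method,
    sd = stats::sd(obs, na.rm = TRUE),
    mean = mean(obs, na.rm = TRUE)
  )
  abs(rmse / denom)
}

# Made-up test observations and model predictions
obs <- c(2.0, 2.5, 3.1, 2.8, 3.4)
pred <- c(2.2, 2.4, 2.9, 3.0, 3.3)

nrmse_sketch(pred, obs, method = "mean")
nrmse_sketch(pred, obs, method = "sd")
```

Either normalization makes the error unitless, which is what allows robustness to be compared across INDs measured on different scales.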
The function returns a tibble, which is a trimmed down version of
the data.frame(), including the following elements:
id: Numerical IDs for the IND~press combinations.
ind: Indicator names.
press: Pressure names.
model_type: Specification of the model type; at this stage containing only "gam" (Generalized Additive Model).
corrstruc: Specification of the correlation structure; at this stage containing only "none".
aic: AIC of the fitted models.
edf: Estimated degrees of freedom for the model terms.
p_val: The p-values for the smoothing term (the pressure).
signif_code: The significance codes for the p-values.
r_sq: The adjusted r-squared for the models. Defined as the proportion of variance explained, where original variance and residual variance are both estimated using unbiased estimators. This quantity can be negative if your model is worse than a one parameter constant model, and can be higher for the smaller of two nested models.
expl_dev: The proportion of the null deviance explained by the models.
nrmse: Absolute values of the root mean square error normalized by the standard deviation (NRMSE).
ks_test: The p-values from a Kolmogorov-Smirnov Test applied on the model residuals to test for normal distribution. P-values > 0.05 indicate normally distributed residuals.
tac: logical; indicates whether temporal autocorrelation (TAC) was detected in the residuals. TRUE if model residuals show TAC. NAs in the time series due to real missing values, test data extraction or exclusion of outliers are explicitly considered. The test is based on the following condition: if any of the acf and pacf values of lag 1 - 5 are greater than 0.4 or lower than -0.4, a TRUE is returned.
pres_outlier: A list-column with all indices of values identified as outliers in each model (i.e. Cook's distance > 1). The indices present the position in the training data, including NAs.
excl_outlier: A list-column listing all outliers per model that have been excluded in the GAM fitting.
model: A list-column of IND~press-specific gam objects that additionally contain the logical vector indicating missing values ($train_na).
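The three residual diagnostics described above (Kolmogorov-Smirnov test, TAC flag and Cook's distance) can be approximated with base R and mgcv. This is a simplified sketch on simulated data under the stated thresholds; unlike the package, it does not handle NAs in the time series, and it is not the package's own implementation.

```r
library(mgcv)

# Simulated data (hypothetical) and a fitted GAM
set.seed(3)
dat <- data.frame(press = runif(30, 0, 10))
dat$ind <- sin(dat$press / 2) + rnorm(30, sd = 0.2)
fit <- gam(ind ~ s(press, k = 5), data = dat)
res <- residuals(fit)

# Kolmogorov-Smirnov test of the residuals against a normal distribution
ks_p <- ks.test(res, "pnorm", mean = mean(res), sd = sd(res))$p.value

# TAC flag: TRUE if any acf or pacf value at lags 1-5 exceeds +/- 0.4
acf_val <- acf(res, lag.max = 5, plot = FALSE)$acf[-1]  # drop lag 0
pacf_val <- pacf(res, lag.max = 5, plot = FALSE)$acf    # starts at lag 1
tac <- any(abs(c(acf_val, pacf_val)) > 0.4)

# Outlier indices: observations with Cook's distance > 1
pres_outlier <- which(cooks.distance(fit) > 1)

list(ks_test = ks_p, tac = tac, pres_outlier = pres_outlier)
```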
Pinheiro, J.C., Bates, D.M. (2000) Mixed-Effects Models in S and S-Plus. Springer, New York, 548pp.
tibble and the vignette("tibble") for more
information on tibbles,
gam for more information on GAMs, and
plot_diagnostics for assessing the model diagnostics.
Other IND~pressure modeling functions:
find_id(),
ind_init(),
model_gamm(),
plot_diagnostics(),
plot_model(),
scoring(),
select_model(),
test_interaction()
# Using the Baltic Sea demo data in this package
dat_init <- ind_init(
ind_tbl = ind_ex[, c("Sprat", "Cod")],
press_tbl = press_ex[, c("Tsum", "Swin", "Fcod", "Fher")],
time = ind_ex[ ,1])
gam_tbl <- model_gam(dat_init)
# Any outlier?
gam_tbl$pres_outlier
# Exclude outliers by passing this list as input:
gam_tbl_out <- model_gam(dat_init, excl_outlier = gam_tbl$pres_outlier)
# Using another error distribution
ind_sub <- round(exp(ind_ex[ ,c(2,8,9)]),0) # to unlog data and convert to integers
ind_tbl2 <- ind_init(ind_sub, press_ex, time = ind_ex$Year)
model_gam(ind_tbl2, family = poisson(link="log"))