knitr::opts_chunk$set(echo = TRUE)


Brief description of the model calibration and selection process

st4 <- read.csv("calibration_results.csv")
sett <- as.character(st4[,1])
setts <- strsplit(sett, split = "_")
rm <- vector()
for (i in 1:length(setts)) {
rm[i] <- setts[[i]][2]
}
f.clas <- vector()
for (i in 1:length(setts)) {
f.clas[i] <- setts[[i]][4]
}
var.di <- vector()
for (i in 1:length(setts)) {
var.di[i] <- paste(setts[[i]][5:length(setts[[i]])], collapse = "_")
}
rm1 <- paste(unique(rm), collapse = ", ")
f.clas1 <- paste(unique(f.clas), collapse = ", ")
var.di1 <- paste(unique(var.di), collapse = ", ")
para <- rbind(rm1, f.clas1, var.di1)

This is the final report of the evaluation of candidate models during calibration implemented in kuenm.

In all, r length(st4[,1]) candidate models, with parameters reflecting all combinations of r length(unique(rm)) regularization multiplier settings, r length(unique(f.clas)) feature class combinations, and r length(unique(var.di)) distinct sets of environmental variables, have been evaluated. Model performance was evaluated based on statistical significance (Partial ROC), omission rates (OR), and the Akaike information criterion corrected for small sample sizes (AICc).

colnames(para) <- "Parameters"
row.names(para) <- c("Regularization multipliers", "Feature classes", "Sets of predictors")
knitr::kable(para, digits=c(0,0), row.names = TRUE, caption = "Table 1. Parameters used to produce candidate models.")


All the results presented below can be found in the folder where outputs from model calibration were written.



Model calibration statistics

In the following table, information about how many models met the three selection criteria is presented.

st <- read.csv("calibration_stats.csv")
colnames(st) <- c("Criteria",   "Number_of_models")
knitr::kable(st, digits=c(0,0), caption = "Table 2. General statistics of models that met distinct criteria.")



Models selected according to user-defined criteria

The following table contains the models selected according to the user's pre-defined criteria.

Note that if the selection criteria was "OR_AICc" (statistically significant models with omission rates below a predefined E, and among them those with lower AICc values), delta AICc values were recalculated only among models meeting the significance and omission rate criteria.

st1 <- read.csv("selected_models.csv")
colnames(st1) <- c("Model", "Mean_AUC_ratio",   "Partial_ROC", gsub("[.]", "%", colnames(st1)[4]), "AICc",  "Delta_AICc",   "W_AICc",   "N_parameters")
knitr::kable(st1, digits = c(0, 3, 3, 3, 3, 3, 3, 0), 
             caption = "Table 3. Performance statistics for models selected based on the user's pre-defined critera.")



Model performance plot

The figure below shows the position of the selected models in the distribution of all candidate models in terms of omission rates and AICc values.

Figure 1. Distribution of all models, non-statistically significant models, and selected models in terms of AICc and omission rate values.{width=60%}



Performance statistics for all models

Following the statistics of performance for all candidate models (a sample if more than 500 models) are presented. See file calibration_results.csv for an editable file with of results for all candidate models.

st4 <- read.csv("calibration_results.csv")
if (dim(st4)[1] > 500) {
   st4 <- st4[1:500, ]
}
colnames(st4) <-  c("Model",    "Mean_AUC_ratio",   "Partial_ROC", gsub("[.]", "%", colnames(st4)[4]), "AICc",  "Delta_AICc",   "W_AICc",   "N_parameters")
knitr::kable(st4, digits = c(0, 3, 3, 3, 3, 3, 3, 0), 
             caption = "Table 4. Performance statistics for candidate models.")


manubio13/ku.enm documentation built on Jan. 5, 2024, 5:55 a.m.