ellipsoid_calibration: Calibration of ellipsoid-based ecological niche models
In marlonecobos/ellipsenm: Ecological Niche's Characterizations Using Ellipsoids

ellipsoid_calibration

R Documentation

Calibration of ellipsoid-based ecological niche models

Description

ellipsoid_calibration helps to create and evaluate multiple candidate ellipsoid envelope models in order to find the parameter settings that produce the best results.

Usage

ellipsoid_calibration(data, species, longitude, latitude, variables,
                      format_in = NULL, methods, level = 95,
                      selection_criteria = "S_OR_P", error = 5,
                      iterations = 500, percentage = 50,
                      parallel = FALSE, overwrite = FALSE,
                      output_directory = "calibration_results")

Arguments

`data`	(character or list) if character, vector of names of csv files containing all, training, and testing occurrences, located in the working directory; if list, object resulting from `split_data`. Columns of tables must include: species, longitude, and latitude.
`species`	(character) name of the column with the name of the species.
`longitude`	(character) name of the column with longitude data.
`latitude`	(character) name of the column with latitude data.
`variables`	(character or list) if character, name of a folder containing subfolders of at least one set of variables; if list, object derived from `prepare_sets`. Sets of variables must contain at least two layers.
`format_in`	(character) if `variables` is character, format of the variables found in folders. Default = NULL.
`methods`	(character) methods to construct the ellipsoid ecological niche models to be tested. Available methods are: "covmat", "mve1", and "mve2". See details of `ellipsoid_fit`.
`level`	(numeric) the confidence level of a pairwise confidence region for the ellipsoid, expressed as a percentage. Default = 95.
`selection_criteria`	(character) set of criteria to select best models, options are: "S_OR" (statistical significance and low omission) and "S_OR_P" (statistical significance, low omission, and low prevalence). See details. Default = "S_OR_P".
`error`	(numeric) value from 0 to 100 to represent the percentage of potential error (E) that the data could have due to any source of uncertainty. Default = 5.
`iterations`	(numeric) number of bootstrap iterations to be performed. Default = 500.
`percentage`	(numeric) percentage of testing data to be used in each bootstrapped process for calculating the partial ROC. Default = 50.
`parallel`	(logical) whether or not to run analyses in parallel. If TRUE, analyses run in parallel only if the number of parameter settings to be tested is equal to or larger than the number of cores available. Default = FALSE.
`overwrite`	(logical) whether or not to overwrite existing results in `output_directory`. Default = FALSE.
`output_directory`	(character) name of the folder where results of model calibration and selection will be written. Default = "calibration_results".

Details

Statistical significance is assessed using the partial_ROC test, omission rates refer to the proportion of testing data known to be in suitable areas but predicted as unsuitable, and prevalence is the proportion of geographic and environmental space predicted as suitable.
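As an illustration of the omission rate and prevalence definitions above, the following is a minimal sketch in base R. It does not use ellipsenm functions; the data are simulated, the object names (`train`, `test`, `background`, `inside`) are hypothetical, and it assumes the ellipsoid limit is taken from a chi-square quantile at the chosen level. Prevalence is computed here only over a sample of environmental space.

```r
# Hypothetical data: two environmental variables for training and testing records
set.seed(1)
train <- cbind(bio_1 = rnorm(100, 20, 2), bio_12 = rnorm(100, 1000, 150))
test <- cbind(bio_1 = rnorm(30, 20, 2), bio_12 = rnorm(30, 1000, 150))

# Hypothetical sample of the environmental space available to the species
background <- cbind(bio_1 = runif(1000, 5, 35), bio_12 = runif(1000, 0, 2500))

# Ellipsoid defined by the centroid and covariance matrix of training data
centroid <- colMeans(train)
covariance <- cov(train)

# Assumption: ellipsoid limit defined by a chi-square quantile at the given level
level <- 95
cutoff <- qchisq(level / 100, df = ncol(train))

# TRUE if a point falls inside the ellipsoid (predicted as suitable)
inside <- function(x) mahalanobis(x, centroid, covariance) <= cutoff

# Proportion of testing records predicted as unsuitable
omission_rate <- mean(!inside(test))

# Proportion of the sampled environmental space predicted as suitable
prevalence <- mean(inside(background))
```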

The maximum expected omission rate is 1 - (level / 100). Thus, if level is set to 95, an adequate omission rate should not be higher than 0.05.
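The arithmetic behind this threshold can be checked with a short helper. The function name `max_omission` is hypothetical and is not part of ellipsenm.

```r
# Maximum expected omission rate for a given ellipsoid level (in percentage)
max_omission <- function(level) 1 - level / 100

max_omission(95) # 0.05
max_omission(99) # 0.01
```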

Good models are expected to have low omission rates and low prevalence. This implies that the model is predicting correctly in smaller areas.
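One plausible reading of the two selection criteria is sketched below with a made-up table of candidate results. The column names, the model names, the values, and the 0.05 significance threshold are assumptions for illustration only; the function may apply additional rules, for example when no candidate meets the omission rate criterion.

```r
# Hypothetical evaluation results for three candidate models
candidates <- data.frame(
  model = c("set_1_covmat", "set_2_covmat", "set_1_mve1"),
  p_value = c(0.001, 0.20, 0.003),      # partial ROC significance
  omission_rate = c(0.04, 0.02, 0.03),
  prevalence = c(0.30, 0.25, 0.18)
)

# Maximum expected omission rate, 1 - (level / 100) with level = 95
max_omission <- 0.05

# "S_OR": statistically significant models with low omission rates
s_or <- candidates[candidates$p_value <= 0.05 &
                     candidates$omission_rate <= max_omission, ]

# "S_OR_P": among those, the model with the lowest prevalence
s_or_p <- s_or[which.min(s_or$prevalence), ]
s_or_p$model
```

In this made-up table, the second candidate is discarded because it is not statistically significant, even though it has the lowest omission rate.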

Value

An object of class calibration_ellipsoid with all results and details derived from the calibration process. A folder named as defined in output_directory, containing all results and a detailed HTML report, will also be created.

Examples

# reading data
occurrences <- read.csv(system.file("extdata", "occurrences.csv",
                                    package = "ellipsenm"))
colnames(occurrences)

# raster layers of environmental data (these are masked to the accessible area)
# users must prepare their layers accordingly if using other data
vars <- raster::stack(list.files(system.file("extdata", package = "ellipsenm"),
                                 pattern = "bio", full.names = TRUE))

# preparing training and testing data
data_split <- split_data(occurrences, method = "random", longitude = "longitude",
                         latitude = "latitude", train_proportion = 0.75)

# sets of variables (example)
sets <- list(set_1 = c("bio_1", "bio_7", "bio_15"),
             set_2 = c("bio_1", "bio_12", "bio_15")) # change as needed

variable_sets <- prepare_sets(vars, sets)

# methods to create ellipsoids
methods <- c("covmat")

# model calibration process (Make sure to define your working directory first)
calib <- ellipsoid_calibration(data = data_split, species = "species",
                               longitude = "longitude", latitude = "latitude",
                               variables = variable_sets, methods = methods,
                               level = 99, selection_criteria = "S_OR_P",
                               error = 5, iterations = 500, percentage = 50,
                               output_directory = "calibration_results")

marlonecobos/ellipsenm documentation built on Oct. 18, 2023, 8:09 a.m.