ensemble_model: Creates ensemble models from several algorithms
In Model-R/modleR: A Workflow for Ecological Niche Models



Description

This function reads the output of final_model for each species and multiple algorithms and builds a simple ensemble model by calculating the mean of the final models, in order to obtain one model per species. It also calculates the median, standard deviation and range (maximum - minimum).
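The per-cell statistics described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the three matrices stand in for final models on a shared grid, and the object names and values (`model_a`, `model_b`, `model_c`) are hypothetical.

```r
# Three toy "final models" on the same 2 x 2 grid (hypothetical values).
model_a <- matrix(c(0.2, 0.8, 0.4, 0.6), nrow = 2)
model_b <- matrix(c(0.4, 0.6, 0.4, 0.9), nrow = 2)
model_c <- matrix(c(0.6, 0.7, 0.1, 0.3), nrow = 2)

# Stack the models along a third dimension: rows x columns x algorithms.
models <- array(c(model_a, model_b, model_c), dim = c(2, 2, 3))

# Apply each statistic cell by cell (margins 1 and 2 are rows and columns).
ens_mean   <- apply(models, c(1, 2), mean)
ens_median <- apply(models, c(1, 2), median)
ens_sd     <- apply(models, c(1, 2), sd)
ens_range  <- apply(models, c(1, 2), function(x) max(x) - min(x))
```

Each result has the same dimensions as the input models, so every cell of `ens_range` holds the spread between the highest and lowest prediction for that cell.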

Usage

ensemble_model(species_name, occurrences, lon = "lon", lat = "lat",
  models_dir = "./models", final_dir = "final_models",
  ensemble_dir = "ensemble", proj_dir = "present", algorithms = NULL,
  which_ensemble = c("average"), which_final = c("raw_mean"),
  performance_metric = "TSSmax", dismo_threshold = "spec_sens",
  consensus_level = 0.5, png_ensemble = TRUE, write_occs = FALSE,
  write_map = FALSE, scale_models = TRUE, uncertainty = TRUE, ...)

Arguments

`species_name`	A character string with the species name. Because the species name will be used as a directory name, avoid non-ASCII characters, spaces and punctuation marks. The recommended format is "Genus_species". See the names in `example_occs` for an example
`occurrences`	A data frame with occurrence data. The data must include at least the columns with the latitude and longitude values of the species occurrences. See `example_occs` for an example
`lon`	The name of the longitude column. Defaults to "lon"
`lat`	The name of the latitude column. Defaults to "lat"
`models_dir`	Folder path to save the output files. Defaults to "`./models`"
`final_dir`	Character. Name of the folder to save the output files. A subfolder will be created. Defaults to "`final_models`"
`ensemble_dir`	Character string, name of the folder to save the output files. A subfolder will be created. Defaults to "`ensemble`"
`proj_dir`	Character. The name of the subfolder with the projection. Defaults to "present" but can be set according to the other projections (i.e. to execute the function in projected models)
`algorithms`	Character vector specifying which algorithms will be processed. Note that it can have length > 1, e.g. `c("bioclim", "rf")`. Defaults to NULL: if no name is given, it will process all algorithms present in the final_models folder
`which_ensemble`	Which method will be used to build the consensus between algorithms. Current options are:
	`best`: Selects models from the best-performing algorithm. A performance metric must be specified (`performance_metric`). Parameter `which_final` indicates which model will be returned
	`average`: Computes the mean between models. Parameter `which_final` indicates which model will be returned
	`weighted_average`: Computes a weighted mean between models. A performance metric must be specified. Parameter `which_final` indicates which model will be returned
	`median`: Computes the median between models. Parameter `which_final` indicates which model will be returned
	`frequency`: Computes the mean between binary models, which is analogous to calculating a relative consensus
	`consensus`: Computes a binary model with the final consensus area. A `consensus_level` must be specified
	`pca`: Computes a PCA between the models for each algorithm and extracts the first axis, which summarizes the variation between them
`which_final`	Which `final_model` will be used to calculate the average, weighted average or median ensembles? See `final_model`
`performance_metric`	Which performance metric will be used to define the `"best"` algorithm. Any of `c("AUC", "pROC", "TSSmax", "KAPPAmax", "CCR", "F_score", "Jaccard")`
`dismo_threshold`	Character string indicating threshold (cut-off) to transform raw_mean final models to binary for frequency and consensus methods. The options are from `threshold`: "`kappa`", "`spec_sens`", "`no_omission`", "`prevalence`", "`equal_sens_spec`", "`sensitivity`". Default value is "`spec_sens`"
`consensus_level`	Which proportion of binary models will be kept when creating `bin_consensus`
`png_ensemble`	Logical. If `TRUE` writes png files of the ensemble models
`write_occs`	Logical. If `TRUE` writes the occurrence points on the png file of the ensemble model
`write_map`	Logical. If `TRUE` adds a map contour to the png file of the ensemble models
`scale_models`	Logical. Whether input models should be scaled between 0 and 1
`uncertainty`	Logical. If `TRUE`, calculates the uncertainty between models as a range (maximum - minimum)
`...`	Other parameters from `writeRaster`
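To make the `which_ensemble` options and the related arguments more concrete, the following base R sketch works through them on toy matrices. It is an illustration under stated assumptions, not the package's code: the algorithm labels (`alg_a`, `alg_b`, `alg_c`), the performance values, the cut-off of 0.45, the helper `rescale01`, the use of `>=` at the consensus level, and the use of `prcomp` with its default centering are all hypothetical choices made for this example.

```r
# Toy continuous models on the same 2 x 2 grid (hypothetical values).
models <- list(
  alg_a = matrix(c(0.1, 0.9, 0.5, 0.7), nrow = 2),
  alg_b = matrix(c(0.3, 0.8, 0.2, 0.6), nrow = 2),
  alg_c = matrix(c(0.2, 0.4, 0.6, 0.9), nrow = 2)
)
# Hypothetical performance values, e.g. one TSSmax per algorithm.
performance <- c(alg_a = 0.5, alg_b = 0.75, alg_c = 0.25)
# Hypothetical cut-off used to turn each continuous model into a binary one.
cutoff <- 0.45

# scale_models: rescale a model so that its values span 0 to 1.
rescale01 <- function(m) (m - min(m)) / (max(m) - min(m))
scaled <- lapply(models, rescale01)

# "best": keep the model of the algorithm with the highest metric.
best <- models[[names(which.max(performance))]]

# "weighted_average": weight each model by its performance value.
weighted <- Reduce(`+`, Map(`*`, models, performance)) / sum(performance)

# "frequency": mean of the binary models, i.e. the proportion of
# algorithms that predict presence in each cell.
binary <- lapply(models, function(m) (m >= cutoff) * 1)
frequency <- Reduce(`+`, binary) / length(binary)

# "consensus": cells where at least consensus_level of the binary
# models agree on presence (the >= comparison is an assumption).
consensus_level <- 0.5
consensus <- (frequency >= consensus_level) * 1

# "pca": one column per algorithm, one row per cell; the scores on the
# first axis are reshaped back to the grid.
pca <- prcomp(sapply(models, as.vector))
pc1 <- matrix(pca$x[, 1], nrow = 2)
```

In this sketch a higher `consensus_level` shrinks the consensus area, because more algorithms must agree before a cell is kept.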

Value

Returns a RasterStack with all generated statistics written in the ensemble_dir subfolder

Writes raster files to disk with the mean, median, standard deviation and range of the assembled models

If png_ensemble = TRUE writes .png figures in the ensemble_dir subfolder

Examples

## Not run: 
# run setup_sdmdata
sp <- names(example_occs)[1]
sp_coord <- example_occs[[1]]
sp_setup <- setup_sdmdata(species_name = sp,
                          occurrences = sp_coord,
                          predictors = example_vars,
                          clean_uni = TRUE)

# run do_many
sp_many <- do_many(species_name = sp,
                   predictors = example_vars,
                   bioclim = TRUE)

# run final_model
sp_final <- final_model(species_name = sp,
                        algorithms = c("bioclim"),
                        select_partitions = TRUE,
                        select_par = "TSSmax",
                        select_par_val = 0,
                        which_models = c("raw_mean"),
                        consensus_level = 0.5,
                        overwrite = TRUE)

# run ensemble model
sp_ensemble <- ensemble_model(species_name = sp,
                              occurrences = sp_coord,
                              overwrite = TRUE)

## End(Not run)

Model-R/modleR documentation built on Aug. 24, 2023, 6:50 p.m.