esm_max: Fit and validate Maximum Entropy Models based on Ensemble of...
In sjevelazco/flexsdm: Tools for Data Preparation, Fitting, Prediction, Evaluation, and Post-Processing of Species Distribution Models

esm_max

R Documentation

Fit and validate Maximum Entropy Models based on the Ensemble of Small Models approach

Description

This function constructs Maxent models using the Ensemble of Small Models (ESM) approach (Breiner et al., 2015, 2018).

Usage

esm_max(
  data,
  response,
  predictors,
  partition,
  thr = NULL,
  background = NULL,
  clamp = TRUE,
  classes = "default",
  pred_type = "cloglog",
  regmult = 2.5
)

Arguments

`data`	data.frame. Database with the response (0,1) and predictors values.
`response`	character. Column name with species absence-presence data (0,1).
`predictors`	character. Vector with the column names of quantitative predictor variables (i.e. continuous variables). This function can only construct models with continuous variables and does not allow categorical variables. Usage predictors = c("aet", "cwd", "tmin").
`partition`	character. Column name with training and validation partition groups.
`thr`	character. Threshold used to get binary suitability values (i.e. 0,1), which are needed for threshold-dependent performance metrics. More than one threshold type can be used by providing a vector for this argument. The following threshold criteria are available:
  equal_sens_spec: Threshold at which sensitivity and specificity are equal.
  max_sens_spec: Threshold at which the sum of sensitivity and specificity is highest (i.e. the threshold that maximizes the TSS).
  max_jaccard: Threshold at which Jaccard is highest.
  max_sorensen: Threshold at which Sorensen is highest.
  max_fpb: Threshold at which FPB (F-measure on presence-background data) is highest.
  sensitivity: Threshold based on a specified sensitivity value. Usage thr = c('sensitivity', sens='0.6') or thr = c('sensitivity'). 'sens' refers to the sensitivity value. If no sensitivity value is specified, the default is 0.9.
To include more than one threshold type, concatenate the threshold types, e.g., thr=c('max_sens_spec', 'max_jaccard'), or thr=c('max_sens_spec', 'sensitivity', sens='0.8'), or thr=c('max_sens_spec', 'sensitivity'). The function will use all thresholds if no threshold is specified.
`background`	data.frame. Database with a response column containing only 0 and the predictor variables. All column names must be consistent with data. Default NULL.
`clamp`	logical. If TRUE, predictors and features are restricted to the range seen during model training. Default TRUE.
`classes`	character. A single feature or any combination of them. Features are symbolized by letters: l (linear), q (quadratic), h (hinge), p (product), and t (threshold). Usage classes = "lpq". Default "default" (see details).
`pred_type`	character. Type of response required. Available options are "link", "exponential", "cloglog" and "logistic". Default "cloglog".
`regmult`	numeric. A constant to adjust regularization. Because ESMs are used for modeling species with few records, the default value is 2.5.
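
The mixed named/unnamed form of the thr argument can look unusual, so here is a minimal base R sketch of what such a vector contains. It only illustrates how R builds the vector; how esm_max parses it internally is not shown here.

```r
# Two threshold types, with a sensitivity value supplied through 'sens'.
thr <- c("max_sens_spec", "sensitivity", sens = "0.8")

# c() coerces every element to character, which is why the 'sens' value is
# quoted in the usage above; only the 'sens' element carries a name.
names(thr)

# The sensitivity value can be recovered as a number from the named element.
as.numeric(thr["sens"])
```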

Details

This method consists of fitting bivariate models for all pairwise combinations of predictors and building an ensemble from the average of their suitability values, weighted by Somers' D (D = 2 x (AUC - 0.5)). ESM is recommended for modeling species with few occurrences. This function does not allow categorical variables because they can be problematic when few occurrences are available. For further details see Breiner et al. (2015, 2018). This function uses a default regularization multiplier of 2.5 (see Breiner et al., 2018).
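
The pairwise enumeration and the weighting step can be sketched in base R. This is an illustration under stated assumptions, not the internal code of esm_max: the AUC values and the suitability matrix are made up, and the helper name somers_d is hypothetical.

```r
# Three predictors give choose(3, 2) = 3 bivariate models.
predictors <- c("aet", "cwd", "tmin")
pairs <- combn(predictors, 2, simplify = FALSE)
length(pairs)

# Hypothetical validation AUC of each bivariate model (made-up values).
auc <- c(0.80, 0.70, 0.60)

# Somers' D rescales AUC so that a random model (AUC = 0.5) scores 0.
somers_d <- function(auc) 2 * (auc - 0.5)
d <- somers_d(auc)

# Hypothetical suitability predicted by each model (columns) at two sites (rows).
suit <- matrix(
  c(0.9, 0.5, 0.1,
    0.2, 0.4, 0.6),
  nrow = 2, byrow = TRUE
)

# Ensemble suitability: average across models, weighted by Somers' D.
ensemble <- as.vector(suit %*% d) / sum(d)
ensemble
```

By the same count, the eight predictors used in the Examples section would give choose(8, 2) = 28 bivariate models.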

When the argument "classes" is set to "default", MaxEnt will use different feature combinations depending on the number of presences (np), according to the following rule: if np < 10 classes = "l", if np is between 10 and 15 classes = "lq", if np is between 15 and 80 classes = "lqh", and if np >= 80 classes = "lqph".
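
The rule can be written as a small lookup function. The helper name default_classes is hypothetical, and the text does not say which class applies at exactly np = 15; this sketch assumes "lqh".

```r
# Hypothetical helper mirroring the rule stated above. The behavior at exactly
# np = 15 is an assumption of this sketch (treated here as "lqh").
default_classes <- function(np) {
  if (np < 10) {
    "l"
  } else if (np < 15) {
    "lq"
  } else if (np < 80) {
    "lqh"
  } else {
    "lqph"
  }
}

# Feature classes selected for a few presence counts.
sapply(c(5, 12, 40, 100), default_classes)
```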

When presence-absence (or presence-pseudo-absence) data are supplied in the data argument together with background points, the function will fit models with presences and background points and validate them with presences and absences. This procedure makes maxent comparable to other presence-absence models (e.g., random forest, support vector machine). If only presence and background data are used, the function will fit and validate the model with presences and background points. If only presence-absence data are supplied in the data argument, without background points, the function will fit the model with the specified data (not recommended).

Value

A list object with:

esm_model: A list with a "maxnet" class object from the maxnet package for each bivariate model. This object can be used for predicting ensembles of small models with the sdm_predict function.
predictors: A tibble with the variables used for modeling.
performance: Performance metrics (see sdm_eval). Threshold-dependent metrics are calculated based on the threshold specified in the thr argument.

References

Breiner, F. T., Guisan, A., Bergamini, A., & Nobis, M. P. (2015). Overcoming limitations of modelling rare species by using ensembles of small models. Methods in Ecology and Evolution, 6(10), 1210-1218. https://doi.org/10.1111/2041-210X.12403
Breiner, F. T., Nobis, M. P., Bergamini, A., & Guisan, A. (2018). Optimizing ensembles of small models for predicting the distribution of species with few occurrences. Methods in Ecology and Evolution, 9(4), 802-808. https://doi.org/10.1111/2041-210X.12957

Examples

## Not run: 
library(flexsdm) # provides the abies and backg datasets, part_random and esm_max
library(dplyr)

data("abies")
data("backg")

# Using k-fold partition method
set.seed(10)
abies2 <- abies %>%
  na.omit() %>%
  group_by(pr_ab) %>%
  dplyr::slice_sample(n = 10) %>%
  ungroup()

abies2 <- part_random(
  data = abies2,
  pr_ab = "pr_ab",
  method = c(method = "rep_kfold", folds = 5, replicates = 5)
)
abies2

set.seed(10)
backg2 <- backg %>%
  na.omit() %>%
  group_by(pr_ab) %>%
  dplyr::slice_sample(n = 100) %>%
  ungroup()

backg2 <- part_random(
  data = backg2,
  pr_ab = "pr_ab",
  method = c(method = "rep_kfold", folds = 5, replicates = 5)
)
backg2

# Without threshold specification and with kfold
esm_max_t1 <- esm_max(
  data = abies2,
  response = "pr_ab",
  predictors = c("aet", "cwd", "tmin", "ppt_djf", "ppt_jja", "pH", "awc", "depth"),
  partition = ".part",
  thr = NULL,
  background = backg2,
  clamp = TRUE,
  classes = "default",
  pred_type = "cloglog",
  regmult = 1
)

esm_max_t1$esm_model # bivariate model
esm_max_t1$predictors
esm_max_t1$performance

## End(Not run)

sjevelazco/flexsdm documentation built on June 1, 2025, 6:13 p.m.
