BIOMOD_Tuning: Tune models parameters

View source: R/BIOMOD_Tuning.R

BIOMOD_Tuning R Documentation

Tune models parameters

Description

Function to tune the parameters of biomod2 single models

Usage

BIOMOD_Tuning(
  bm.format,
  bm.options = BIOMOD_ModelingOptions(),
  models = c("GLM", "GBM", "GAM", "CTA", "ANN", "SRE", "FDA", "MARS", "RF", "MAXENT"),
  metric.eval = "ROC",
  ctrl.train = NULL,
  ctrl.train.tuneLength = 30,
  ctrl.ANN = NULL,
  ctrl.CTA = NULL,
  ctrl.FDA = NULL,
  ctrl.GAM = NULL,
  ctrl.GBM = NULL,
  ctrl.GLM = NULL,
  ctrl.MARS = NULL,
  ctrl.RF = NULL,
  ANN.method = "avNNet",
  ANN.decay.tune = c(0.001, 0.01, 0.05, 0.1),
  ANN.size.tune = c(2, 4, 6, 8),
  ANN.maxit = 500,
  ANN.MaxNWts = 10 * (ncol(bm.format@data.env.var) + 1) + 10 + 1,
  MARS.method = "earth",
  GAM.method = "gam",
  GLM.method = "glmStepAIC",
  GLM.type = c("simple", "quadratic", "polynomial", "s_smoother"),
  GLM.interaction = c(0, 1),
  ME.cvmethod = "randomkfold",
  ME.overlap = FALSE,
  ME.kfolds = 10,
  ME.n.bg = 10000,
  ME.env = NULL,
  ME.metric = "ROC",
  ME.clamp = TRUE,
  ME.parallel = FALSE,
  ME.numCores = NULL,
  RF.method = "rf",
  weights = NULL
)

Arguments

bm.format

a BIOMOD.formated.data or BIOMOD.formated.data.PA object returned by the BIOMOD_FormatingData function

bm.options

a BIOMOD.models.options object returned by the BIOMOD_ModelingOptions function

models

a vector containing model names to be tuned,
must be among GLM, GBM, GAM, CTA, ANN, SRE, FDA, MARS, RF, MAXENT

metric.eval

a character corresponding to the evaluation metric used to select optimal models and tune parameters, must be either ROC or TSS (maximizing Sensitivity and Specificity)

ctrl.train

global control parameters that can be obtained from the trainControl function

ctrl.train.tuneLength

(see tuneLength parameter in train)

ctrl.ANN

control parameters for ANN

ctrl.CTA

control parameters for CTA

ctrl.FDA

control parameters for FDA

ctrl.GAM

control parameters for GAM

ctrl.GBM

control parameters for GBM

ctrl.GLM

control parameters for GLM

ctrl.MARS

control parameters for MARS

ctrl.RF

control parameters for RF

ANN.method

a character corresponding to the classification or regression model to use for ANN,
must be avNNet (see http://topepo.github.io/caret/train-models-by-tag.html#Neural_Network)

ANN.decay.tune

a vector of weight decay parameters for ANN

ANN.size.tune

a vector of size parameters (number of units in the hidden layer) for ANN

ANN.maxit

an integer corresponding to the maximum number of iterations for ANN

ANN.MaxNWts

an integer corresponding to the maximum allowable number of weights for ANN

MARS.method

a character corresponding to the classification or regression model to use for MARS,
must be earth (see http://topepo.github.io/caret/train-models-by-tag.html#Multivariate_Adaptive_Regression_Splines)

GAM.method

a character corresponding to the classification or regression model to use for GAM,
must be gam (see http://topepo.github.io/caret/train-models-by-tag.html#generalized-additive-model)

GLM.method

a character corresponding to the classification or regression model to use for GLM,
must be glmStepAIC (see http://topepo.github.io/caret/train-models-by-tag.html#Generalized_Linear_Model)

GLM.type

a vector of character corresponding to modeling types for GLM,
must be among simple, quadratic, polynomial, s_smoother

GLM.interaction

a vector of interaction types, must be among 0, 1

ME.cvmethod

a character corresponding to the method used to partition data for MAXENT,
must be randomkfold

ME.overlap

(optional, default FALSE)
A logical value defining whether to calculate pairwise metric of niche overlap or not (see calc.niche.overlap)

ME.kfolds

an integer corresponding to the number of bins for k-fold cross-validation for MAXENT

ME.n.bg

an integer corresponding to the number of background points used to run MAXENT

ME.env

a SpatRaster object containing model predictor variables

ME.metric

a character corresponding to the evaluation metric used to select optimal model and tune parameters for MAXENT, must be either auc.val.avg, auc.diff.avg, or.mtp.avg, or.10p.avg or AICc

ME.clamp

(optional, default TRUE)
A logical value defining whether features are constrained to remain within the range of values in the training data (Elith et al. 2011) or not

ME.parallel

(optional, default FALSE)
A logical value defining whether to enable parallel computing for MAXENT or not

ME.numCores

an integer corresponding to the number of cores to be used to train MAXENT

RF.method

a character corresponding to the classification or regression model to use for RF,
must be rf (see http://topepo.github.io/caret/train-models-by-tag.html#random-forest)

weights

a vector of numeric values corresponding to observation weights
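The default of ANN.MaxNWts in the Usage section can be checked by hand. The sketch below assumes a single-hidden-layer network with one output unit and no skip-layer connections (the usual nnet layout); the helper n_weights is hypothetical and not part of biomod2. Under that assumption, the default formula equals the number of weights of a network with 10 hidden units, so every value in the default ANN.size.tune fits below the limit.

```r
# Hypothetical helper: number of weights in a network with p inputs,
# `size` hidden units and one output unit (biases included, no skip layer)
n_weights <- function(p, size) size * (p + 1) + size + 1

# Assumption: 5 explanatory variables, as in the bioclim example data
p <- 5

# Default from the Usage section: 10 * (ncol(env) + 1) + 10 + 1
default.MaxNWts <- 10 * (p + 1) + 10 + 1

# Weights needed for each default ANN.size.tune value
sizes <- c(2, 4, 6, 8)
needed <- sapply(sizes, function(s) n_weights(p, s))

print(default.MaxNWts)
print(needed)
print(all(needed <= default.MaxNWts))
```

If larger hidden layers are added to ANN.size.tune, ANN.MaxNWts probably needs to be raised accordingly.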

Details

  • ctrl.train parameter is set by default to:
    caret::trainControl(method = 'cv', summaryFunction = caret::twoClassSummary,
    classProbs = TRUE, returnData = FALSE).

  • All control parameters for other models are set to ctrl.train if unspecified.

  • For more details on MAXENT tuning, please refer to ENMevaluate.

  • For more details on other models tuning, please refer to train.
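As an illustration of the first two points, the sketch below builds the default control object described above together with a lighter 5-fold alternative, and mimics the fallback to ctrl.train. It assumes the caret package is installed; resolve_ctrl, default.ctrl and light.ctrl are hypothetical names used only for this example, and the fallback is a guess at the behaviour described, not the package's actual code.

```r
# Default control as described in Details (caret uses 10 folds for method = "cv")
default.ctrl <- caret::trainControl(method = "cv",
                                    summaryFunction = caret::twoClassSummary,
                                    classProbs = TRUE,
                                    returnData = FALSE)

# Hypothetical lighter alternative: 5-fold cross-validation
light.ctrl <- caret::trainControl(method = "cv",
                                  number = 5,
                                  summaryFunction = caret::twoClassSummary,
                                  classProbs = TRUE,
                                  returnData = FALSE)

# Hypothetical helper mimicking the documented fallback:
# a model-specific control left to NULL takes the value of ctrl.train
resolve_ctrl <- function(ctrl.model, ctrl.train) {
  if (is.null(ctrl.model)) ctrl.train else ctrl.model
}

ctrl.RF  <- resolve_ctrl(NULL, default.ctrl)        # falls back to ctrl.train
ctrl.GBM <- resolve_ctrl(light.ctrl, default.ctrl)  # keeps its own control

print(c(RF = ctrl.RF$number, GBM = ctrl.GBM$number))
```

With a formatted data object at hand, such a control could be supplied as BIOMOD_Tuning(bm.format = myBiomodData, ctrl.train = default.ctrl, ctrl.GBM = light.ctrl).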

Value

A BIOMOD.models.options object (see BIOMOD_ModelingOptions) with optimized parameters

Author(s)

Frank Breiner

References

  • Kuhn, Max. 2008. Building predictive models in R using the caret package. Journal of Statistical Software 28, 1-26.

  • Kuhn, Max, and Kjell Johnson. 2013. Applied predictive modeling. New York: Springer.

  • Muscarella, Robert, et al. 2014. ENMeval: An R package for conducting spatially independent evaluations and estimating optimal model complexity for Maxent ecological niche models. Methods in Ecology and Evolution, 5, 1198-1205.

See Also

trainControl, train, calc.niche.overlap, ENMevaluate, BIOMOD_ModelingOptions, BIOMOD_Modeling

Other Main functions: BIOMOD_EnsembleForecasting(), BIOMOD_EnsembleModeling(), BIOMOD_FormatingData(), BIOMOD_LoadModels(), BIOMOD_ModelingOptions(), BIOMOD_Modeling(), BIOMOD_PresenceOnly(), BIOMOD_Projection(), BIOMOD_RangeSize()

Examples

library(biomod2)
library(terra)

# Load species occurrences (6 species available)
data(DataSpecies)
head(DataSpecies)

# Select the name of the studied species
myRespName <- 'GuloGulo'

# Get corresponding presence/absence data
myResp <- as.numeric(DataSpecies[, myRespName])

# Get corresponding XY coordinates
myRespXY <- DataSpecies[, c('X_WGS84', 'Y_WGS84')]

# Load environmental variables extracted from BIOCLIM (bio_3, bio_4, bio_7, bio_11 & bio_12)
data(bioclim_current)
myExpl <- terra::rast(bioclim_current)



# --------------------------------------------------------------- #
# Format Data with true absences
myBiomodData <- BIOMOD_FormatingData(resp.var = myResp,
                                     expl.var = myExpl,
                                     resp.xy = myRespXY,
                                     resp.name = myRespName)


# --------------------------------------------------------------- #
### Duration for tuning all models sequentially with default settings
### on a 3.4 GHz processor: approx. 45 min.
### Tuning all models in parallel (on 8 cores) using foreach loops
### runs much faster: approx. 14 min.

## Not run: 
# library(doParallel)
# cl <- makeCluster(8)
# doParallel::registerDoParallel(cl) 

time.seq <- system.time(
  bm.tuning <- BIOMOD_Tuning(bm.format = myBiomodData, ME.env = myExpl, ME.n.bg = ncell(myExpl))
)

# stopCluster(cl)

plot(bm.tuning$tune.CTA.rpart)
plot(bm.tuning$tune.CTA.rpart2)
plot(bm.tuning$tune.RF)
plot(bm.tuning$tune.ANN)
plot(bm.tuning$tune.MARS)
plot(bm.tuning$tune.FDA)
plot(bm.tuning$tune.GBM)
plot(bm.tuning$tune.GAM)

# Get tuned modeling options
myBiomodOptions <- bm.tuning$models.options

## End(Not run)
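The commented lines in the example above hint at registering a parallel backend before tuning. A minimal sketch of that setup and teardown is given below, assuming the parallel, foreach and doParallel packages are installed; the number of cores (2) is illustrative and n.workers is a name introduced for this example only.

```r
# Illustrative number of workers; the example above mentions 8 cores
n.cores <- 2

# Start the workers and register them as the foreach backend
cl <- parallel::makeCluster(n.cores)
doParallel::registerDoParallel(cl)

# Number of workers that foreach loops will now use
n.workers <- foreach::getDoParWorkers()
print(n.workers)

# ... a call to BIOMOD_Tuning() would go here ...

# Release the workers and fall back to sequential execution
parallel::stopCluster(cl)
foreach::registerDoSEQ()
```

Stopping the cluster and re-registering the sequential backend avoids leaving idle worker processes behind once tuning has finished.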



biomod2 documentation built on July 9, 2023, 6:05 p.m.