bm_Tuning: Tune model parameters


bm_Tuning    R Documentation

Tune model parameters

Description

This internal biomod2 function tunes the parameters of a single model and selects the most efficient ones based on an evaluation metric.

Usage

bm_Tuning(
  model,
  tuning.fun,
  do.formula = FALSE,
  do.stepAIC = FALSE,
  bm.options,
  bm.format,
  calib.lines = NULL,
  metric.eval = "TSS",
  metric.AIC = "AIC",
  weights = NULL,
  ctrl.train = NULL,
  params.train = list(ANN.size = c(2, 4, 6, 8), ANN.decay = c(0.001, 0.01, 0.05, 0.1),
    ANN.bag = FALSE, FDA.degree = 1:2, FDA.nprune = 2:38, GAM.select = c(TRUE, FALSE),
    GAM.method = c("GCV.Cp", "GACV.Cp", "REML", "P-REML", "ML", "P-ML"), GAM.span =
    c(0.3, 0.5, 0.7), GAM.degree = 1, GBM.n.trees = c(500, 1000, 2500),
    GBM.interaction.depth = seq(2, 8, by = 3), GBM.shrinkage = c(0.001, 0.01, 0.1),
    GBM.n.minobsinnode = 10, MARS.degree = 1:2, MARS.nprune = 2:max(38, 2 *
    ncol(bm.format@data.env.var) + 1), MAXENT.algorithm = "maxnet", MAXENT.parallel =
    TRUE, RF.mtry = 1:min(10, ncol(bm.format@data.env.var)), SRE.quant = c(0, 0.0125,
    0.025, 0.05, 0.1), XGBOOST.nrounds = 50, XGBOOST.max_depth = 1, XGBOOST.eta = c(0.3,
    0.4), XGBOOST.gamma = 0, XGBOOST.colsample_bytree = c(0.6, 0.8),
    XGBOOST.min_child_weight = 1, XGBOOST.subsample = 0.5)
)

Arguments

model

a character corresponding to the algorithm to be tuned, must be one of ANN, CTA, FDA, GAM, GBM, GLM, MARS, MAXENT, MAXNET, RF, SRE or XGBOOST

tuning.fun

a character corresponding to the name of the model function to be called through the train function for tuning parameters (see ModelsTable dataset)

do.formula

(optional, default FALSE)
A logical value defining whether the formula is to be optimized or not

do.stepAIC

(optional, default FALSE)
A logical value defining whether variable selection is to be performed for GLM and GAM models or not

bm.options

a BIOMOD.options.default or BIOMOD.options.dataset object returned by the bm_ModelingOptions function

bm.format

a BIOMOD.formated.data or BIOMOD.formated.data.PA object returned by the BIOMOD_FormatingData function

calib.lines

(optional, default NULL)
A data.frame object returned by get_calib_lines or bm_CrossValidation functions

metric.eval

a character corresponding to the evaluation metric to be used, must be one of: AUC, Kappa or TSS (SRE only); auc.val.avg, auc.diff.avg, or.mtp.avg, or.10p.avg or AICc (MAXENT only); ROC or TSS (all other models)

metric.AIC

a character corresponding to the AIC metric to be used, must be either AIC or BIC

weights

(optional, default NULL)
A vector of numeric values corresponding to observation weights (one per observation, see Details)

ctrl.train

(optional, default NULL)
A trainControl object

params.train

a list containing values of model parameters to be tested (see Details)

Details

Concerning the ctrl.train parameter:

Set by default to:

ctrl.train <- caret::trainControl(method = "repeatedcv", repeats = 3, number = 10,
                                  summaryFunction = caret::twoClassSummary,
                                  classProbs = TRUE, returnData = FALSE)

Concerning the params.train parameter:

All elements of the list must have names matching the model.parameter_name format, parameter_name being one of the parameters of the tuning.fun function called by the caret package, which can be found through the getModelInfo function.
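The naming convention can be illustrated with a small base-R sketch. The helper split_params below is hypothetical (it is not part of biomod2) and only shows how names of the form model.parameter_name map to one model and its parameters; the internal handling in bm_Tuning may differ.

```r
# Hypothetical helper (not part of biomod2): keep the entries of a
# params.train-like list that belong to one model and strip the prefix.
split_params <- function(params, model) {
  prefix <- paste0(model, ".")
  # startsWith() compares literally, so the dot is not a regex wildcard
  keep <- startsWith(names(params), prefix)
  out <- params[keep]
  names(out) <- substring(names(out), nchar(prefix) + 1)
  out
}

# Illustrative values only, borrowed from the defaults shown in Usage
my.params <- list(RF.mtry = 1:3,
                  GBM.n.trees = c(500, 1000),
                  GBM.shrinkage = c(0.001, 0.01, 0.1))

# Parameters that would apply to GBM, named as the tuning function expects
gbm.params <- split_params(my.params, "GBM")
names(gbm.params)
```

Entries whose prefix does not match the requested model (here RF.mtry) are simply left out of the result.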

Currently, the parameters available for tuning are the following:

ANN

size, decay, bag

CTA

maxdepth

FDA

degree, nprune

GAM.gam

span, degree

GAM.mgcv

select, method

GBM

n.trees, interaction.depth, shrinkage, n.minobsinnode

MARS

degree, nprune

MAXENT

algorithm, parallel

RF

mtry

SRE

quant

XGBOOST

nrounds, max_depth, eta, gamma, colsample_bytree, min_child_weight, subsample

The expand.grid function is used to build a data frame containing all combinations of parameters to be tested.
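As a minimal sketch of what this combination step produces, the base-R snippet below expands a set of GBM candidate values (taken from the defaults in Usage, with the model prefix already removed) into a tuning grid. The object names are illustrative and the exact call made inside bm_Tuning may differ.

```r
# Candidate values for a GBM-like model (illustrative, from the Usage defaults)
gbm.params <- list(n.trees = c(500, 1000, 2500),
                   interaction.depth = seq(2, 8, by = 3),
                   shrinkage = c(0.001, 0.01, 0.1),
                   n.minobsinnode = 10)

# expand.grid() returns a data frame with one row per combination
tune.grid <- do.call(expand.grid, gbm.params)
dim(tune.grid)
head(tune.grid)
```

The number of rows is the product of the lengths of the candidate vectors, so the grid, and with it the tuning time, grows quickly as more values are added.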

Value

A BIOMOD.models.options object (see bm_ModelingOptions) with optimized parameters

Note

  • No tuning for GLM and MAXNET

  • MAXENT is tuned through the ENMevaluate function, which calls either:

    • maxnet (by defining MAXENT.algorithm = 'maxnet') (default)

    • Java version of Maxent defined in dismo package (by defining MAXENT.algorithm = 'maxent.jar')

  • SRE is tuned through the bm_SRE function

  • All other models are tuned through the train function

  • No optimization of the formula for MAXENT, MAXNET, SRE and XGBOOST

  • No interaction included in the formula for CTA

  • Variable selection only for GAM.gam and GLM

Author(s)

Frank Breiner, Maya Gueguen, Helene Blancheteau

See Also

trainControl, train, ENMevaluate, ModelsTable, BIOMOD.models.options, bm_ModelingOptions, BIOMOD_Modeling

Other Secondary functions: bm_BinaryTransformation(), bm_CrossValidation(), bm_FindOptimStat(), bm_MakeFormula(), bm_ModelingOptions(), bm_PlotEvalBoxplot(), bm_PlotEvalMean(), bm_PlotRangeSize(), bm_PlotResponseCurves(), bm_PlotVarImpBoxplot(), bm_PseudoAbsences(), bm_RunModelsLoop(), bm_SRE(), bm_SampleBinaryVector(), bm_SampleFactorLevels(), bm_VariablesImportance()

Examples

library(terra)

# Load species occurrences (6 species available)
data(DataSpecies)
head(DataSpecies)

# Select the name of the studied species
myRespName <- 'GuloGulo'

# Get corresponding presence/absence data
myResp <- as.numeric(DataSpecies[, myRespName])

# Get corresponding XY coordinates
myRespXY <- DataSpecies[, c('X_WGS84', 'Y_WGS84')]

# Load environmental variables extracted from BIOCLIM (bio_3, bio_4, bio_7, bio_11 & bio_12)
data(bioclim_current)
myExpl <- terra::rast(bioclim_current)



# --------------------------------------------------------------- #
# Format Data with true absences
myBiomodData <- BIOMOD_FormatingData(resp.var = myResp,
                                     expl.var = myExpl,
                                     resp.xy = myRespXY,
                                     resp.name = myRespName)


# --------------------------------------------------------------- #
# List of all models currently available in `biomod2` (and their related package and function)
# Some of them can be tuned through the `train` function of the `caret` package 
# (and corresponding training function to be used is indicated)
data(ModelsTable)
ModelsTable

allModels <- c('ANN', 'CTA', 'FDA', 'GAM', 'GBM', 'GLM'
               , 'MARS', 'MAXENT', 'MAXNET', 'RF', 'SRE', 'XGBOOST')

# default parameters
opt.d <- bm_ModelingOptions(data.type = 'binary',
                            models = allModels,
                            strategy = 'default')
                            
# tune parameters for Random Forest model
tuned.rf <- bm_Tuning(model = 'RF',
                      tuning.fun = 'rf', ## see in ModelsTable
                      do.formula = FALSE,
                      bm.options = opt.d@options$RF.binary.randomForest.randomForest,
                      bm.format = myBiomodData)
tuned.rf

## Not run: 
# tune parameters for GAM (from mgcv package) model
tuned.gam <- bm_Tuning(model = 'GAM',
                       tuning.fun = 'gam', ## see in ModelsTable
                       do.formula = TRUE,
                       do.stepAIC = TRUE,
                       bm.options = opt.d@options$GAM.binary.mgcv.gam,
                       bm.format = myBiomodData)
tuned.gam

## End(Not run)                  




biomod2 documentation built on June 22, 2024, 10:56 a.m.