fit_gbm: Fit and validate Generalized Boosted Regression models
In sjevelazco/flexsdm: Tools for Data Preparation, Fitting, Prediction, Evaluation, and Post-Processing of Species Distribution Models

Description

Fit and validate Generalized Boosted Regression models

Usage

fit_gbm(
  data,
  response,
  predictors,
  predictors_f = NULL,
  fit_formula = NULL,
  partition,
  thr = NULL,
  n_trees = 100,
  n_minobsinnode = as.integer(nrow(data) * 0.5/4),
  shrinkage = 0.1
)
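As a quick illustration of the n_minobsinnode default shown in the usage above, the sketch below evaluates the same expression for a few row counts. The row counts are hypothetical and chosen only for illustration; as.integer() truncates the result toward zero.

```r
# Hypothetical row counts, used only to illustrate the default expression.
n_rows <- c(100, 1001, 1400)

# Same arithmetic as the default: nrow(data) * 0.5 / 4, truncated to an integer.
default_minobs <- as.integer(n_rows * 0.5 / 4)
default_minobs
```

With small data sets the default can become very small, so it may be worth setting n_minobsinnode explicitly in that case.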

Arguments

`data`	data.frame. Data set with the response (0,1) and predictor values.
`response`	character. Column name with species absence-presence data (0,1).
`predictors`	character. Vector with the column names of quantitative predictor variables (i.e. continuous variables). Usage: predictors = c("aet", "cwd", "tmin").
`predictors_f`	character. Vector with the column names of qualitative predictor variables (i.e. ordinal or nominal variables). Usage: predictors_f = c("landform"). Default is NULL.
`fit_formula`	formula. A formula object with response and predictor variables (e.g. formula(pr_ab ~ aet + ppt_jja + pH + awc + depth + landform)). Note that the variables used here must be consistent with those used in response, predictors, and predictors_f arguments. Default is NULL.
`partition`	character. Column name with training and validation partition groups.
`thr`	character. Threshold used to get binary suitability values (i.e. 0,1), needed for threshold-dependent performance metrics. More than one threshold type can be used; the argument must be provided as a vector. The following threshold criteria are available:
lpt: The highest threshold at which there is no omission.
equal_sens_spec: Threshold at which the sensitivity and specificity are equal.
max_sens_spec: Threshold at which the sum of the sensitivity and specificity is the highest (aka the threshold that maximizes the TSS).
max_jaccard: The threshold at which the Jaccard index is the highest.
max_sorensen: The threshold at which the Sorensen index is the highest.
max_fpb: The threshold at which FPB (F-measure on presence-background data) is the highest.
sensitivity: Threshold based on a specified sensitivity value. Usage: thr = c('sensitivity', sens='0.6') or thr = c('sensitivity'). 'sens' refers to the sensitivity value. If a sensitivity value is not specified, the default is 0.9.
If more than one threshold type is used, they must be concatenated, e.g., thr=c('lpt', 'max_sens_spec', 'max_jaccard'), or thr=c('lpt', 'max_sens_spec', 'sensitivity', sens='0.8'), or thr=c('lpt', 'max_sens_spec', 'sensitivity'). The function uses all thresholds if no threshold is specified.
`n_trees`	Integer specifying the total number of trees to fit. This is equivalent to the number of iterations and the number of basis functions in the additive expansion. Default is 100.
`n_minobsinnode`	Integer specifying the minimum number of observations in the terminal nodes of the trees. Note that this is the actual number of observations, not the total weight. Default is as.integer(nrow(data) * 0.5/4).
`shrinkage`	Numeric. Shrinkage parameter applied to each tree in the expansion. Also known as the learning rate or step-size reduction; values from 0.001 to 0.1 usually work, but a smaller learning rate typically requires more trees. Default is 0.1.
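To make two of the threshold criteria described for thr more concrete, here is a minimal base R sketch that derives lpt and max_sens_spec from observed presence-absence values and predicted suitability. The toy vectors, the variable names, and the rule that suitability at or above the threshold counts as a presence are assumptions for illustration; this is not the code flexsdm runs internally.

```r
# Toy data (made up): observed presence (1) / absence (0) and predicted suitability.
obs  <- c(1, 1, 1, 1, 0, 0, 0, 0)
pred <- c(0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1)

# Candidate thresholds: every distinct predicted value.
cand <- sort(unique(pred))

# Assumption: suitability >= threshold is classified as presence.
# Sensitivity = share of presences classified as presence.
# Specificity = share of absences classified as absence.
sens <- sapply(cand, function(t) mean(pred[obs == 1] >= t))
spec <- sapply(cand, function(t) mean(pred[obs == 0] < t))

# max_sens_spec: threshold maximizing sensitivity + specificity (equivalently TSS).
tss <- sens + spec - 1
thr_max_sens_spec <- cand[which.max(tss)]

# lpt: highest threshold with no omission, i.e. the lowest suitability
# predicted at any presence record.
thr_lpt <- min(pred[obs == 1])

thr_max_sens_spec
thr_lpt
```

In this toy case both criteria pick the same value, which is a coincidence of the data; on real predictions they usually differ.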

Value

A list object with:

model: A "gbm" class object from the gbm package. This object can be used for predicting.
predictors: A tibble with the quantitative (c column names) and qualitative (f column names) variables used for modeling.
performance: Performance metrics (see sdm_eval). Threshold-dependent metrics are calculated based on the threshold specified in the thr argument.
data_ens: Predicted suitability for each test partition based on the best model. This database is used in fit_ensemble.

Examples

## Not run: 
library(flexsdm)
data("abies")

# Using k-fold partition method
abies2 <- part_random(
  data = abies,
  pr_ab = "pr_ab",
  method = c(method = "kfold", folds = 10)
)
abies2

gbm_t1 <- fit_gbm(
  data = abies2,
  response = "pr_ab",
  predictors = c("aet", "ppt_jja", "pH", "awc", "depth"),
  predictors_f = c("landform"),
  partition = ".part",
  thr = c("max_sens_spec", "equal_sens_spec", "max_sorensen")
)
gbm_t1$model
gbm_t1$predictors
gbm_t1$performance
gbm_t1$data_ens

# Using bootstrap partition method
abies2 <- part_random(
  data = abies,
  pr_ab = "pr_ab",
  method = c(method = "boot", replicates = 10, proportion = 0.7)
)
abies2

gbm_t2 <- fit_gbm(
  data = abies2,
  response = "pr_ab",
  predictors = c("ppt_jja", "pH", "awc"),
  predictors_f = c("landform"),
  partition = ".part",
  thr = "max_sens_spec"
)
gbm_t2

## End(Not run)

sjevelazco/flexsdm documentation built on June 1, 2025, 6:13 p.m.
