runBRT: Run a boosted regression tree model using Sam's default...
In SEEG-Oxford/seegSDM: Streamlined Functions for Species Distribution Modelling in the SEEG Research Group

A wrapper to run a BRT model using gbm.step or gbm, with or without selecting the optimal number of trees using gbm.perf, with the parameter settings used in Bhatt et al. (2013). Covariate effect curves, relative influences and a prediction map on the probability scale are returned. A function to define regression weights can be specified through wt.

BRT models sometimes fail to converge and the gbm.step implementation fails silently, returning NULL. If method = 'step', runBRT instead attempts to run the procedure max_tries times and fails with an error if it still hasn't converged.
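
The retry behaviour described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's actual code: `fit_with_retries` and `flaky_fit` are hypothetical names, and `flaky_fit` merely stands in for a fitting routine that returns NULL on failure.

```r
# Hypothetical helper: call `fit_fun` until it returns a non-NULL result,
# giving up with an error after `max_tries` attempts.
fit_with_retries <- function(fit_fun, max_tries = 5) {
  for (i in seq_len(max_tries)) {
    m <- fit_fun()
    if (!is.null(m)) {
      return(list(model = m, tries = i))
    }
  }
  stop("model did not converge after ", max_tries, " attempts")
}

# Hypothetical stand-in for a fit that fails silently (returns NULL)
# on its first two calls and succeeds on the third.
calls <- 0
flaky_fit <- function() {
  calls <<- calls + 1
  if (calls < 3) NULL else "fitted model"
}

result <- fit_with_retries(flaky_fit, max_tries = 5)
result$tries  # number of attempts that were needed
```

Turning a silent NULL into an explicit error after a bounded number of attempts means downstream code never has to check whether a model was actually returned.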

To run a BRT model without optimising the number of trees, you can set method = 'gbm' with a reasonable number of trees in n.trees, which should be much faster.

At present, only method = 'step' returns a model from which full validation statistics can be extracted.

runBRT(data,
       gbm.x,
       gbm.y,
       pred.raster = NULL,
       gbm.coords = NULL,
       wt = NULL,
       max_tries = 5,
       verbose = FALSE,
       tree.complexity = 4,
       learning.rate = 0.005,
       bag.fraction = 0.75,
       n.trees = 10,
       n.folds = 10,
       max.trees = 10000,
       step.size = 10,
       method = c('step', 'perf', 'gbm'),
       family = 'bernoulli',
       gbm.offset = NULL,
       ...)

`data`	Input dataframe.
`gbm.x`	Index for columns containing covariate values.
`gbm.y`	Index for column containing presence/absence code (1s or 0s).
`pred.raster`	An optional `RasterBrick` or `RasterStack` object to predict the model to.
`gbm.coords`	Optional index for two columns (longitude then latitude) containing coordinates of records. This is required if you later want to calculate validation statistics using pair-wise distance sampling (setting `pwd = TRUE` in `getStats`). Set to `NULL` (the default) if not required.
`wt`	An optional vector of regression weights, an index for a column giving regression weights or a function to create the weights from the presence/absence column. The default (`wt = NULL`) applies full weight to each record. If a function is specified, it must take a vector of 1s and 0s as input and return a vector of the same length giving regression weights. To apply a 50:50 weighting of presence and absence records (mimicking a prevalence of 0.5) use: `wt = function(PA) ifelse(PA == 1, 1, sum(PA) / sum(1 - PA))`.
`max_tries`	How many times to try to get `gbm.step` to converge before throwing an error.
`verbose`	Passed to `gbm.step`, whether to report on progress.
`tree.complexity`	Passed to `gbm.step`, number of bifurcations in each individual tree.
`learning.rate`	Passed to `gbm.step`, how small to shrink the contribution of each tree in the final model.
`bag.fraction`	Passed to `gbm.step`, proportion of datapoints used in selecting variables.
`n.trees`	Passed to `gbm.step`, initial number of trees to fit. `gbm.step` optimises this parameter.
`n.folds`	Passed to `gbm.step`, number of folds in each round of cross validation.
`max.trees`	Passed to `gbm.step`, maximum number of trees to fit before stopping the stepping algorithm.
`step.size`	Passed to `gbm.step`, number of trees to add at each iteration.
`method`	Whether to run the model using the `gbm.step` procedure (`method = 'step'`, the default) to automatically detect the number of trees, the `gbm.perf` procedure using cross-validation post hoc (`method = 'perf'`, much faster) or a simple `gbm` model with the number of trees fixed at `n.trees` (`method = 'gbm'`, even faster, but potentially less accurate). Both `'step'` and `'perf'` will fit up to a maximum of `max.trees` trees.
`family`	The probability distribution for the likelihood, passed to either the `family` argument of `gbm.step` (if `method = 'step'`) or the `distribution` argument of `gbm` (if `method = 'perf'` or `method = 'gbm'`).
`gbm.offset`	If `family = 'poisson'`, `gbm.offset` can be used to specify a column of `data` giving an offset, passed as the `offset` argument to either `gbm` or `gbm.step` (depending on `method`).
`...`	Additional arguments to pass to `gbm.step`.
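
To see what the 50:50 weighting function given for `wt` does, it can be applied to a small presence/absence vector in base R. The vector `PA` below is made-up toy data and `wt_fun` is just a name chosen for this sketch; only the function body comes from the argument description.

```r
# Toy presence/absence codes (made up): 3 presences, 6 absences.
PA <- c(1, 1, 1, 0, 0, 0, 0, 0, 0)

# Weight function quoted in the description of `wt`.
wt_fun <- function(PA) ifelse(PA == 1, 1, sum(PA) / sum(1 - PA))

w <- wt_fun(PA)

# Each presence keeps weight 1; each absence gets
# (number of presences) / (number of absences).
w

# The total weight carried by each class is the same.
sum(w[PA == 1])
sum(w[PA == 0])
```

Because absences are down-weighted by the ratio of presences to absences, the two classes contribute equally to the fit regardless of how many background points were sampled.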

A list containing five elements:

`model`	the fitted gbm model
`effects`	a list of effect curves with one element for each covariate
`relinf`	a vector of relative influence estimates for each covariate
`pred`	a `RasterLayer` giving predictions on the probability scale (or `NULL` if `pred.raster = NULL`)
`coords`	a dataframe giving the coordinates of the training points (or `NULL` if `gbm.coords = NULL`)

gbm.step, getRelInf, getEffectPlots, combinePreds

# load the data
data(occurrence)

# load the covariate rasters
data(covariates)

# load evidence consensus layer
data(consensus)

# sample 100 background points from the consensus layer
background <- bgSample(consensus,
                       n = 100,
                       replace = FALSE,
                       spatial = FALSE)

colnames(background) <- c('Longitude', 'Latitude')
background <- data.frame(background)

# combine the occurrence and background records
dat <- rbind(cbind(PA = rep(1, nrow(occurrence)),
                   occurrence[, c('Longitude', 'Latitude')]),
             cbind(PA = rep(0, nrow(background)),
                   background[, c('Longitude', 'Latitude')]))

# extract covariate values for each data point
dat_covs <- extract(covariates, dat[, c('Longitude', 'Latitude')])

# combine covariates with the other info
dat_all <- cbind(dat, dat_covs)

model <- runBRT(dat_all,
                gbm.x = 4:6,
                gbm.y = 1,
                n.folds = 5)
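
The index arguments in the call above refer to column positions in the combined data frame. A minimal base R sketch of that mapping, using a made-up stand-in for `dat_all`: the name `toy`, the covariate names `cov_a`, `cov_b`, `cov_c` and all values are hypothetical, and three covariate layers are assumed because the example uses `gbm.x = 4:6`.

```r
# Toy stand-in for `dat_all`; covariate names and values are made up.
toy <- data.frame(PA = c(1, 0, 1),
                  Longitude = c(10.5, 11.2, 9.8),
                  Latitude = c(-1.3, 0.4, 2.1),
                  cov_a = c(0.2, 0.5, 0.9),
                  cov_b = c(12, 15, 11),
                  cov_c = c(300, 150, 220))

gbm.y <- 1         # presence/absence column
gbm.x <- 4:6       # covariate columns
gbm.coords <- 2:3  # longitude then latitude

names(toy)[gbm.y]
names(toy)[gbm.x]
names(toy)[gbm.coords]
```

Checking the names behind each index before fitting is a cheap way to catch an off-by-one column when the layout of the data frame changes.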

SEEG-Oxford/seegSDM documentation built on May 9, 2019, 11:08 a.m.
