trainBrt | R Documentation |
This function is a wrapper for gbm.step. It returns the model with the best combination of learning rate, tree depth, and bag fraction based on cross-validated deviance. It can also return a table with the deviance of each combination of tuning parameters that was tested, and all of the models tested. See Elith, J., J.R. Leathwick, and T. Hastie. 2008. A working guide to boosted regression trees. Journal of Animal Ecology 77:802-813.
trainBrt(
  data,
  resp = names(data)[1],
  preds = names(data)[2:ncol(data)],
  family = "bernoulli",
  learningRate = c(1e-04, 0.001, 0.01),
  treeComplexity = c(5, 3, 1),
  bagFraction = 0.6,
  minTrees = 1000,
  maxTrees = 8000,
  tries = 5,
  tryBy = c("learningRate", "treeComplexity", "maxTrees", "stepSize"),
  w = TRUE,
  anyway = FALSE,
  out = "model",
  cores = 1,
  verbose = FALSE,
  ...
)
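As a rough illustration of what "combination of tuning parameters" means for the defaults in the usage above, the sketch below builds a grid in base R. It assumes every combination of learning rate, tree complexity, and bag fraction is evaluated; that is an assumption for illustration, not a description of the function's internals, and the object name grid is hypothetical.

```r
# Tuning values copied from the defaults shown in the usage
learningRate <- c(1e-04, 0.001, 0.01)
treeComplexity <- c(5, 3, 1)
bagFraction <- 0.6

# Assumption: one candidate model per combination of the three parameters.
# expand.grid() returns a data frame with one row per combination.
grid <- expand.grid(
  learningRate = learningRate,
  treeComplexity = treeComplexity,
  bagFraction = bagFraction
)

# Number of candidate parameter sets under this assumption
nrow(grid)
```

Under this assumption, adding values to any of the three vectors multiplies the number of candidate models, which is worth keeping in mind when choosing cores and maxTrees.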
data |
data frame with first column being response |
resp |
Character or integer. Name or column index of response variable. Default is to use the first column in |
preds |
Character list or integer list. Names of columns or column indices of predictors. Default is to use the second and subsequent columns in |
family |
Character. Name of error family. See |
learningRate |
Numeric. Learning rate at which model learns from successive trees (Elith et al. 2008 recommend 0.0001 to 0.1). |
treeComplexity |
Positive integer. Tree complexity: depth of branches in a single tree (1 to 16). |
bagFraction |
Numeric in the range [0, 1]. Bag fraction: proportion of data used for training in cross-validation (Elith et al. 2008 recommend 0.5 to 0.7). |
minTrees |
Positive integer. Minimum number of trees to be scored as a "usable" model (Elith et al. 2008 recommend at least 1000). Default is 1000. |
maxTrees |
Positive integer. Maximum number of trees in model set (same as parameter |
tries |
Integer > 0. Number of times to try to train a model with a particular set of tuning parameters. The function will stop training the first time a model converges (usually on the first attempt). Non-convergence seems to be related to the number of trees tried in each step. So if non-convergence occurs then the function automatically increases the number of trees in the step size until |
tryBy |
Character list. A list that contains one or more of |
w |
Either logical in which case |
anyway |
Logical. If |
out |
Character. Indicates type of value returned. If |
cores |
Integer >= 1. Number of cores to use when calculating multiple models. Default is 1. |
verbose |
Logical. If |
... |
Arguments to pass to |
If out = 'model', this function returns an object of class gbm. If out = 'tuning', this function returns a data frame with the tuning parameters and cross-validation deviance for each model tried. If out = c('model', 'tuning'), then it returns a list object with the gbm object and the data frame. Note that if a model does not converge or does not meet the sufficiency criteria (i.e., the number of optimal trees is < minTrees), then the model is not returned (a NULL value is returned for 'model') and such models are simply missing from the tuning and models output.
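Because a NULL can come back in place of a model, code that uses the return value may want to check for it before going further. The following is a minimal sketch using a stand-in list rather than a real call: the element names model and tuning, the column name cvDeviance, and all numbers are hypothetical placeholders, not output from trainBrt.

```r
# Placeholder tuning table. Column names and values are hypothetical
# and exist only to make this sketch runnable.
tuning <- data.frame(
  learningRate = c(0.001, 0.001, 0.01),
  treeComplexity = c(1, 3, 3),
  cvDeviance = c(0.82, 0.75, 0.79)
)

# Stand-in for a result requested with out = c('model', 'tuning').
# model = NULL mimics the case where no usable model was obtained.
result <- list(model = NULL, tuning = tuning)

# Guard against a missing model before plotting or predicting
if (is.null(result$model)) {
  message("No usable model was returned; inspect the tuning table instead.")
}

# Row of the tuning table with the lowest cross-validated deviance
best <- result$tuning[which.min(result$tuning$cvDeviance), ]
best
```

With a real result, the actual column names of the tuning data frame should be checked first (for example with names()), since the ones used here are assumed.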
gbm.step
## Not run: 

### model red-bellied lemurs
data(mad0)
data(lemurs)

# climate data
bios <- c(1, 5, 12, 15)
clim <- raster::getData('worldclim', var='bio', res=10)
clim <- raster::subset(clim, bios)
clim <- raster::crop(clim, mad0)

# occurrence data
occs <- lemurs[lemurs$species == 'Eulemur rubriventer', ]
occsEnv <- raster::extract(clim, occs[ , c('longitude', 'latitude')])

# background sites
bg <- 2000 # too few cells to locate 10000 background points
bgSites <- dismo::randomPoints(clim, 2000)
bgEnv <- raster::extract(clim, bgSites)

# collate
presBg <- rep(c(1, 0), c(nrow(occs), nrow(bgSites)))
env <- rbind(occsEnv, bgEnv)
env <- cbind(presBg, env)
env <- as.data.frame(env)

preds <- paste0('bio', bios)

# settings... defaults probably better, but these are faster
lr <- c(0.001, 0.1)
tc <- c(1, 3)
maxTrees <- 2000

set.seed(123)
model <- trainBrt(
  data = env,
  resp = 'presBg',
  preds = preds,
  learningRate = lr,
  treeComplexity = tc,
  maxTrees = maxTrees,
  verbose = TRUE
)

plot(model)

# prediction raster
nTrees <- model$gbm.call$n.trees
map <- predict(clim, model, type='response', n.trees=nTrees)
plot(map)
points(occs[ , c('longitude', 'latitude')])

## End(Not run)