| trainBrt | R Documentation |
This function is a wrapper for gbm.step. It returns the model with the best combination of learning rate, tree depth, and bag fraction based on cross-validated deviance. It can also return a table with the deviance of each combination of tuning parameters tested, as well as all of the models tested. See Elith, J., J.R. Leathwick, and T. Hastie. 2008. A working guide to boosted regression trees. Journal of Animal Ecology 77:802-813.
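To get a feel for how many models a call may need to fit, the sketch below crosses the default tuning values listed in the usage signature. It assumes that the function tries every combination of the supplied values, which this page implies but does not state outright; the object name grid is purely illustrative.

```r
# Default tuning values copied from the trainBrt() usage signature
learningRate <- c(1e-04, 0.001, 0.01)
treeComplexity <- c(5, 3, 1)
bagFraction <- 0.6

# Assumption: candidate models are the full cross of the supplied values
grid <- expand.grid(
  learningRate = learningRate,
  treeComplexity = treeComplexity,
  bagFraction = bagFraction
)

# Number of candidate parameter sets under that assumption
nrow(grid)
```

With the defaults this gives 3 x 3 x 1 = 9 candidate sets, each of which may be attempted up to tries times, so narrowing learningRate or treeComplexity is the most direct way to shorten run time.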
trainBrt(
  data,
  resp = names(data)[1],
  preds = names(data)[2:ncol(data)],
  family = "bernoulli",
  learningRate = c(1e-04, 0.001, 0.01),
  treeComplexity = c(5, 3, 1),
  bagFraction = 0.6,
  minTrees = 1000,
  maxTrees = 8000,
  tries = 5,
  tryBy = c("learningRate", "treeComplexity", "maxTrees", "stepSize"),
  w = TRUE,
  anyway = FALSE,
  out = "model",
  cores = 1,
  verbose = FALSE,
  ...
)
data |
Data frame. By default the first column is taken as the response and the remaining columns as predictors. |
resp |
Character or integer. Name or column index of response variable. Default is to use the first column in data. |
preds |
Character list or integer list. Names of columns or column indices of predictors. Default is to use the second and subsequent columns in data. |
family |
Character. Name of error family. See |
learningRate |
Numeric. Learning rate at which model learns from successive trees (Elith et al. 2008 recommend 0.0001 to 0.1). |
treeComplexity |
Positive integer. Tree complexity: depth of branches in a single tree (1 to 16). |
bagFraction |
Numeric in the range [0, 1]. Bag fraction: proportion of data used for training in cross-validation (Elith et al. 2008 recommend 0.5 to 0.7). |
minTrees |
Positive integer. Minimum number of trees to be scored as a "usable" model (Elith et al. 2008 recommend at least 1000). Default is 1000. |
maxTrees |
Positive integer. Maximum number of trees in model set (same as parameter |
tries |
Integer > 0. Number of times to try to train a model with a particular set of tuning parameters. The function will stop training the first time a model converges (usually on the first attempt). Non-convergence seems to be related to the number of trees tried in each step. So if non-convergence occurs then the function automatically increases the number of trees in the step size until |
tryBy |
Character list. A list that contains one or more of |
w |
Either logical in which case |
anyway |
Logical. If |
out |
Character. Indicates type of value returned. If |
cores |
Integer >= 1. Number of cores to use when calculating multiple models. Default is 1. |
verbose |
Logical. If |
... |
Arguments to pass to |
If out = 'model' this function returns an object of class gbm. If out = 'tuning' this function returns a data frame with tuning parameters and cross-validation deviance for each model tried. If out = c('model', 'tuning') then it returns a list object with the gbm object and the data frame. Note that if a model does not converge or does not meet sufficiency criteria (i.e., the number of optimal trees is < minTrees), then the model is not returned (a NULL value is returned for 'model' and models are simply missing from the tuning and models output).
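Because the return type depends on out and may be NULL when no model qualifies, calling code may want to check what it received before plotting or predicting. The helper below is a hypothetical sketch, not part of the package: the name brtStatus is invented here, and it assumes that the combined output is a list with elements named model and tuning.

```r
# Hypothetical helper (not part of the package): report what was returned.
# Assumes the combined output is a list with elements 'model' and 'tuning'.
brtStatus <- function(result) {
  if (is.null(result)) return("no usable model")
  # Check data frames and gbm objects first, since both are also lists
  if (is.data.frame(result)) return("tuning table only")
  if (inherits(result, "gbm")) return("model only")
  if (is.list(result) && is.null(result$model)) {
    return("tuning output without a usable model")
  }
  "model and tuning table"
}

# Stand-in objects so the sketch runs without fitting a model
fakeModel <- structure(list(), class = "gbm")
brtStatus(NULL)
brtStatus(list(model = fakeModel, tuning = data.frame(dev = 1)))
```

A check like this is most useful with small maxTrees or large minTrees, where it is more likely that no candidate reaches the required number of trees.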
gbm.step
## Not run:
### model red-bellied lemurs
data(mad0)
data(lemurs)
# climate data
bios <- c(1, 5, 12, 15)
clim <- raster::getData('worldclim', var='bio', res=10)
clim <- raster::subset(clim, bios)
clim <- raster::crop(clim, mad0)
# occurrence data
occs <- lemurs[lemurs$species == 'Eulemur rubriventer', ]
occsEnv <- raster::extract(clim, occs[ , c('longitude', 'latitude')])
# background sites
bg <- 2000 # too few cells to locate 10000 background points
bgSites <- dismo::randomPoints(clim, bg)
bgEnv <- raster::extract(clim, bgSites)
# collate
presBg <- rep(c(1, 0), c(nrow(occs), nrow(bgSites)))
env <- rbind(occsEnv, bgEnv)
env <- cbind(presBg, env)
env <- as.data.frame(env)
preds <- paste0('bio', bios)
# settings... defaults probably better, but these are faster
lr <- c(0.001, 0.1)
tc <- c(1, 3)
maxTrees <- 2000
set.seed(123)
model <- trainBrt(
  data = env,
  resp = 'presBg',
  preds = preds,
  learningRate = lr,
  treeComplexity = tc,
  maxTrees = maxTrees,
  verbose = TRUE
)
plot(model)
# prediction raster
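# best.trees is the optimal tree count stored by gbm.step (n.trees is only the initial step size)
nTrees <- model$gbm.call$best.trees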
map <- predict(clim, model, type='response', n.trees=nTrees)
plot(map)
points(occs[ , c('longitude', 'latitude')])
## End(Not run)