Fits generalized boosted regression models. For technical details, see the
vignette: utils::browseVignettes("gbm").
gbm(
  formula = formula(data),
  distribution = "bernoulli",
  data = list(),
  weights,
  var.monotone = NULL,
  n.trees = 100,
  interaction.depth = 1,
  n.minobsinnode = 10,
  shrinkage = 0.1,
  bag.fraction = 0.5,
  train.fraction = 1,
  cv.folds = 0,
  keep.data = TRUE,
  verbose = FALSE,
  class.stratify.cv = NULL,
  n.cores = NULL
)

formula 
A symbolic description of the model to be fit. The formula may include an
offset term (e.g., y ~ offset(n) + x). If keep.data = TRUE in the initial
call to gbm, then gbm stores a copy of the data with the object.

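Since the formula interface accepts an offset term, a Poisson model with a log-exposure offset can be written directly. The sketch below uses simulated data with made-up column names (counts, exposure, x); it illustrates the form of the call, not a recommended analysis.

```r
library(gbm)

# Hypothetical count data with varying exposure (column names are made up)
set.seed(1)
d <- data.frame(exposure = runif(100, 1, 10), x = rnorm(100))
d$counts <- rpois(100, lambda = d$exposure * exp(0.3 * d$x))

# The offset enters on the log (link) scale for the Poisson distribution
fit <- gbm(counts ~ offset(log(exposure)) + x, data = d,
           distribution = "poisson", n.trees = 50, verbose = FALSE)
```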
distribution 
Either a character string specifying the name of the distribution to use or
a list with a component name specifying the distribution and any additional
parameters needed. If not specified, gbm will try to guess: if the response
has only 2 unique values, bernoulli is assumed; otherwise, if the response is
a factor, multinomial is assumed; otherwise, if the response has class
"Surv", coxph is assumed; otherwise, gaussian is assumed.
Currently available options are "gaussian" (squared error), "laplace"
(absolute loss), "tdist" (t-distribution loss), "bernoulli" (logistic
regression for 0-1 outcomes), "huberized" (huberized hinge loss for 0-1
outcomes), "adaboost" (the AdaBoost exponential loss for 0-1 outcomes),
"poisson" (count outcomes), "coxph" (right-censored observations),
"multinomial" (classification with more than two classes), "quantile", and
"pairwise" (ranking measure using the LambdaMart algorithm).
If quantile regression is specified, distribution must be a list of the form
list(name = "quantile", alpha = 0.25), where alpha is the quantile to
estimate. If "tdist" is specified, the default degrees of freedom is 4; this
can be changed by specifying distribution = list(name = "tdist", df = DF).
If "pairwise" regression is specified, distribution must be a list of the
form list(name = "pairwise", group = ..., metric = ..., max.rank = ...),
where group is a character vector with the column names of data that jointly
indicate the group an instance belongs to (typically a query in information
retrieval applications) and metric is the ranking measure to optimize
("conc", "mrr", "map", or "ndcg").
Note that splitting of instances into training and validation sets follows
group boundaries and therefore only approximates the specified
train.fraction ratio (the same applies to cross-validation folds).
Weights can be used in conjunction with pairwise metrics; however, it is assumed that they are constant for instances from the same group. For details and background on the algorithm, see e.g. Burges (2010). 
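For concreteness, the list form of distribution might look like the following sketch on simulated data (the query grouping column and all variable names are made up for illustration):

```r
library(gbm)

set.seed(2)
df <- data.frame(y = runif(60), x1 = rnorm(60), x2 = rnorm(60),
                 query = factor(rep(1:12, each = 5)))

# Quantile regression estimating the 25th percentile
fit_q <- gbm(y ~ x1 + x2, data = df, n.trees = 50,
             distribution = list(name = "quantile", alpha = 0.25))

# Pairwise ranking within each "query" group, optimizing NDCG
fit_p <- gbm(y ~ x1 + x2, data = df, n.trees = 50,
             distribution = list(name = "pairwise",
                                 metric = "ndcg", group = "query"))
```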
data 
an optional data frame containing the variables in the model. By default the
variables are taken from environment(formula), typically the environment
from which gbm is called. If keep.data = TRUE in the initial call to gbm,
then gbm stores a copy with the object.
weights 
an optional vector of weights to be used in the fitting process. Must be
positive but do not need to be normalized. If keep.data = TRUE in the
initial call to gbm, then gbm stores a copy with the object.

var.monotone 
an optional vector, the same length as the number of predictors, indicating which variables have a monotone increasing (+1), decreasing (-1), or arbitrary (0) relationship with the outcome. 
n.trees 
Integer specifying the total number of trees to fit. This is equivalent to the number of iterations and the number of basis functions in the additive expansion. Default is 100. 
interaction.depth 
Integer specifying the maximum depth of each tree (i.e., the highest level of variable interactions allowed). A value of 1 implies an additive model, a value of 2 implies a model with up to 2-way interactions, etc. Default is 1. 
n.minobsinnode 
Integer specifying the minimum number of observations in the terminal nodes of the trees. Note that this is the actual number of observations, not the total weight. 
shrinkage 
a shrinkage parameter applied to each tree in the expansion. Also known as the learning rate or step-size reduction; 0.001 to 0.1 usually work, but a smaller learning rate typically requires more trees. Default is 0.1. 
bag.fraction 
the fraction of the training set observations randomly selected to propose
the next tree in the expansion. This introduces randomness into the model
fit. If bag.fraction < 1, then running the same model twice will result in
similar but different fits. gbm uses the R random number generator, so
set.seed can ensure that the model can be reconstructed. Preferably, the
user can save the returned gbm.object using save. Default is 0.5.
train.fraction 
The first train.fraction * nrows(data) observations are used to fit the gbm
and the remainder are used for computing out-of-sample estimates of the loss
function.
cv.folds 
Number of cross-validation folds to perform. If cv.folds > 1, then gbm, in
addition to the usual fit, will perform a cross-validation and calculate an
estimate of generalization error, returned in cv.error.

keep.data 
a logical variable indicating whether to keep the data and an index of the
data stored with the object. Keeping the data and index makes subsequent
calls to gbm.more faster at the cost of storing an extra copy of the
dataset.
verbose 
Logical indicating whether or not to print out progress and performance
indicators (TRUE). If this option is left unspecified for gbm.more, then it
uses verbose from object. Default is FALSE.
class.stratify.cv 
Logical indicating whether or not the cross-validation should be stratified
by class. Defaults to TRUE for distribution = "multinomial" and is only
implemented for "multinomial" and "bernoulli". The purpose of stratifying
the cross-validation is to help avoid situations in which training sets do
not contain all classes.
n.cores 
The number of CPU cores to use. The cross-validation loop will attempt to
send different CV folds off to different cores. If n.cores is not specified
by the user, then the number of cores is guessed from the machine's
available cores.

gbm.fit provides the link between R and the C++ gbm engine. gbm is a
front-end to gbm.fit that uses the familiar R modeling formulas. However,
model.frame is very slow if there are many predictor variables. For power
users with many variables, use gbm.fit; for general practice, gbm is
preferable.
This package implements the generalized boosted modeling framework. Boosting is the process of iteratively adding basis functions in a greedy fashion so that each additional basis function further reduces the selected loss function. This implementation closely follows Friedman's Gradient Boosting Machine (Friedman, 2001).
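For squared-error loss, this iterative scheme can be illustrated in a few lines of plain R. The sketch below is a schematic toy version of the idea (depth-1 "stumps" fit to residuals, shrunken and summed), not the package's actual implementation:

```r
set.seed(42)
n <- 200
x <- runif(n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

shrinkage <- 0.1        # learning rate
n.trees   <- 100        # number of boosting iterations
pred <- rep(mean(y), n) # start from the constant (mean) fit

# Fit a depth-1 "stump": pick the split on x that most reduces the
# squared error of the residuals, predicting the residual mean per side.
fit_stump <- function(x, r) {
  best <- NULL
  best_sse <- Inf
  for (s in quantile(x, probs = seq(0.05, 0.95, by = 0.05))) {
    left <- x <= s
    mu_l <- mean(r[left])
    mu_r <- mean(r[!left])
    sse <- sum((r[left] - mu_l)^2) + sum((r[!left] - mu_r)^2)
    if (sse < best_sse) {
      best_sse <- sse
      best <- list(split = s, left = mu_l, right = mu_r)
    }
  }
  best
}

for (m in seq_len(n.trees)) {
  r <- y - pred  # residuals = negative gradient of squared-error loss
  st <- fit_stump(x, r)
  pred <- pred + shrinkage * ifelse(x <= st$split, st$left, st$right)
}

mse_const <- mean((y - mean(y))^2)  # baseline: constant fit
mse_boost <- mean((y - pred)^2)     # boosted fit
```

Each pass adds only a shrunken copy of the new basis function, which is why a smaller shrinkage generally requires more trees to reach the same training error.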
In addition to many of the features documented in the Gradient Boosting
Machine, gbm offers additional features including the out-of-bag estimator
for the optimal number of iterations, the ability to store and manipulate
the resulting gbm object, and a variety of other loss functions that had not
previously had associated boosting algorithms, including the Cox partial
likelihood for censored data, the Poisson likelihood for count outcomes, and
a gradient boosting implementation to minimize the AdaBoost exponential loss
function.
A gbm.object
object.
Greg Ridgeway gregridgeway@gmail.com
Quantile regression code developed by Brian Kriegler bk@stat.ucla.edu
t-distribution, and multinomial code developed by Harry Southworth and Daniel Edwards
Pairwise code developed by Stefan Schroedl schroedl@a9.com
Y. Freund and R.E. Schapire (1997). “A decision-theoretic generalization of on-line learning and an application to boosting,” Journal of Computer and System Sciences, 55(1):119-139.
G. Ridgeway (1999). “The state of boosting,” Computing Science and Statistics 31:172-181.
J.H. Friedman, T. Hastie, R. Tibshirani (2000). “Additive Logistic Regression: a Statistical View of Boosting,” Annals of Statistics 28(2):337-374.
J.H. Friedman (2001). “Greedy Function Approximation: A Gradient Boosting Machine,” Annals of Statistics 29(5):1189-1232.
J.H. Friedman (2002). “Stochastic Gradient Boosting,” Computational Statistics and Data Analysis 38(4):367-378.
B. Kriegler (2007). Cost-Sensitive Stochastic Gradient Boosting Within a Quantitative Regression Framework. Ph.D. Dissertation, University of California at Los Angeles, Los Angeles, CA, USA. Advisor: Richard A. Berk. https://dl.acm.org/citation.cfm?id=1354603.
C. Burges (2010). “From RankNet to LambdaRank to LambdaMART: An Overview,” Microsoft Research Technical Report MSR-TR-2010-82.
gbm.object, gbm.perf, plot.gbm, predict.gbm, summary.gbm, and
pretty.gbm.tree.
#
# A least squares regression example
#
# Simulate data
set.seed(101) # for reproducibility
N <- 1000
X1 <- runif(N)
X2 <- 2 * runif(N)
X3 <- ordered(sample(letters[1:4], N, replace = TRUE), levels = letters[4:1])
X4 <- factor(sample(letters[1:6], N, replace = TRUE))
X5 <- factor(sample(letters[1:3], N, replace = TRUE))
X6 <- 3 * runif(N)
mu <- c(-1, 0, 1, 2)[as.numeric(X3)]
SNR <- 10 # signal-to-noise ratio
Y <- X1 ^ 1.5 + 2 * (X2 ^ 0.5) + mu
sigma <- sqrt(var(Y) / SNR)
Y <- Y + rnorm(N, 0, sigma)
X1[sample(1:N, size = 500)] <- NA # introduce some missing values
X4[sample(1:N, size = 300)] <- NA # introduce some missing values
data <- data.frame(Y, X1, X2, X3, X4, X5, X6)
# Fit a GBM
set.seed(102) # for reproducibility
gbm1 <- gbm(Y ~ ., data = data, var.monotone = c(0, 0, 0, 0, 0, 0),
            distribution = "gaussian", n.trees = 100, shrinkage = 0.1,
            interaction.depth = 3, bag.fraction = 0.5, train.fraction = 0.5,
            n.minobsinnode = 10, cv.folds = 5, keep.data = TRUE,
            verbose = FALSE, n.cores = 1)
# Check performance using the out-of-bag (OOB) error; the OOB error typically
# underestimates the optimal number of iterations
best.iter <- gbm.perf(gbm1, method = "OOB")
print(best.iter)
# Check performance using the 50% held-out test set
best.iter <- gbm.perf(gbm1, method = "test")
print(best.iter)
# Check performance using 5-fold cross-validation
best.iter <- gbm.perf(gbm1, method = "cv")
print(best.iter)
# Plot relative influence of each variable
par(mfrow = c(1, 2))
summary(gbm1, n.trees = 1) # using first tree
summary(gbm1, n.trees = best.iter) # using estimated best number of trees
# Compactly print the first and last trees for curiosity
print(pretty.gbm.tree(gbm1, i.tree = 1))
print(pretty.gbm.tree(gbm1, i.tree = gbm1$n.trees))
# Simulate new data
set.seed(103) # for reproducibility
N <- 1000
X1 <- runif(N)
X2 <- 2 * runif(N)
X3 <- ordered(sample(letters[1:4], N, replace = TRUE))
X4 <- factor(sample(letters[1:6], N, replace = TRUE))
X5 <- factor(sample(letters[1:3], N, replace = TRUE))
X6 <- 3 * runif(N)
mu <- c(-1, 0, 1, 2)[as.numeric(X3)]
Y <- X1 ^ 1.5 + 2 * (X2 ^ 0.5) + mu + rnorm(N, 0, sigma)
data2 <- data.frame(Y, X1, X2, X3, X4, X5, X6)
# Predict on the new data using the "best" number of trees; by default,
# predictions will be on the link scale
Yhat <- predict(gbm1, newdata = data2, n.trees = best.iter, type = "link")
# least squares error
print(sum((data2$Y - Yhat)^2))
# Construct univariate partial dependence plots
plot(gbm1, i.var = 1, n.trees = best.iter)
plot(gbm1, i.var = 2, n.trees = best.iter)
plot(gbm1, i.var = "X3", n.trees = best.iter) # can use index or name
# Construct bivariate partial dependence plots
plot(gbm1, i.var = 1:2, n.trees = best.iter)
plot(gbm1, i.var = c("X2", "X3"), n.trees = best.iter)
plot(gbm1, i.var = 3:4, n.trees = best.iter)
# Construct trivariate partial dependence plots
plot(gbm1, i.var = c(1, 2, 6), n.trees = best.iter,
continuous.resolution = 20)
plot(gbm1, i.var = 1:3, n.trees = best.iter)
plot(gbm1, i.var = 2:4, n.trees = best.iter)
plot(gbm1, i.var = 3:5, n.trees = best.iter)
# Add more (i.e., 100) boosting iterations to the ensemble
gbm2 <- gbm.more(gbm1, n.new.trees = 100, verbose = FALSE)

Example output:
Loaded gbm 2.1.8
CV: 1
CV: 2
CV: 3
CV: 4
CV: 5
OOB generally underestimates the optimal number of iterations although predictive performance is reasonably competitive. Using cv_folds>1 when calling gbm usually results in improved predictive performance.
[1] 43
attr(,"smoother")
Call:
loess(formula = object$oobag.improve ~ x, enp.target = min(max(4,
length(x)/10), 50))
Number of Observations: 100
Equivalent Number of Parameters: 8.32
Residual Standard Error: 0.005885
[1] 63
[1] 70
var rel.inf
X3 X3 86.74183
X2 X2 13.25817
X1 X1 0.00000
X4 X4 0.00000
X5 X5 0.00000
X6 X6 0.00000
var rel.inf
X3 X3 68.6902290
X2 X2 25.3361275
X1 X1 3.3863779
X4 X4 1.2681014
X6 X6 1.0914539
X5 X5 0.2277102
SplitVar SplitCodePred LeftNode RightNode MissingNode ErrorReduction Weight
0 2 1.500000000 1 5 9 258.70575 250
1 1 0.643849456 2 3 4 45.93610 122
2 1 0.204032200 1 1 1 0.00000 34
3 1 0.067172308 1 1 1 0.00000 88
4 1 0.105313590 1 1 1 0.00000 122
5 2 2.500000000 6 7 8 41.83219 128
6 1 0.044497873 1 1 1 0.00000 68
7 1 0.159057141 1 1 1 0.00000 60
8 1 0.098197530 1 1 1 0.00000 128
9 1 0.001115896 1 1 1 0.00000 250
Prediction
0 0.001115896
1 0.105313590
2 0.204032200
3 0.067172308
4 0.105313590
5 0.098197530
6 0.044497873
7 0.159057141
8 0.098197530
9 0.001115896
SplitVar SplitCodePred LeftNode RightNode MissingNode ErrorReduction Weight
0 0 0.582418997 1 2 3 0.9193604 250
1 1 0.006219724 1 1 1 0.0000000 63
2 1 0.010214328 1 1 1 0.0000000 57
3 3 51.000000000 4 8 9 0.7852138 130
4 1 1.016134987 5 6 7 0.5866318 62
5 1 0.003627100 1 1 1 0.0000000 33
6 1 0.015867874 1 1 1 0.0000000 29
7 1 0.005491517 1 1 1 0.0000000 62
8 1 0.010784760 1 1 1 0.0000000 28
9 1 0.009523251 1 1 1 0.0000000 40
Prediction
0 0.0006082206
1 0.0062197235
2 0.0102143276
3 0.0026340711
4 0.0054915172
5 0.0036270997
6 0.0158678743
7 0.0054915172
8 0.0107847602
9 0.0095232507
[1] 5153.285