Boosting for Classification and Regression
Description
Gradient boosting for optimizing loss functions, with componentwise linear models, smoothing splines, or tree models as base learners.
Usage
bst(x, y, cost = 0.5, family = c("gaussian", "hinge", "hinge2", "binom", "expo",
"poisson", "tgaussianDC", "thingeDC", "tbinomDC", "binomdDC", "texpoDC", "tpoissonDC",
"huber", "thuberDC", "clossR", "clossRMM", "closs", "gloss", "qloss", "clossMM",
"glossMM", "qlossMM", "lar"), ctrl = bst_control(), control.tree = list(maxdepth = 1),
learner = c("ls", "sm", "tree"))
## S3 method for class 'bst'
print(x, ...)
## S3 method for class 'bst'
predict(object, newdata=NULL, newy=NULL, mstop=NULL,
type=c("response", "all.res", "class", "loss", "error"), ...)
## S3 method for class 'bst'
plot(x, type = c("step", "norm"),...)
## S3 method for class 'bst'
coef(object, which=object$ctrl$mstop, ...)
## S3 method for class 'bst'
fpartial(object, mstop=NULL, newdata=NULL)

Arguments
x 
a data frame containing the variables in the model. 
y 
vector of responses. 
cost 
price to pay for false positive, 0 < 
family 
A variety of loss functions.

ctrl 
an object of class 
type 
type of prediction or plot, see 
control.tree 
control parameters of rpart. 
learner 
a character specifying the componentwise base learner to be used:

object 
class of 
newdata 
new data for prediction with the same number of columns as 
newy 
new response. 
mstop 
boosting iteration for prediction. 
which 
at which boosting 
... 
additional arguments. 
Details
Boosting algorithms for classification and regression problems. In a classification problem, suppose f is a classifier for a response y. A cost-sensitive or weighted loss function is
L(y,f,cost) = l(y,f,cost) max(0, 1 - yf).
For family="hinge",
l(y,f,cost) = 1 - cost, if y = +1; = cost, if y = -1.
For family="hinge2",
l(y,f,cost) = 1, if y = +1 and f > 0; = 1 - cost, if y = +1 and f < 0; = cost, if y = -1 and f > 0; = 1, if y = -1 and f < 0.
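The two weighted hinge losses can be written out directly in base R. This is a minimal sketch, not code from the bst package: the function name cs_hinge_loss is hypothetical, labels are assumed to be coded as -1 and +1, and the case f = 0 (which the definition for "hinge2" does not cover) is treated here like f < 0.

```r
# Hypothetical helper, not part of the bst package.
# y: labels coded as -1 / +1; f: real-valued classifier scores; 0 < cost < 1.
cs_hinge_loss <- function(y, f, cost = 0.5, family = c("hinge", "hinge2")) {
  family <- match.arg(family)
  stopifnot(all(y %in% c(-1, 1)), length(y) == length(f), cost > 0, cost < 1)
  if (family == "hinge") {
    # weight depends on the label only
    w <- ifelse(y == 1, 1 - cost, cost)
  } else {
    # weight depends on the label and the sign of f
    # (assumption: f == 0 is handled like f < 0)
    w <- ifelse(y == 1,
                ifelse(f > 0, 1, 1 - cost),
                ifelse(f > 0, cost, 1))
  }
  # weighted hinge loss, evaluated elementwise
  w * pmax(0, 1 - y * f)
}

y <- c(1, 1, -1, -1)
f <- c(2, -0.5, 0.5, -2)
cs_hinge_loss(y, f, cost = 0.3, family = "hinge")
cs_hinge_loss(y, f, cost = 0.3, family = "hinge2")
```

With cost = 0.3, a misclassified positive (y = +1, f < 0) is weighted by 0.7 and a misclassified negative (y = -1, f > 0) by 0.3 under both families. The families differ only in the weight given to observations on the correct side of zero.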
For twin boosting (twinboost=TRUE), there are two types of adaptive boosting when learner="ls": for twintype=1, weights are based on coefficients in the first round of boosting; for twintype=2, weights are based on predictions in the first round of boosting. See Buehlmann and Hothorn (2010).
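To illustrate what a componentwise linear base learner does, the sketch below runs componentwise least-squares boosting for squared-error loss in base R. It is an illustration under stated assumptions, not the bst implementation: the function cw_ls_boost, the step size nu, and the returned list layout are hypothetical, and predictors are centered so that no per-step intercept is needed.

```r
# Hypothetical illustration, not the bst implementation.
# Componentwise least-squares boosting for the loss 0.5 * (y - f)^2.
cw_ls_boost <- function(x, y, mstop = 50, nu = 0.1) {
  x <- scale(as.matrix(x), center = TRUE, scale = FALSE)  # center predictors
  offset <- mean(y)                     # initial fit: the mean response
  fit <- rep(offset, length(y))
  beta <- numeric(ncol(x))
  selected <- integer(mstop)
  ss <- colSums(x^2)                    # sum of squares of each column
  for (m in seq_len(mstop)) {
    u <- y - fit                        # negative gradient = residuals
    b <- as.vector(crossprod(x, u)) / ss  # least-squares slope per column
    j <- which.max(b^2 * ss)            # column with largest RSS reduction
    beta[j] <- beta[j] + nu * b[j]      # shrunken update of one coefficient
    fit <- fit + nu * b[j] * x[, j]
    selected[m] <- j
  }
  list(offset = offset, coef = beta, fitted = fit,
       xselect = sort(unique(selected)))
}

set.seed(1)
x <- matrix(rnorm(100 * 5), ncol = 5)
y <- 2 * x[, 1] + rnorm(100, sd = 0.1)
fit <- cw_ls_boost(x, y, mstop = 100)
fit$xselect          # indices of the columns picked at least once
round(fit$coef, 2)   # coefficient of column 1 should dominate
```

Each iteration updates only one coefficient, so variables that are never selected keep a coefficient of exactly zero. This is why componentwise boosting also performs variable selection.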
Value
An object of class bst. For linear models, print, coef, plot and predict methods are available. For nonlinear models, print and predict methods are available.
x, y, cost, family, learner, control.tree, maxdepth 
These are input variables and parameters 
ctrl 
the input 
yhat 
predicted function estimates 
ens 
a list of length 
ml.fit 
the last element of 
ensemble 
a vector of length 
xselect 
selected variables in 
coef 
estimated coefficients in each iteration. Used internally only 
Author(s)
Zhu Wang
References
Zhu Wang (2011), HingeBoost: ROC-Based Boost for Classification and Variable Selection. The International Journal of Biostatistics, 7(1), Article 13.
Peter Buehlmann and Torsten Hothorn (2010), Twin Boosting: improved feature selection and prediction, Statistics and Computing, 20, 119-138.
See Also
cv.bst for cross-validated stopping iteration. Furthermore see bst_control.
Examples
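library(bst)
# simulate 100 observations on 5 predictors; only the first one matters
x <- matrix(rnorm(100*5),ncol=5)
c <- 2*x[,1]
p <- exp(c)/(exp(c)+exp(-c))
y <- rbinom(100,1,p)
# recode the response from 0/1 to -1/+1
y[y != 1] <- -1
x <- as.data.frame(x)
# HingeBoost with componentwise linear base learners
dat.m <- bst(x, y, ctrl = bst_control(mstop=50), family = "hinge", learner = "ls")
predict(dat.m)
# twin boosting, initialized from the first-round fit
dat.m1 <- bst(x, y, ctrl = bst_control(twinboost=TRUE,
coefir=coef(dat.m), xselect.init = dat.m$xselect, mstop=50))
# robust boosting with the truncated hinge loss
dat.m2 <- rbst(x, y, ctrl = bst_control(mstop=50, s=0, trace=TRUE),
rfamily = "thinge", learner = "ls")
predict(dat.m2)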
