bst | R Documentation |
Gradient boosting for optimizing loss functions, with componentwise linear models, smoothing splines, or tree models as base learners.
bst(x, y, cost = 0.5,
    family = c("gaussian", "hinge", "hinge2", "binom", "expo", "poisson",
               "tgaussianDC", "thingeDC", "tbinomDC", "binomdDC", "texpoDC",
               "tpoissonDC", "huber", "thuberDC", "clossR", "clossRMM", "closs",
               "gloss", "qloss", "clossMM", "glossMM", "qlossMM", "lar"),
    ctrl = bst_control(), control.tree = list(maxdepth = 1),
    learner = c("ls", "sm", "tree"))
## S3 method for class 'bst'
print(x, ...)
## S3 method for class 'bst'
predict(object, newdata = NULL, newy = NULL, mstop = NULL,
        type = c("response", "all.res", "class", "loss", "error"), ...)
## S3 method for class 'bst'
plot(x, type = c("step", "norm"), ...)
## S3 method for class 'bst'
coef(object, which = object$ctrl$mstop, ...)
## S3 method for class 'bst'
fpartial(object, mstop = NULL, newdata = NULL)
x | a data frame containing the variables in the model. |
y | vector of responses. |
cost | price to pay for false positive, 0 < |
family | A variety of loss functions. |
ctrl | an object of class |
type | type of prediction or plot, see |
control.tree | control parameters of rpart. |
learner | a character specifying the component-wise base learner to be used: |
object | class of |
newdata | new data for prediction with the same number of columns as |
newy | new response. |
mstop | boosting iteration for prediction. |
which | at which boosting |
... | additional arguments. |
Boosting algorithms for classification and regression problems. In a classification problem, suppose f is a classifier for a response y. A cost-sensitive or weighted loss function is
L(y, f, cost) = l(y, f, cost) max(0, 1 - yf).
For family="hinge",
l(y, f, cost) = 1 - cost, if y = +1; = cost, if y = -1.
For family="hinge2",
l(y, f, cost) = 1, if y = +1 and f > 0; = 1 - cost, if y = +1 and f < 0; = cost, if y = -1 and f > 0; = 1, if y = -1 and f < 0.
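The two weighting schemes above can be written out directly in base R. The following is only an illustrative sketch: weighted_hinge is a hypothetical helper, not a function of the bst package, and it assumes y is coded as -1/+1. The text leaves the hinge2 weight at f = 0 undefined; the sketch arbitrarily treats f = 0 like f < 0.

```r
# Cost-sensitive hinge loss following the definitions in the text.
# NOTE: weighted_hinge is a hypothetical helper, not part of the bst package.
weighted_hinge <- function(y, f, cost = 0.5, family = c("hinge", "hinge2")) {
  family <- match.arg(family)
  stopifnot(all(y %in% c(-1, 1)), length(y) == length(f))
  if (family == "hinge") {
    # weight depends on the class label only
    w <- ifelse(y == 1, 1 - cost, cost)
  } else {
    # hinge2: weight depends on the label and the sign of f
    # (assumption: f == 0 is handled like f < 0)
    w <- ifelse(y == 1,
                ifelse(f > 0, 1, 1 - cost),
                ifelse(f > 0, cost, 1))
  }
  w * pmax(0, 1 - y * f)
}

y <- c(1, 1, -1, -1)
f <- c(0.5, -0.5, 0.5, -0.5)
loss_hinge  <- weighted_hinge(y, f, cost = 0.3, family = "hinge")
loss_hinge2 <- weighted_hinge(y, f, cost = 0.3, family = "hinge2")
print(rbind(hinge = loss_hinge, hinge2 = loss_hinge2))
```

In this toy input the two families agree on the misclassified observations (the second and third) and differ only on correctly classified observations that fall inside the margin, where hinge2 uses weight 1.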
For twin boosting (twinboost=TRUE), there are two types of adaptive boosting if learner="ls": for twintype=1, weights are based on coefficients in the first round of boosting; for twintype=2, weights are based on predictions in the first round of boosting. See Buehlmann and Hothorn (2010).
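As background for the componentwise linear base learner (learner="ls"), the idea of componentwise boosting can be sketched in base R for squared-error loss. This is a generic textbook-style illustration, not the package's implementation: cw_ls_boost and its return structure are hypothetical, the step size nu and the centering of the predictors are assumptions, and only the names mstop and xselect echo names used on this page.

```r
# Generic componentwise least-squares boosting for squared-error loss.
# NOTE: cw_ls_boost is a hypothetical illustration, not bst internals.
cw_ls_boost <- function(x, y, mstop = 50, nu = 0.1) {
  x <- scale(as.matrix(x), center = TRUE, scale = FALSE)  # center predictors
  offset <- mean(y)
  f <- rep(offset, length(y))           # current function estimate
  beta <- numeric(ncol(x))              # accumulated coefficients
  xselect <- integer(mstop)             # variable chosen at each iteration
  for (m in seq_len(mstop)) {
    u <- y - f                          # negative gradient of (y - f)^2 / 2
    b <- colSums(x * u) / colSums(x^2)  # least-squares slope for each column
    rss <- colSums((u - sweep(x, 2, b, "*"))^2)
    j <- which.min(rss)                 # single predictor that fits u best
    beta[j] <- beta[j] + nu * b[j]      # shrunken update of one coefficient
    f <- f + nu * b[j] * x[, j]
    xselect[m] <- j
  }
  list(offset = offset, coef = beta, yhat = f, xselect = xselect)
}

set.seed(1)
x <- matrix(rnorm(100 * 5), ncol = 5)
y <- 2 * x[, 1] + rnorm(100, sd = 0.1)
fit <- cw_ls_boost(x, y, mstop = 200)
print(round(fit$coef, 2))
print(table(fit$xselect))
```

Because only one coefficient is updated per iteration, variables that are never selected keep a coefficient of exactly zero, which is why stopping early acts as a form of variable selection.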
An object of class bst is returned. For linear models, print, coef, plot and predict methods are available. For nonlinear models, print and predict methods are available.
x, y, cost, family, learner, control.tree, maxdepth | These are input variables and parameters |
ctrl | the input |
yhat | predicted function estimates |
ens | a list of length |
ml.fit | the last element of |
ensemble | a vector of length |
xselect | selected variables in |
coef | estimated coefficients in each iteration. Used internally only |
Zhu Wang
Zhu Wang (2011), HingeBoost: ROC-Based Boost for Classification and Variable Selection. The International Journal of Biostatistics, 7(1), Article 13.
Peter Buehlmann and Torsten Hothorn (2010), Twin Boosting: improved feature selection and prediction, Statistics and Computing, 20, 119-138.
cv.bst for cross-validated stopping iteration. Furthermore see bst_control.
# simulate a binary response coded as -1/+1
x <- matrix(rnorm(100*5), ncol=5)
c <- 2*x[,1]
p <- exp(c)/(exp(c)+exp(-c))
y <- rbinom(100, 1, p)
y[y != 1] <- -1
x <- as.data.frame(x)
# first round: hinge loss with componentwise linear base learner
dat.m <- bst(x, y, ctrl = bst_control(mstop=50), family = "hinge", learner = "ls")
predict(dat.m)
# twin boosting, initialized from the first-round fit
dat.m1 <- bst(x, y, ctrl = bst_control(twinboost=TRUE, coefir=coef(dat.m),
              xselect.init = dat.m$xselect, mstop=50))
# robust boosting with the truncated hinge loss
dat.m2 <- rbst(x, y, ctrl = bst_control(mstop=50, s=0, trace=TRUE),
               rfamily = "thinge", learner = "ls")
predict(dat.m2)