gamboost: Gradient Boosting with Smooth Components

Description

Gradient boosting for optimizing arbitrary loss functions, where component-wise smoothing procedures are utilized as base-learners.

Usage

gamboost(formula, data = list(),
         baselearner = c("bbs", "bols", "btree", "bss", "bns"),
         dfbase = 4, ...)

Arguments

formula

a symbolic description of the model to be fit.

data

a data frame containing the variables in the model.

baselearner

a character string specifying the component-wise base-learner to be used: bbs fits P-splines with a B-spline basis (see Schmid and Hothorn 2008), bols fits linear models, and btree fits tree stumps. bss and bns are deprecated. Component-wise smoothing splines were considered by Buehlmann and Yu (2003), and Schmid and Hothorn (2008) investigate P-splines with a B-spline basis. Kneib, Hothorn and Tutz (2009) also use P-splines with a B-spline basis, supplement them with their bivariate tensor product version to estimate interaction surfaces and spatial effects, and additionally consider random effects base-learners.

dfbase

an integer vector giving the degrees of freedom for the smoothing spline, either globally for all variables (when its length is one) or separately for each covariate.
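The baselearner and dfbase arguments can be illustrated with a minimal sketch on the built-in cars data. The object names fit_bbs and fit_bols and the helper mse are illustrative only, and the comparison of mean squared errors is just one possible way to look at the two fits.

```r
library(mboost)

# P-spline base-learner (the default) with 4 degrees of freedom
fit_bbs <- gamboost(dist ~ speed, data = cars,
                    baselearner = "bbs", dfbase = 4)

# Linear base-learner for the same formula
fit_bols <- gamboost(dist ~ speed, data = cars,
                     baselearner = "bols")

# In-sample mean squared error of each fit (illustrative helper)
mse <- function(fit) mean((cars$dist - as.numeric(fitted(fit)))^2)
c(bbs = mse(fit_bbs), bols = mse(fit_bols))
```

Both calls use the default boost_control() settings, so any difference between the fits comes from the choice of base-learner alone.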

...

additional arguments passed to mboost_fit, including weights, offset, family and control. For default values see mboost_fit.
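As a sketch of how arguments are forwarded through ..., the call below sets family, weights and control explicitly. The particular values (200 iterations, step length 0.1, unit weights) are arbitrary choices for illustration, not recommendations.

```r
library(mboost)

fit <- gamboost(dist ~ speed, data = cars,
                family = Gaussian(),             # squared-error loss
                weights = rep(1, nrow(cars)),    # equal case weights
                control = boost_control(mstop = 200, nu = 0.1))

# Number of boosting iterations stored in the fitted model
mstop(fit)
```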

Details

A (generalized) additive model is fitted using a boosting algorithm based on component-wise univariate base-learners. The base-learners can either be specified via the formula object or via the baselearner argument (see bbs for an example). If the base-learners specified in formula differ from baselearner, the latter argument will be ignored. Furthermore, two additional base-learners can be specified in formula: bspatial for bivariate tensor product penalized splines and brandom for random effects.
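Specifying base-learners directly in the formula might look like the following sketch. The simulated data frame d and its columns x1, x2 and y are assumptions made up for this illustration: x1 enters through a P-spline base-learner and x2 through a linear one.

```r
library(mboost)

# Simulated data (hypothetical): smooth effect of x1, linear effect of x2
set.seed(1)
d <- data.frame(x1 = runif(200), x2 = runif(200))
d$y <- sin(2 * pi * d$x1) + 0.5 * d$x2 + rnorm(200, sd = 0.2)

# Base-learners given in the formula; the baselearner argument is not needed
fit <- gamboost(y ~ bbs(x1, df = 4) + bols(x2), data = d,
                control = boost_control(mstop = 100))

# How often each base-learner was selected over the 100 iterations
table(selected(fit))
```

Because the base-learners are named in the formula, each covariate can receive a different type of effect within the same model.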

Value

An object of class mboost, for which print, AIC, plot and predict methods are available.
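A short sketch of working with the returned object, again on the cars data. The new data frame nd is a made-up example for prediction; plot(fit) could be called in the same way to display the fitted partial effect.

```r
library(mboost)

fit <- gamboost(dist ~ speed, data = cars,
                control = boost_control(mstop = 50))

# Summary of the fitted model
print(fit)

# Corrected AIC and the stopping iteration it suggests
aic <- AIC(fit, method = "corrected")
mstop(aic)

# Predictions for new speeds (hypothetical values)
nd <- data.frame(speed = c(10, 15, 20))
pred <- predict(fit, newdata = nd)
pred
```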

References

Peter Buehlmann and Bin Yu (2003), Boosting with the L2 loss: regression and classification. Journal of the American Statistical Association, 98, 324–339.

Peter Buehlmann and Torsten Hothorn (2007), Boosting algorithms: regularization, prediction and model fitting. Statistical Science, 22(4), 477–505.

Thomas Kneib, Torsten Hothorn and Gerhard Tutz (2009), Variable selection and model choice in geoadditive regression models, Biometrics, 65(2), 626–634.

Matthias Schmid and Torsten Hothorn (2008), Boosting additive models using component-wise P-splines as base-learners. Computational Statistics & Data Analysis, 53(2), 298–311.

Torsten Hothorn, Peter Buehlmann, Thomas Kneib, Matthias Schmid and Benjamin Hofner (2010), Model-based Boosting 2.0. Journal of Machine Learning Research, 11, 2109–2113.

Benjamin Hofner, Andreas Mayr, Nikolay Robinzonov and Matthias Schmid (2014), Model-based Boosting in R: A Hands-on Tutorial Using the R Package mboost. Computational Statistics, 29, 3–35.
http://dx.doi.org/10.1007/s00180-012-0382-5

Available as vignette via: vignette(package = "mboost", "mboost_tutorial")

See Also

mboost for the generic boosting function, glmboost for boosted linear models, and blackboost for boosted trees. See e.g. bbs for possible base-learners and cvrisk for choosing the stopping iteration by cross-validation. Furthermore, see boost_control, Family and methods.
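Choosing the stopping iteration with cvrisk could be sketched as follows. The seed, the use of 5-fold cross-validation, and papply = lapply (sequential evaluation instead of the parallel default) are assumptions chosen to keep the example small and reproducible.

```r
library(mboost)

set.seed(290875)
fit <- gamboost(dist ~ speed, data = cars,
                control = boost_control(mstop = 100))

# 5-fold cross-validation of the empirical risk, evaluated sequentially
folds <- cv(model.weights(fit), type = "kfold", B = 5)
cvr <- cvrisk(fit, folds = folds, papply = lapply)

# Stopping iteration with the smallest cross-validated risk
best <- mstop(cvr)
best

# Set the model to that iteration; note that subsetting
# modifies the original model object as well
fit_cv <- fit[best]
```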

Examples

    ### a simple two-dimensional example: cars data
    cars.gb <- gamboost(dist ~ speed, data = cars, dfbase = 4,
                        control = boost_control(mstop = 50))
    cars.gb
    AIC(cars.gb, method = "corrected")

    ### plot fit for mstop = 1, ..., 50
    plot(dist ~ speed, data = cars)
    tmp <- sapply(1:mstop(AIC(cars.gb)), function(i)
        lines(cars$speed, predict(cars.gb[i]), col = "red"))
    lines(cars$speed, predict(smooth.spline(cars$speed, cars$dist),
                              cars$speed)$y, col = "green")

    ### artificial example: sine transformation
    x <- sort(runif(100)) * 10
    y <- sin(x) + rnorm(length(x), sd = 0.25)
    plot(x, y)
    ### linear model
    lines(x, fitted(lm(y ~ sin(x) - 1)), col = "red")
    ### GAM
    lines(x, fitted(gamboost(y ~ x,
                    control = boost_control(mstop = 500))),
          col = "green")

mboost documentation built on May 2, 2019, 6:10 p.m.