singboost: SingBoost Boosting method


View source: R/singboost.R

Description

SingBoost is a Boosting method that can deal with complicated loss functions that do not allow for a gradient. SingBoost is based on L2-Boosting in its current implementation.

Usage

singboost(
  D,
  M = 10,
  m_iter = 100,
  kap = 0.1,
  singfamily = Gaussian(),
  best = 1,
  LS = FALSE
)

Arguments

D

Data matrix. Has to be an n \times (p+1)-dimensional data frame in the format (X,Y). The X-part must not contain an intercept column of ones since this column will be added automatically.

M

An integer between 2 and m_iter. Indicates that in every M-th iteration, a singular iteration will be performed. Default is 10.

m_iter

Number of SingBoost iterations. Default is 100.

kap

Learning rate (step size). Must be a real number in ]0,1]. Default is 0.1. It is recommended to use a value smaller than 0.5.

singfamily

A Boosting family corresponding to the target loss function. See mboost for families corresponding to standard loss functions. The families for ranking losses provided in this package may also be used. Default is Gaussian(), for which SingBoost is just standard L_2-Boosting.

best

Needed in the case of localized ranking. The parameter K of the localized ranking loss will be computed as best \cdot n (rounded up to the next integer). Warning: If a parameter K is passed to the LocRank family, it will be ignored when executing SingBoost (see the sketch below the argument list).

LS

If a singfamily object that is already provided by mboost is used and LS is set to TRUE, the respective Boosting algorithm will be performed in the singular iterations. Default is FALSE.
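
The interplay of singfamily, best and LS for a localized ranking loss might look as in the following hedged sketch. It assumes the LocRank(K) family provided by this package and reuses the Diris data frame constructed in the Examples section; the value of K is illustrative only.

n <- nrow(Diris)                                   # Diris as built in the Examples section
K <- 10                                            # illustrative choice of the top-K part
singboost(Diris, singfamily = LocRank(K), best = K/n, M = 2, LS = TRUE)
# Note: the K passed to LocRank() is ignored by singboost; best * n governs the localized part.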

Details

Gradient Boosting algorithms require convexity and differentiability of the underlying loss function. SingBoost is a Boosting algorithm based on L_2-Boosting that allows for complicated loss functions that do not need to satisfy these requirements. SingBoost alternates between standard L_2-Boosting iterations and singular iterations in which essentially an empirical gradient step is executed: the baselearner that performs best, evaluated in the complicated loss, is selected in the respective iteration. The implementation is based on glmboost from the package mboost; using the L_2-loss in the singular iterations returns exactly the same coefficients as L_2-Boosting.
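
The alternating scheme can be illustrated by the following simplified componentwise-linear sketch. It is not the package implementation (which builds on glmboost); the function name singboost_sketch and the mean-squared-error placeholder for the target loss tloss are invented for illustration only.

singboost_sketch <- function(X, y, m_iter = 100, M = 10, kap = 0.1,
                             tloss = function(y, f) mean((y - f)^2)) {
  p <- ncol(X)
  f <- rep(mean(y), length(y))                     # start from the intercept-only fit
  beta <- numeric(p)
  for (m in seq_len(m_iter)) {
    r <- y - f                                     # residuals = negative L2 gradient
    fits <- sapply(seq_len(p), function(j) {
      b <- sum(X[, j] * r) / sum(X[, j]^2)         # componentwise least-squares fit
      crit <- if (m %% M == 0) {
        tloss(y, f + kap * b * X[, j])             # singular iteration: evaluate the target loss
      } else {
        mean((r - b * X[, j])^2)                   # standard iteration: L2 criterion
      }
      c(b = b, crit = crit)
    })
    j <- which.min(fits["crit", ])                 # best-performing baselearner in this iteration
    beta[j] <- beta[j] + kap * fits["b", j]
    f <- f + kap * fits["b", j] * X[, j]
  }
  c(intercept = mean(y), beta)
}

For instance, singboost_sketch(scale(iris[, 2:4]), iris$Sepal.Length) mimics a plain L_2-Boosting fit on three standardized predictors, while replacing tloss with a ranking loss mimics the singular iterations.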

Value

Selected variables

Names of the selected variables.

Coefficients

The selected coefficients as a (p+1)-dimensional vector (i.e., including the zeroes).

Freqs

Selection frequencies of the variables and a matrix containing the intercept and coefficient paths.

VarCoef

Vector of the non-zero coefficients.
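
A hedged sketch for inspecting the returned components; the element names below are assumed to match the slots listed above, so it is safer to check str() on an actual fit.

res <- singboost(Diris)      # Diris as constructed in the Examples section
str(res)                     # inspect the actual component names and structure
res$Coefficients             # (p+1)-dimensional coefficient vector (name assumed)
res$VarCoef                  # non-zero coefficients only (name assumed)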

References

Werner, T., Gradient-Free Gradient Boosting, PhD Thesis, Carl von Ossietzky University Oldenburg, 2020

P. Bühlmann and B. Yu. Boosting with the L_2 loss: Regression and classification. Journal of the American Statistical Association, 98(462):324–339, 2003

T. Hothorn, P. Bühlmann, T. Kneib, M. Schmid, and B. Hofner. mboost: Model-Based Boosting, 2017

Examples

{glmres <- glmboost(Sepal.Length ~ ., iris)              # plain L2-Boosting via mboost
glmres
attributes(varimp(glmres))$self
attributes(varimp(glmres))$var
firis <- as.formula(Sepal.Length ~ .)
Xiris <- model.matrix(firis, iris)
Diris <- data.frame(Xiris[, -1], iris$Sepal.Length)      # drop the intercept column, append the response
colnames(Diris)[6] <- "Y"
coef(glmboost(Xiris, iris$Sepal.Length))
singboost(Diris)                                         # reproduces the L2-Boosting coefficients
singboost(Diris, LS = TRUE)}
{glmres2 <- glmboost(Sepal.Length ~ Petal.Length + Sepal.Width:Species, iris)
finter <- as.formula(Sepal.Length ~ Petal.Length + Sepal.Width:Species - 1)
Xinter <- model.matrix(finter, iris)
Dinter <- data.frame(Xinter, iris$Sepal.Length)          # model matrix with interaction terms
singboost(Dinter)
coef(glmres2)}
{glmres3 <- glmboost(Xiris, iris$Sepal.Length, control = boost_control(mstop = 250, nu = 0.05))
coef(glmres3)
attributes(varimp(glmres3))$self
singboost(Diris, m_iter = 250, kap = 0.05)               # match mstop and learning rate
singboost(Diris, LS = TRUE, m_iter = 250, kap = 0.05)}
{glmquant <- glmboost(Sepal.Length ~ ., iris, family = QuantReg(tau = 0.75))
coef(glmquant)
attributes(varimp(glmquant))$self
singboost(Diris, singfamily = QuantReg(tau = 0.75), LS = TRUE)      # quantile loss in the singular iterations
singboost(Diris, singfamily = QuantReg(tau = 0.75), LS = TRUE, M = 2)}
{singboost(Diris, singfamily = Rank(), LS = TRUE)        # hard ranking loss in the singular iterations
singboost(Diris, singfamily = Rank(), LS = TRUE, M = 2)}
