Executes CMB and the loss-based Stability Selection.
Dtrain |
Data matrix. Has to be an n \times (p+1)-dimensional data frame in the format (X,Y). The X-part must not contain an intercept column containing only ones since this column will be added automatically. |
nsing |
Number of observations (rows) used for the SingBoost submodels. |
Bsing |
Number of subsamples based on which the SingBoost models are validated. Default is 1. Not to be confused with parameter |
B |
Number of subsamples based on which the CMB models are validated. Default is 100. Not to be confused with |
alpha |
Optional real number in ]0,1]. Defines the fraction of best SingBoost models used in the aggregation step. Default is 1 (use all models). |
singfam |
A SingBoost family. The SingBoost models are trained based on the corresponding loss function. Default is |
evalfam |
A SingBoost family. The SingBoost models are validated according to the corresponding loss function. Default is |
sing |
If |
M |
An integer between 2 and |
m_iter |
Number of SingBoost iterations. Default is 100. |
kap |
Learning rate (step size). Must be a real number in ]0,1]. Default is 0.1. It is recommended to use a value smaller than 0.5. |
LS |
If a |
best |
Needed in the case of localized ranking. The parameter |
wagg |
Type of row weight aggregation. |
gridtype |
Choose between |
grid |
The grid for the thresholds (in ]0,1]) or the numbers of final variables (positive integers). |
Dvalid |
Validation data for selecting the optimal element of the grid and with it the best corresponding model. |
ncmb |
Number of samples used for |
robagg |
Optional. If setting |
lower |
Optional argument. Only reasonable when setting |
singcoef |
Default is |
Mfinal |
Optional. Necessary if |
... |
Optional further arguments |
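To illustrate what the `alpha` argument above describes, here is a minimal sketch of keeping only the best fraction of submodels by validation loss. The helper `select_best_models` and the loss values are hypothetical and not part of gfboost; the ranking and rounding rule (`ceiling`) are assumptions, so the package's actual aggregation may differ.

```r
# Hypothetical helper, not part of gfboost: keep the best fraction `alpha`
# of submodels, ranked by validation loss (lower is better).
select_best_models <- function(val_loss, alpha = 1) {
  stopifnot(alpha > 0, alpha <= 1)
  n_keep <- ceiling(alpha * length(val_loss))  # assumed rounding rule
  order(val_loss)[seq_len(n_keep)]             # indices of the kept models
}

val_loss <- c(0.42, 0.35, 0.51, 0.38)  # made-up validation losses
select_best_models(val_loss, alpha = 0.5)
select_best_models(val_loss, alpha = 1)   # alpha = 1 keeps every model
```

With `alpha = 1` (the default) no model is discarded, which matches the "use all models" behaviour stated for that argument.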
See CMB and CMB.Stabsel.
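The `grid` argument holds either thresholds in ]0,1] or numbers of final variables. The sketch below shows how each kind of grid element could turn a vector of selection frequencies into a set of columns. The helpers `select_by_threshold` and `select_top_q`, the frequencies, and the "greater than or equal" comparison are assumptions for illustration only, not gfboost internals.

```r
# Hypothetical helpers, not part of gfboost.
# freq: aggregated selection frequencies, one entry per column.
select_by_threshold <- function(freq, threshold) {
  which(freq >= threshold)                  # assumed comparison rule
}
select_top_q <- function(freq, q) {
  head(order(freq, decreasing = TRUE), q)   # the q most frequently selected columns
}

freq <- c(0.95, 0.10, 0.85, 0.40, 0.70)     # made-up frequencies

# A grid of several thresholds; each element yields one candidate column set.
threshold_grid <- seq(0.7, 0.9, by = 0.1)
lapply(threshold_grid, function(t) select_by_threshold(freq, t))

# A grid element given as a number of final variables.
select_top_q(freq, 2)

# The third positional argument of seq() is `by`, so this is a one-point grid.
seq(0.8, 0.9, 1)
```

Because `seq(0.8, 0.9, 1)` steps by 1, it evaluates to the single value 0.8; a grid with several thresholds needs a smaller step, as in `threshold_grid`.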
Final coefficients |
The coefficients corresponding to the optimal stable model as a vector. |
Stable column measure |
Aggregated empirical column measure (i.e., selection frequencies) as a vector. |
Selected columns |
The column numbers of the variables that form the best stable model as a vector. |
Used row measure |
Aggregated empirical row measure (i.e., row weights) as a vector. |
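Assuming the returned list carries the component names listed above, shortened accessors such as `$Fin` or `$Stab` work through R's partial matching of `$` on lists. The mock list `res` below is made up for illustration and is not output from `CMB3S`.

```r
# Mock result with the component names listed above; the values are made up.
res <- list(
  "Final coefficients"    = c(1.2, 0, 0.5),
  "Stable column measure" = c(0.9, 0.1, 0.8),
  "Selected columns"      = c(1, 3),
  "Used row measure"      = c(0.3, 0.7)
)

res$Fin                      # `$` partially matches the unique prefix "Fin"
res[["Final coefficients"]]  # exact name; more robust in scripts
res[["Fin"]]                 # `[[` matches exactly by default, so this is NULL
```

Partial matching only works while the prefix is unique among the names, so the full name with `[[ ]]` is the safer choice in non-interactive code.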
T. Werner. Gradient-Free Gradient Boosting. PhD thesis, Carl von Ossietzky University Oldenburg, 2020.
T. Hothorn, P. Bühlmann, T. Kneib, M. Schmid, and B. Hofner. mboost: Model-Based Boosting, 2017.
B. Hofner and T. Hothorn. stabs: Stability Selection with Error Control, 2017.
B. Hofner, L. Boccuto, and M. Göker. Controlling false discoveries in high-dimensional situations: Boosting with stability selection. BMC Bioinformatics, 16(1):144, 2015.
N. Meinshausen and P. Bühlmann. Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4):417–473, 2010.
firis<-as.formula(Sepal.Length~.)
Xiris<-model.matrix(firis,iris)
Diris<-data.frame(Xiris[,-1],iris$Sepal.Length)
colnames(Diris)[6]<-"Y"
set.seed(19931023)
ind<-sample(1:150,120,replace=FALSE)
Dtrain<-Diris[ind,]
Dvalid<-Diris[-ind,]
set.seed(19931023)
cmb3s<-CMB3S(Dtrain,nsing=120,Dvalid=Dvalid,ncmb=120,Bsing=1,B=1,alpha=1,singfam=Gaussian()
,evalfam=Gaussian(),sing=FALSE,M=10,m_iter=100,kap=0.1,LS=FALSE,wagg='weights1',
gridtype='pigrid',grid=seq(0.8,0.9,1),robagg=FALSE,lower=0,singcoef=TRUE,Mfinal=10)
cmb3s$Fin
cmb3s$Stab
cmb3s$Sel
glmres4<-glmboost(Sepal.Length~.,iris[ind,])
coef(glmres4)
set.seed(19931023)
cmb3s1<-CMB3S(Dtrain,nsing=80,Dvalid=Dvalid,ncmb=100,Bsing=10,B=100,alpha=0.5,singfam=Gaussian(),
evalfam=Gaussian(),sing=FALSE,M=10,m_iter=100,kap=0.1,LS=FALSE,wagg='weights1',gridtype='pigrid',
grid=seq(0.8,0.9,1),robagg=FALSE,lower=0,singcoef=TRUE,Mfinal=10)
cmb3s1$Fin
cmb3s1$Stab
## This may take around a minute
set.seed(19931023)
cmb3s2<-CMB3S(Dtrain,nsing=80,Dvalid=Dvalid,ncmb=100,Bsing=10,B=100,alpha=0.5,singfam=Rank(),
evalfam=Rank(),sing=TRUE,M=10,m_iter=100,kap=0.1,LS=TRUE,wagg='weights2',gridtype='pigrid',
grid=seq(0.8,0.9,1),robagg=FALSE,lower=0,singcoef=TRUE,Mfinal=10)
cmb3s2$Fin
cmb3s2$Stab
set.seed(19931023)
cmb3s3<-CMB3S(Dtrain,nsing=80,Dvalid=Dvalid,ncmb=100,Bsing=10,B=100,alpha=0.5,singfam=Huber(),
evalfam=Huber(),sing=FALSE,M=10,m_iter=100,kap=0.1,LS=FALSE,wagg='weights2',gridtype='pigrid',
grid=seq(0.8,0.9,1),robagg=FALSE,lower=0,singcoef=FALSE,Mfinal=10)
cmb3s3$Fin
cmb3s3$Stab