Cross-validates the whole loss-based Stability Selection by aggregating several stable models according to their performance on validation sets. Also computes a cross-validated test loss on a disjoint test set.
CV.CMB3S(
  D,
  nsing,
  Bsing = 1,
  B = 100,
  alpha = 1,
  singfam = Gaussian(),
  evalfam = Gaussian(),
  sing = FALSE,
  M = 10,
  m_iter = 100,
  kap = 0.1,
  LS = FALSE,
  best = 1,
  wagg,
  gridtype,
  grid,
  ncmb,
  CVind,
  targetfam = Gaussian(),
  print = TRUE,
  robagg = FALSE,
  lower = 0,
  singcoef = FALSE,
  Mfinal = 10,
  ...
)
D |
Data matrix. Has to be an n \times (p+1)-dimensional data frame in the format (X,Y). The X-part must not contain an intercept column containing only ones since this column will be added automatically. |
nsing |
Number of observations (rows) used for the SingBoost submodels. |
Bsing |
Number of subsamples based on which the SingBoost models are validated. Default is 1. Not to be confused with parameter B. |
B |
Number of subsamples based on which the CMB models are validated. Default is 100. Not to be confused with Bsing. |
alpha |
Optional real number in ]0,1]. Defines the fraction of best SingBoost models used in the aggregation step. Default is 1 (use all models). |
singfam |
A SingBoost family. The SingBoost models are trained based on the corresponding loss function. Default is Gaussian(). |
evalfam |
A SingBoost family. The SingBoost models are validated according to the corresponding loss function. Default is Gaussian(). |
sing |
If |
M |
An integer between 2 and |
m_iter |
Number of SingBoost iterations. Default is 100. |
kap |
Learning rate (step size). Must be a real number in ]0,1]. Default is 0.1. It is recommended to use a value smaller than 0.5. |
LS |
If a |
best |
Needed in the case of localized ranking. The parameter |
wagg |
Type of row weight aggregation. |
gridtype |
Choose between |
grid |
The grid for the thresholds (in ]0,1]) or the numbers of final variables (positive integers). |
ncmb |
Number of samples used for |
CVind |
A list where each element contains a vector of length n (number of samples in the data matrix
targetfam |
Target loss. Should be the same family as |
print |
If set to |
robagg |
Optional. If setting |
lower |
Optional argument. Only reasonable when setting |
singcoef |
Default is FALSE. |
Mfinal |
Optional. Necessary if |
... |
Optional further arguments |
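The layout required for D can be illustrated with simulated data. The following is a minimal base-R sketch and is not taken from the package; the names n, p, X and Y and the simulated linear model are assumptions made only for illustration.

```r
set.seed(1)
n <- 50   # number of observations (assumed for illustration)
p <- 5    # number of predictors (assumed for illustration)

# Predictor matrix without an intercept column
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
colnames(X) <- paste0("X", seq_len(p))

# Response from a sparse linear model (purely illustrative)
Y <- 2 * X[, 1] - X[, 3] + rnorm(n)

# D in the format (X, Y): predictors first, response in the last column
D <- data.frame(X, Y = Y)

# Sanity checks: n x (p + 1) and no predictor column consisting only of ones
stopifnot(nrow(D) == n, ncol(D) == p + 1)
has_ones_col <- any(vapply(D[, seq_len(p), drop = FALSE],
                           function(col) all(col == 1), logical(1)))
stopifnot(!has_ones_col)
```

A data frame built this way has the response in column p + 1, so it matches the (X,Y) format described for D.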
In CMB3S, a validation set is given based on which the optimal stable model is chosen. The CV.CMB3S
function adds an outer cross-validation step such that the training and validation data sets (and
optionally the test data sets) are chosen randomly by dividing the initial data set into disjoint parts.
The aggregated stable models form an "ultra-stable" model. It is strongly recommended to run this function
in a parallelized manner because of the long computation time.
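The idea of randomly dividing the rows into disjoint training, validation and test parts can be sketched in base R. This is only an illustration of the partitioning principle: the helper make_split, the label strings and the split fractions are hypothetical, and the encoding that CV.CMB3S actually expects in CVind may differ.

```r
set.seed(2)
n <- 100        # number of rows of D (assumed for illustration)
n_splits <- 3   # number of outer cross-validation splits (assumed)

# Hypothetical helper: returns one label per row, so the three sets are disjoint
make_split <- function(n, frac_train = 0.5, frac_val = 0.25) {
  n_train <- floor(frac_train * n)
  n_val <- floor(frac_val * n)
  labels <- c(rep("train", n_train),
              rep("validation", n_val),
              rep("test", n - n_train - n_val))
  sample(labels)  # random permutation assigns each row to exactly one set
}

# One label vector of length n per outer split
splits <- lapply(seq_len(n_splits), function(i) make_split(n))

# Set sizes of the first split
table(splits[[1]])
```

Because every row receives exactly one label, the three parts of each split cannot overlap, which is the property the outer cross-validation step relies on.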
Cross-validated loss |
A vector containing the cross-validated test losses. |
Ultra-stable column measure |
A vector containing the aggregated selection frequencies of the stable models. |
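To give an intuition for aggregated selection frequencies, the sketch below averages selection indicators over several stable models. It assumes equal weights for all models and made-up selection results; it does not reproduce the weighting scheme controlled by wagg or any other internals of the package.

```r
# Rows: stable models from three hypothetical cross-validation splits
# Columns: predictors; TRUE means the predictor was selected
selected <- rbind(
  c(TRUE,  FALSE, TRUE,  FALSE),
  c(TRUE,  FALSE, FALSE, FALSE),
  c(TRUE,  TRUE,  TRUE,  FALSE)
)
colnames(selected) <- paste0("X", 1:4)

# Equal-weight aggregation: fraction of stable models containing each predictor
freq <- colMeans(selected)
freq
```

A predictor selected by every stable model gets frequency 1, and one that is never selected gets frequency 0.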
Werner, T., Gradient-Free Gradient Boosting, PhD Thesis, Carl von Ossietzky University Oldenburg, 2020