bbc: Bootstrap bias correction for the performance of the...
In MXM: Feature Selection (Including Multiple Solutions) and Bayesian Networks

Bootstrap bias correction for the performance of the cross-validation procedure

R Documentation

Bootstrap bias correction for the performance of the cross-validation procedure

Description

Bootstrap bias correction for the performance of the cross-validation procedure.

Usage

bbc(predictions, target, metric = "auc.mxm", conf = 0.95, B = 1000)

Arguments

`predictions`	A matrix with the predictived values.
`target`	A vector with the target variable, survival object, factor (ordered or unordered) or a numerical vector.
`metric`	The possible values are: a) Binary target: "auc.mxm" (area under the curve), "fscore.mxm" (F-score), "prec.mxm" (precision), "euclid_sens.spec.mxm" (Euclidean distance of sensitivity and specificity), "spec.mxm" (specificity), "sens.mxm" (sensitivity), "acc.mxm" (accuracy, proportion of correct classification). b) Multinomial target: "acc_multinom.mxm" (accuracy, proportion of correct classification). c) Ordinal target: "ord_mae.mxm" (mean absolute error). d) Continuous target: "mae.mxm" (MAE with continuous target), "mse.mxm" (mean squared error), "pve.mxm" (percentage of variance explained). e) Survival target "ci.mxm" (concordance index for Cox regression), "ciwr.mxm" (concordance index for Weibull regression). g) Count target "poisdev.mxm". h) Binomial target "binomdev.mxm" (deviance of binomial regression). The "nbdev.mxm" (negative binomial deviance) is missing. For more information on these see `cv.ses`. boldNote that they come with "".
`conf`	A number between 0 and 1, the confidence level.
`B`	The number of bootstrap replicates. The default number is 1000.

Details

Upon completion of the cross-validation, the predicted values produced by all predictive models across all folds is collected in a matrix P of dimensions n \times M, where n is the number of samples and M the number of trained models or configurations. Sampled with replacement a fraction of rows (predictions) from P are denoted as the in-sample values. On average, the newly created set will be comprised by 63.2% of the original individuals (The probability of sampling, with replacement, a sample of n numbers from a set of n numbers is 1-≤ft(1-\frac{1}{n} \right)^n \simeq 1-\frac{1}{e}=0.632), whereas the rest 36.8% will be random copies of them. The non re-sampled rows are denoted as out-of-sample values. The performance of each model in the in-sample rows is calculated and the model (or configuration) with the optimal performance is selected, followed by the calculation of performance in the out-of-sample values. This process is repeated B times and the average performance is returned.

Note, that the only computational overhead is with the repetitive re-sampling and calculation of the predictive performance, i.e. no model is fitted nor trained. The final estimated performance usually underestimates the true performance, but this negative bias is smaller than the optimistic uncorrected performance.

Note, that all metrics are for maximization. For this reason "mse.mxm", "mae.mxm", "ord_mae.mxm", "poisdev.mxm", "binomdev.mxm" are multiplied by -1.

Value

A list including:

`out.perf`	The B out sampled performances. Their mean is the "bbc.perf" given above.
`bbc.perf`	The bootstrap bias corrected performance of the chosen algorithm, model or configuration.
`ci`	The (1- conf)% confidence interval of the BBC performance. It is based on the empirical or percentile method for bootstrap samples. The lower and upper 2.5% of the "out.perf".

Author(s)

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Ioannis Tsamardinos, Elissavet Greasidou and Giorgos Borboudakis (2018). Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation. Machine Learning (To appear).

https://link.springer.com/article/10.1007/s10994-018-5714-4

Examples

predictions <- matrix(rbinom(200 * 100, 1, 0.7), ncol = 100) 
target <- rbinom(200, 1, 0.5)
bbc(predictions, target, metric = "auc.mxm")

MXM documentation built on Aug. 25, 2022, 9:05 a.m.

MXM index

Package overview Tutorial: Feature selection with the MMPC algorithm' Tutorial: Feature selection with the SES algorithm"

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

MXM
Feature Selection (Including Multiple Solutions) and Bayesian Networks

bbc: Bootstrap bias correction for the performance of the...
In MXM: Feature Selection (Including Multiple Solutions) and Bayesian Networks

Bootstrap bias correction for the performance of the cross-validation procedure

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Related to bbc in MXM...

R Package Documentation

Browse R Packages

We want your feedback!

MXM Feature Selection (Including Multiple Solutions) and Bayesian Networks

bbc: Bootstrap bias correction for the performance of the... In MXM: Feature Selection (Including Multiple Solutions) and Bayesian Networks

Bootstrap bias correction for the performance of the cross-validation procedure

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Related to bbc in MXM...

R Package Documentation

Browse R Packages

We want your feedback!

MXM
Feature Selection (Including Multiple Solutions) and Bayesian Networks

bbc: Bootstrap bias correction for the performance of the...
In MXM: Feature Selection (Including Multiple Solutions) and Bayesian Networks