error_632: .632(+) Estimator for log loss error rate

View source: R/error.R


Description

The .632 estimator for the log loss error rate is calculated for a given classifier. The .632+ estimator is an extension that reduces the bias of the .632 estimator under heavy overfitting, and is computed by default.
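The combination rule behind both estimators can be sketched in a few lines of plain R. This is a minimal arithmetic illustration following Efron and Tibshirani (1997), not splendid's implementation; app_err, boot_err, and gamma are hypothetical inputs standing in for the apparent (training) error, the average out-of-bag bootstrap error, and the no-information error rate.

```r
# Sketch of the .632(+) combination (illustrative only, not splendid's code).
# app_err:  apparent error (classifier evaluated on its own training data)
# boot_err: average out-of-bag bootstrap error
# gamma:    no-information error rate, needed only for the .632+ variant
err_632 <- function(app_err, boot_err, plus = FALSE, gamma = NULL) {
  if (!plus) {
    # Plain .632: fixed-weight blend of apparent and out-of-bag error
    return(0.368 * app_err + 0.632 * boot_err)
  }
  # .632+: the weight on boot_err grows with the relative overfitting
  # rate R, moving the estimate toward boot_err when overfitting is severe
  R <- (boot_err - app_err) / (gamma - app_err)
  w <- 0.632 / (1 - 0.368 * R)
  (1 - w) * app_err + w * boot_err
}

err_632(app_err = 0.10, boot_err = 0.30)                           # plain .632
err_632(app_err = 0.10, boot_err = 0.30, plus = TRUE, gamma = 0.5) # .632+
```

With these hypothetical error rates, the .632+ estimate is larger than the plain .632 estimate because the relative overfitting rate R is positive, so more weight shifts to the out-of-bag error.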

Usage

error_632(data, class, algorithm, pred, test.id, train.id, plus = TRUE)

Arguments

data

data frame with rows as samples, columns as features

class

true/reference class vector used for supervised learning

algorithm

character string for classifier. See splendid for possible options.

pred

vector of out-of-bag (OOB) predictions from the same classifier as algorithm.

test.id

vector of test set indices for each bootstrap replicate

train.id

vector of training set indices for each bootstrap replicate

plus

logical; if TRUE (default), the .632+ estimator is calculated. Otherwise, the .632 estimator is calculated.

Details

This function is intended to be used internally by splendid_model.

Value

the .632(+) log loss error rate

Author(s)

Derek Chiu

References

Friedman, J., Hastie, T., and Tibshirani, R. (2001). The Elements of Statistical Learning. New York: Springer Series in Statistics.

Efron, B. and Tibshirani, R. (1997). "Improvements on Cross-Validation: The .632+ Bootstrap Method," Journal of the American Statistical Association, 92(438), 548-560.

Examples

## Not run: 
data(hgsc)
class <- as.factor(attr(hgsc, "class.true"))
set.seed(1)
train.id <- boot_train(data = hgsc, class = class, n = 5)
test.id <- boot_test(train.id = train.id)
mod <- purrr::map(train.id, ~ classification(hgsc[., ], class[.], "xgboost"))
pred <- purrr::pmap(list(mod = mod, test.id = test.id, train.id = train.id),
                    prediction, data = hgsc, class = class)
error_632(hgsc, class, "xgboost", pred, test.id, train.id, plus = FALSE)
error_632(hgsc, class, "xgboost", pred, test.id, train.id, plus = TRUE)

## End(Not run)

AlineTalhouk/splendid documentation built on Feb. 23, 2024, 9:37 p.m.