MS2: Threshold selection method. Median Sweep


View source: R/MS2_method.r

Description

Quantifies events using a modified version of the Median Sweep (MS) method that considers only thresholds where the denominator (tpr - fpr) is greater than 0.25.
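
The idea described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the function name `ms2_sketch`, the column names `thr`, `tpr` and `fpr`, the `>=` comparison against each threshold, and the clipping of the result to [0, 1] are all assumptions made for the example.

```r
# Hypothetical sketch of a median sweep restricted to thresholds with
# tpr - fpr > 0.25. Names and details are assumptions, not the mlquantify code.
ms2_sketch <- function(test, TprFpr, min_diff = 0.25) {
  # Keep only thresholds whose denominator (tpr - fpr) exceeds min_diff
  keep <- TprFpr[(TprFpr$tpr - TprFpr$fpr) > min_diff, ]
  if (nrow(keep) == 0) stop("no threshold satisfies tpr - fpr > min_diff")

  # Adjusted estimate of the positive proportion at each remaining threshold
  est <- vapply(seq_len(nrow(keep)), function(i) {
    observed <- mean(test >= keep$thr[i])  # share of test scores at or above the threshold
    (observed - keep$fpr[i]) / (keep$tpr[i] - keep$fpr[i])
  }, numeric(1))

  # Median over thresholds, clipped to a valid proportion (assumed behaviour)
  pos <- min(max(median(est), 0), 1)
  c(pos = pos, neg = 1 - pos)
}

# Toy inputs, invented for illustration only
toy_rates <- data.frame(thr = c(0.2, 0.5, 0.8),
                        tpr = c(0.95, 0.80, 0.50),
                        fpr = c(0.60, 0.20, 0.40))
toy_scores <- c(0.9, 0.7, 0.6, 0.3, 0.1)
result <- ms2_sketch(toy_scores, toy_rates)
```

In this toy table the third threshold has tpr - fpr = 0.10, so it is dropped before the median is taken. Small denominators amplify estimation error in the adjusted estimate, which is the motivation for filtering them out.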

Usage

MS2(test, TprFpr)

Arguments

test

a numeric vector containing the positive-class score estimated for each test set instance.

TprFpr

a data.frame of true positive rates (tpr) and false positive rates (fpr) estimated on the training set, using the function getTPRandFPRbyThreshold().
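
To make the shape of such a table concrete, the sketch below computes tpr and fpr for a few thresholds from labelled scores. It is a hypothetical stand-in, not getTPRandFPRbyThreshold() itself: the helper name `tpr_fpr_sketch`, its arguments, the column names and the `>=` rule are assumptions.

```r
# Hypothetical helper: tpr and fpr at each threshold from labelled scores.
# This is NOT the package function; it only illustrates the kind of table involved.
tpr_fpr_sketch <- function(pos_scores, is_positive, thresholds) {
  rows <- lapply(thresholds, function(t) {
    predicted_pos <- pos_scores >= t  # instances classified as positive at threshold t
    data.frame(thr = t,
               tpr = sum(predicted_pos & is_positive) / sum(is_positive),
               fpr = sum(predicted_pos & !is_positive) / sum(!is_positive))
  })
  do.call(rbind, rows)
}

# Toy labelled scores, invented for illustration only
val_scores <- c(0.9, 0.8, 0.4, 0.6, 0.3, 0.1)
val_is_pos <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
tab <- tpr_fpr_sketch(val_scores, val_is_pos, thresholds = c(0.25, 0.5, 0.75))
```

Each row pairs one threshold with the rates a classifier would achieve if that threshold were used as the decision cut-off.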

Value

A numeric vector containing the class distribution estimated from the test set.

References

Forman, G. (2006, August). Quantifying trends accurately despite classifier error and class imbalance. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 157-166). <doi.org/10.1145/1150402.1150423>

Examples

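library(mlquantify)    # provides aeAegypti, getTPRandFPRbyThreshold() and MS2()
library(randomForest)
library(caret)

# Split the data into three folds: training, validation and test
cv <- createFolds(aeAegypti$class, 3)
tr <- aeAegypti[cv$Fold1,]
validation <- aeAegypti[cv$Fold2,]
ts <- aeAegypti[cv$Fold3,]

# -- Getting a sample from ts with 80 positive and 20 negative instances --
ts_sample <- rbind(ts[sample(which(ts$class==1),80),],
                   ts[sample(which(ts$class==2),20),])

# Train the scorer and score the validation fold (class probabilities plus true class)
scorer <- randomForest(class~., data=tr, ntree=500)
scores <- cbind(predict(scorer, validation, type = "prob"), validation$class)

# Estimate tpr and fpr for each threshold from the validation scores
TprFpr <- getTPRandFPRbyThreshold(scores)

# Score the test sample and estimate its class distribution
test.scores <- predict(scorer, ts_sample, type = "prob")
MS2(test = test.scores[,1], TprFpr = TprFpr)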

mlquantify documentation built on Jan. 20, 2022, 5:07 p.m.