thresholds: Collection of functions to estimate decision thresholds for...
In DiagnosisMed: Diagnostic test accuracy evaluation for health professionals


This collection is intended as a set of intermediate functions, not to be used directly by the end user. One may wish to use TGROC instead, as it calls thresholds and other functions at once for a more complete analysis, including a flexible plot function.

Using a variety of methods, this collection may either dichotomize or trichotomize a diagnostic test scale, finding the thresholds to classify subjects with and without a condition. Along with each threshold, these functions return the sensitivity, specificity and positive likelihood ratio (with their respective confidence intervals) that would be obtained if the test were dichotomized at that threshold. The thresholds function is a wrapper for all the methods at once. The methods currently available are (see details for the formulas):

Se.equals.Sp The threshold at which sensitivity equals specificity.
Max.Accuracy The threshold that maximizes the accuracy.
Max.DOR The threshold that maximizes the diagnostic odds ratio.
Min.Error The threshold that minimizes the error rate.
Max.Accuracy.area The threshold that maximizes the accuracy area.
Max.Youden The threshold that maximizes the Youden J index.
Min.ROC.dist The threshold that minimizes the distance between the ROC curve and the upper left corner of the graph.
Max.Efficiency The threshold that maximizes the efficiency.
Min.MCT The threshold that minimizes the misclassification cost term.
thresholds Wrapper for all the above methods.
inc.limits Trichotomizes the test according to a minimum required sensitivity and specificity.

Se.equals.Sp(x)

Max.Accuracy(x)

Max.DOR(x)

Min.Error(x)

Max.Accuracy.area(x)

Max.Youden(x)

Min.ROCdist(x)

Max.Efficiency(x, pop.prevalence = NULL)

Min.MCT(x, pop.prevalence = NULL, Cost = 1)

thresholds(x, pop.prevalence = NULL, Cost = 1)

inc.limits(x, Inconclusive = 0.95)

`x`	is the output from any of the `SS`, `BN.SS` or `NN.SS` functions.
`pop.prevalence`	The disease prevalence supplied by the user. If not supplied, the sample prevalence is used. It is passed to `Max.Efficiency` and `Min.MCT`, and is particularly useful if the test will be applied to a population with a different prevalence of the condition.
`Cost`	Cost = cost(FN)/cost(FP). Used in the `Min.MCT` (minimizing the misclassification cost term) function. It can take any value from 0 to infinity. It may represent a financial cost or a health outcome, reflecting the perception that FN are more undesirable than FP (or the other way around). Cost = 1 means FN and FP have equal cost. Cost = 0.9 means FP are about 11 percent more costly. Cost = 0.769 means FP are 30 percent more costly. Cost = 0.555 means FP are 80 percent more costly. Cost = 0.3 means FP are about 3.3 times as costly. Cost = 0.2 means FP are 5 times as costly. It can also be entered more conveniently as a ratio such as 1/2.5 or 1/4.
`Inconclusive`	A value from 0 to 1 that identifies the range of test values where the performance of the test is not acceptable and is thus considered inconclusive. It represents the researcher's tolerance for how good the test should be. If it is set to 0.95 (the default), test results with less than 0.95 sensitivity and specificity fall in the inconclusive range. Also known as the minimum required sensitivity and specificity.
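
To make the Cost ratio more concrete, the following is a minimal base R sketch, not part of DiagnosisMed. The helper name `fp_relative_cost` is hypothetical. Since Cost = cost(FN)/cost(FP), the relative cost of a false positive is simply 1/Cost.

```r
# Hypothetical helper, not part of DiagnosisMed.
# Cost = cost(FN)/cost(FP), so cost(FP)/cost(FN) = 1/Cost.
fp_relative_cost <- function(Cost) {
  # Cost = 0 is excluded here only because 1/0 is not informative.
  stopifnot(is.numeric(Cost), all(Cost > 0))
  1 / Cost
}

# Values mentioned in the argument description, plus two plain ratios.
costs <- c(1, 0.9, 0.769, 0.555, 0.3, 0.2, 1/2.5, 1/4)
data.frame(Cost = costs, FP.vs.FN = round(fp_relative_cost(costs), 2))
```

Entering the argument as a ratio such as `Cost = 1/4` reads directly as "FP are 4 times as costly as FN", which avoids the rounding involved in decimals like 0.769.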

Occasionally the dichotomizing methods may find ties, i.e. more than one threshold matching the criterion. This may occur particularly with small sample sizes or with the non-parametric analysis. In that case, the functions will return a warning and automatically pick the median value, or the value closest to the median.
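
The tie rule described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper `pick_tied_threshold` is hypothetical, and the package's internal implementation (including how it breaks a tie between two values equally close to the median) may differ.

```r
# Hypothetical helper illustrating the tie rule; not part of DiagnosisMed.
pick_tied_threshold <- function(candidates) {
  candidates <- sort(unique(candidates))
  if (length(candidates) > 1) {
    warning("More than one threshold matched the criterion; picking the median.")
  }
  med <- median(candidates)
  # which.min() returns the first minimum, so with an even number of
  # candidates this sketch picks the lower of the two middle values.
  # That choice is an assumption of the sketch, not documented behaviour.
  candidates[which.min(abs(candidates - med))]
}

# Odd number of tied thresholds: the median itself is a candidate.
suppressWarnings(pick_tied_threshold(c(10, 12, 15)))
# Even number: the median (13.5) is not a candidate, so the closest one is used.
suppressWarnings(pick_tied_threshold(c(10, 12, 15, 20)))
```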

A similar phenomenon may occur with the inc.limits function. In that case, the function will pick the lowest threshold for specificity and the highest threshold for sensitivity. Additionally, one may notice that the sensitivity and specificity in the output frequently do not exactly match the Inconclusive argument, e.g. 0.90. This depends on the data, which may show small or large jumps in sensitivity and specificity from one threshold to the next. This phenomenon seems to be more frequent with small sample sizes and with the non-parametric analysis. In that case, inc.limits will always pick the threshold with the sensitivity and specificity nearest to the Inconclusive argument.

The formulas for the methods used are:

Max.Accuracy (TN+TP)/sample size
Max.DOR (TP*TN)/(FN*FP)
Min.Error (FN+FP)/sample size
Max.Accuracy.area (TP*TN)/((TP+FN)*(FP+TN))
Max.Youden Se+Sp-1
Min.ROC.dist (Sp-1)^2+(1-Se)^2
Max.Efficiency Se*prevalence+(1-prevalence)*Sp
Min.MCT (1-prevalence)*(1-Sp)+Cost*prevalence*(1-Se)
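
As a worked illustration of the formulas above, the following base R sketch evaluates each criterion for a single 2x2 table of counts. The function `criteria` and its argument names are hypothetical and not part of DiagnosisMed; the package functions instead search over all candidate thresholds in the output of `SS`, `BN.SS` or `NN.SS`.

```r
# Hypothetical helper, not part of DiagnosisMed: evaluates the criteria
# listed above for one 2x2 table (i.e. one candidate threshold).
criteria <- function(TP, FP, FN, TN, prevalence = NULL, Cost = 1) {
  n  <- TP + FP + FN + TN
  Se <- TP / (TP + FN)   # sensitivity
  Sp <- TN / (TN + FP)   # specificity
  # Mirror pop.prevalence: fall back to the sample prevalence if not given.
  if (is.null(prevalence)) prevalence <- (TP + FN) / n
  c(Accuracy      = (TN + TP) / n,
    DOR           = (TP * TN) / (FN * FP),
    Error         = (FN + FP) / n,
    Accuracy.area = (TP * TN) / ((TP + FN) * (FP + TN)),
    Youden        = Se + Sp - 1,
    ROC.dist      = (Sp - 1)^2 + (1 - Se)^2,
    Efficiency    = Se * prevalence + (1 - prevalence) * Sp,
    MCT           = (1 - prevalence) * (1 - Sp) + Cost * prevalence * (1 - Se))
}

# Made-up counts for illustration only.
round(criteria(TP = 80, FP = 10, FN = 20, TN = 90), 3)
```

Note that with the sample prevalence and Cost = 1, Efficiency reduces to Accuracy and MCT reduces to Error, so these criteria only select different thresholds when `pop.prevalence` or `Cost` departs from those defaults.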

A data.frame with the threshold (test.value), sensitivity, specificity and positive likelihood ratio (with their respective confidence intervals). The row names match the methods.

SS, TGROC

data(rocdata)

# Thresholds and inconclusive limits from non-parametric ROC analysis
x <- SS(ref = rocdata$Gold, test = rocdata$test2)
thresholds(x)
inc.limits(x)

# Thresholds and inconclusive limits from smoothed analysis with NN
y <- NN.SS(x)
thresholds(y)
inc.limits(y)

data("tutorial")

# Thresholds and inconclusive limits for a very accurate test.
x <- SS(ref = ifelse(tutorial$Gold == "pos", 1, 0), test = tutorial$Test_A)
# Notice that ties occur.
thresholds(x)
# Notice that the "Lower  inconclusive" test.value is higher than the "Upper inconclusive" one.
# This may occur if the test is very accurate, or if the "Inconclusive" argument is too low.
# In this case an inconclusive range may not be applicable.
inc.limits(x)
# For these data, however, increasing "Inconclusive" solves the issue.
inc.limits(x, Inconclusive = .99)

# When smoothing the analysis, the ties do not occur.
y <- NN.SS(x)
thresholds(y)
# Same issue with the parametric analysis.
# But it returns very different threshold values.
inc.limits(y, Inconclusive = .99)

rm(rocdata, tutorial, x, y)