SS: ROC analysis - Sensitivity and Specificity trade-off


View source: R/SS.R

Description

The SS collection is intended as a set of intermediate functions, not to be used directly by the end user. One may wish to use TGROC instead, as it calls SS, BN.SS, NN.SS and other functions at once for a more complete analysis, including a flexible plot function.

SS, BN.SS and NN.SS compute validity measures, with their respective confidence intervals, for each decision threshold of a diagnostic test measured on a continuous scale. They show the trade-off between Sensitivity and Specificity as the threshold changes progressively. SS does this non-parametrically and uses binom.CI to estimate the confidence intervals. NN.SS does it by fitting a feed-forward neural network with the AMORE package. The underlying idea is that this analysis is a robust way to smooth the Sensitivity and Specificity trade-off and represent the population, as neural networks may approximate any population distribution function. Running the neural network more than once may return slightly different values. This is expected, as the result depends on the fit of each run. BN.SS does the same thing but assumes that the test values from subjects with and without the condition each follow a Gaussian distribution (bi-normally distributed test values). Both BN.SS and NN.SS use a Gaussian confidence interval estimation.
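The bi-normal idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used by BN.SS: the simulated data, the variable names and the threshold grid are hypothetical, and results at or above a threshold are treated as positive.

```r
# Hypothetical illustration data: 120 subjects without (0) and 80 with (1) the condition
set.seed(42)
ref  <- rep(c(0, 1), times = c(120, 80))
test <- c(rnorm(120, mean = 0.6, sd = 0.2), rnorm(80, mean = 1.1, sd = 0.3))

# Threshold grid, analogous to seq(t.min, t.max, precision)
thresholds <- seq(0.005, 2, by = 0.005)

# Group means and standard deviations of the test values
m1 <- mean(test[ref == 1]); s1 <- sd(test[ref == 1])
m0 <- mean(test[ref == 0]); s0 <- sd(test[ref == 0])

# Under the bi-normal assumption:
#   Sensitivity = P(test >= threshold | condition present)
#   Specificity = P(test <  threshold | condition absent)
bn <- data.frame(
  test.values = thresholds,
  Sensitivity = pnorm(thresholds, mean = m1, sd = s1, lower.tail = FALSE),
  Specificity = pnorm(thresholds, mean = m0, sd = s0)
)
head(bn)
tail(bn)
```

As the threshold increases, the smoothed Sensitivity can only decrease and the smoothed Specificity can only increase, which is the trade-off the functions report.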

Usage

BN.SS(ref, test, CL = 0.95, t.max = NULL, t.min = NULL,
  precision = 0.01, pop.prevalence = NULL)

NN.SS(x, t.max = NULL, t.min = NULL, precision = 0.01, CL = 0.95,
  n.neurons = c(1, 5, 1), learning.rate.global = 0.01,
  momentum.global = 0.3, error.criterium = "LMS", Stao = NA,
  hidden.layer = "sigmoid", output.layer = "sigmoid",
  method = "ADAPTgdwm", report = FALSE, show.step = 5000, n.shows = 1)

SS(ref, test, reverse = "auto", CL = 0.95, binom.conf = "wilson",
  pop.prevalence = NULL)

Arguments

ref

The reference standard. A column in a data frame or a vector indicating the classification by the reference test. The reference standard must be coded either as 0 (absence of the condition) or 1 (presence of the condition).

test

The index test or test under evaluation. A column in a dataset or vector indicating the test results in a continuous scale.

CL

Confidence limit. The level used for the limits of the confidence intervals. Must be a number between 0 and 1. Default value is 0.95.

t.min, t.max, precision

Test minimum, maximum and step used to simulate the parametric estimation as seq(t.min, t.max, precision). The NN.SS and BN.SS functions need test values for which to simulate the Sensitivity and Specificity values. If left NULL, the function internally picks the observed test minimum and maximum and the default precision. However, these values do not need to be the same as those observed in the data. To give a smooth appearance and allow NN.SS to fit nicely, the sequence should have hundreds of test values to simulate, say at least 200. A good rationale for creating this sequence is to simulate all possible values of the test results. If this rationale yields fewer than 200 values, then perhaps every half value should do. At the other extreme, creating a sequence with too many values (e.g. 2000) may create unrealistic test values, increase computational time (or even exhaust memory) and may not increase the smoothness of the parametric analysis.

pop.prevalence

Population condition prevalence. If this value is not NULL, the sample prevalence is internally replaced by the population prevalence to estimate the confidence intervals and, for NN.SS and BN.SS, to estimate the TP, TN, FP, and FN fractions needed by other functions to estimate decision thresholds. So, use it wisely. It is particularly useful with data from a case-control design.

x

For the NN.SS function, x is the output of the SS function.

n.neurons

Numeric vector containing the number of neurons of each layer. See newff.

learning.rate.global

Learning rate at which every neuron is trained. See newff.

momentum.global

Momentum for every neuron. See newff.

error.criterium

Criterion used to measure the proximity of the neural network prediction to its target. See newff.

Stao

Stao parameter for the TAO error criterion. See newff.

hidden.layer

Activation function of the hidden layer neurons. See newff.

output.layer

Activation function of the output layer neurons. See newff.

method

Preferred training method. See newff.

report

Logical value indicating whether the training function should keep quiet. See train.

show.step

Number of epochs to train non-stop until the training function is allowed to report. See train.

n.shows

Number of times to report (if report is TRUE). See train.

reverse

"auto" (default), TRUE or FALSE are the acceptable values. ROC analysis assumes that higher values of the test are from subjects with the condition, and lower values are from subjects without the condition. If it occurs the other way around, the ROC analysis and its interpretation must be reversed. If "auto", SS internally checks whether the mean (or median) test values are higher among subjects without the condition. If this is the case, it returns a warning and sets reverse = TRUE. The reversion is done simply by multiplying all test values by -1, making all the computations and returning the absolute values.

binom.conf

Method of binomial confidence interval. "wilson" (default), "exact" and "approximate" are acceptable. See binom.CI.
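To illustrate what the default "wilson" option refers to, here is a minimal base R sketch of the Wilson score interval for a proportion such as Sensitivity. The function name wilson_ci and the counts are hypothetical, and this is not the code of binom.CI.

```r
# Wilson score interval for x successes out of n trials (illustrative helper)
wilson_ci <- function(x, n, CL = 0.95) {
  z <- qnorm(1 - (1 - CL) / 2)                 # two-sided normal quantile
  p <- x / n                                   # observed proportion
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half   <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(estimate = p, lower = centre - half, upper = centre + half)
}

# Example: 45 true positives among 50 subjects with the condition
wilson_ci(45, 50)

# The same limits can be obtained from stats::prop.test without continuity correction
prop.test(45, 50, correct = FALSE)$conf.int
```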

Details

Test results matching the cut-off value are considered a positive test. SS, like all ROC analyses, assumes that subjects with higher values of the test have the target condition, and those with lower values do not. Tests that behave like glucose (middle values are supposed to be normal and extreme values are supposed to be abnormal) will not be correctly analyzed; such data may be considered degenerate for ROC analysis. If, for a particular test, higher values are from subjects without the condition and lower values are from subjects with the condition, the analysis must be reversed. In this case, SS does it by multiplying the test results by -1 before analysis, and returning their absolute values in the output.
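The rules above (a result equal to the cut-off counts as positive, and the scale is flipped when lower values indicate the condition) can be sketched in base R as follows. The helper sens_spec and the toy data are hypothetical and only approximate what SS computes; in particular, the sketch negates the cut-offs back to the original scale, which equals the absolute value only for non-negative test values.

```r
# Non-parametric Sensitivity and Specificity at every observed cut-off (illustrative helper)
sens_spec <- function(ref, test, reverse = FALSE) {
  if (reverse) test <- -test                    # flip so that higher values indicate the condition
  cutoffs <- sort(unique(test))
  out <- t(sapply(cutoffs, function(ct) {
    positive <- test >= ct                      # results matching the cut-off are positive
    TP <- sum(positive & ref == 1); FN <- sum(!positive & ref == 1)
    FP <- sum(positive & ref == 0); TN <- sum(!positive & ref == 0)
    c(TP = TP, FN = FN, FP = FP, TN = TN,
      Sensitivity = TP / (TP + FN), Specificity = TN / (TN + FP))
  }))
  # Report cut-offs on the original scale
  data.frame(test.values = if (reverse) -cutoffs else cutoffs, out)
}

# Toy data (hypothetical): 5 subjects without and 5 with the condition
ref  <- c(0, 0, 0, 0, 1, 0, 1, 1, 1, 1)
test <- c(0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.7, 0.8, 0.9, 1.0)

# A check in the spirit of reverse = "auto": are values higher without the condition?
needs_reverse <- mean(test[ref == 0]) > mean(test[ref == 1])

tab <- sens_spec(ref, test, reverse = needs_reverse)
tab
```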

Value

table A dataset with the number of subjects with and without the condition, the TP, TN, FP, and FN, Sensitivity, Specificity, predictive values and likelihood ratios for each threshold.

sample.size The sample size.

sample.prevalence The sample prevalence. However, if the argument pop.prevalence is not NULL, the value returned is pop.prevalence, although the element name remains sample.prevalence.
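Because predictive values depend on prevalence, the choice between sample and population prevalence changes them even when Sensitivity and Specificity are fixed. The sketch below uses the standard Bayes formulas in base R; the function name predictive_values and the example numbers are hypothetical, and the column names used in the package's table are not assumed here.

```r
# Predictive values and likelihood ratios from Sensitivity, Specificity and prevalence
predictive_values <- function(sens, spec, prevalence) {
  ppv <- sens * prevalence /
    (sens * prevalence + (1 - spec) * (1 - prevalence))
  npv <- spec * (1 - prevalence) /
    (spec * (1 - prevalence) + (1 - sens) * prevalence)
  c(PPV = ppv, NPV = npv,
    PLR = sens / (1 - spec),        # positive likelihood ratio
    NLR = (1 - sens) / spec)        # negative likelihood ratio
}

# Case-control style sample with 50% prevalence
predictive_values(sens = 0.9, spec = 0.8, prevalence = 0.5)

# Same test applied to a population with 5% prevalence
predictive_values(sens = 0.9, spec = 0.8, prevalence = 0.05)
```

The likelihood ratios are identical in both calls, while the positive predictive value drops sharply at the lower prevalence.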

See Also

binom.CI, np.auROCc, thresholds, TGROC

Examples

data("rocdata")
# Artificially forcing the reversion (this is NOT correct for this data)
x <- SS(ref = rocdata$Gold, test = rocdata$test1, reverse = TRUE)
# Printing just a subset of the table
# In this case sensitivity has higher values at higher test values
tail(x$table[,c("test.values","Sensitivity","Specificity")])
# And specificity has higher values at lower test values
head(x$table[,c("test.values","Sensitivity","Specificity")])

# The same analysis without forcing the reversion
x <- SS(ref = rocdata$Gold, test = rocdata$test1)
# Printing just a subset of the table
# In this case sensitivity has higher values at lower test values
head(x$table[,c("test.values","Sensitivity","Specificity")])
# And specificity has higher values at higher test values
tail(x$table[,c("test.values","Sensitivity","Specificity")])

# Smoothing with bi-normal function
# It will be easier to check the fit graphically with TGROC
# Rejecting the assumption of normality for those without the condition.
shapiro.test(rocdata$test1[which(rocdata$Gold == 1)])
shapiro.test(rocdata$test1[which(rocdata$Gold == 0)])
z <- BN.SS(ref = rocdata$Gold, test = rocdata$test1, t.min = 0.005, t.max = 2, precision = 0.005)
head(z$table[,c("test.values","Sensitivity","Specificity")])
tail(z$table[,c("test.values","Sensitivity","Specificity")])

# Smoothing with neural network
y <- NN.SS(x, t.min = 0.005, t.max = 2, precision = 0.005)
head(y$table[,c("test.values","Sensitivity","Specificity")])
tail(y$table[,c("test.values","Sensitivity","Specificity")])

rm(rocdata, x, y, z)

DiagnosisMed documentation built on May 2, 2019, 5:21 p.m.