nlopt.ui.general: Function for the determination of the population thresholds of an uncertain and inconclusive interval for test scores with a known common distribution


View source: R/nlopt.ui.general.R

Description

Function for the determination of the population thresholds of an uncertain and inconclusive interval for test scores with a known common distribution.

Usage

nlopt.ui.general(
  UI.Se = 0.55,
  UI.Sp = 0.55,
  distribution = "norm",
  parameters.d0 = c(mean = 0, sd = 1),
  parameters.d1 = c(mean = 1, sd = 1),
  overlap.interval = NULL,
  intersection = NULL,
  start = NULL,
  print.level = 0
)

Arguments

UI.Se

(default = .55). Desired sensitivity of the test scores within the uncertain interval. A value <= .5 is not allowed.

UI.Sp

(default = .55). Desired specificity of the test scores within the uncertain interval. A value <= .5 is not allowed.

distribution

Name of the continuous distribution, exactly as used in the R package stats: the name of the density function without its 'd' prefix. For instance, when the density function is 'dnorm', the distribution is 'norm'.
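The stats naming convention referred to here can be checked directly: pasting a 'd', 'p' or 'q' prefix onto the distribution name yields the density, distribution and quantile functions. A minimal sketch (not part of the package):

```r
# Look up the density and distribution functions from the name "norm".
distribution <- "norm"
dfun <- get(paste0("d", distribution))  # dnorm
pfun <- get(paste0("p", distribution))  # pnorm
dfun(0)     # density of N(0, 1) at 0, about 0.399
pfun(1.96)  # P(X <= 1.96), about 0.975
```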

parameters.d0

Named vector of population values or estimates of the parameters of the distribution of the test scores of the persons without the targeted condition. For instance c(mean = 0, sd = 1). This distribution should have the lower values.

parameters.d1

Named vector of population values or estimates of the parameters of the distribution of the test scores of the persons with the targeted condition. For instance c(mean = 1, sd = 1). This distribution should have the higher test scores: if the test scores of d1 are lower than those of d0, use -(test scores).

overlap.interval

A vector with a rough estimate of the lower and upper limits of the relevant overlap of the two distributions. If NULL, these are set to quantile .001 of the distribution of the persons with the targeted condition and quantile .999 of the distribution of the persons without the condition. Please check whether this is a good estimate of the relevant overlap.
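For the default bi-normal example (N(0, 1) for d0, N(1, 1) for d1), the documented default can be reproduced directly. A sketch, assuming the quantile rule stated above:

```r
# Default overlap interval: quantile .001 of d1 and quantile .999 of d0.
lower <- qnorm(0.001, mean = 1, sd = 1)  # about -2.09
upper <- qnorm(0.999, mean = 0, sd = 1)  # about  3.09
c(lower, upper)
```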

intersection

Default NULL. If not NULL, the supplied value is used as the estimate of the intersection of the two distributions. Otherwise, it is calculated.
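When not supplied, an intersection of two overlapping densities can be found numerically as a root of their difference. This is an illustration, not necessarily the package's internal method:

```r
# For N(0, 1) and N(1, 1) the densities cross exactly at 0.5.
f <- function(x) dnorm(x, 0, 1) - dnorm(x, 1, 1)
uniroot(f, interval = c(0, 1))$root  # 0.5
```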

start

Default NULL. If not NULL, the first two values of the supplied vector are used as the starting values for the nloptr optimization function.

print.level

Default is 0. The option print.level controls how much output is shown during the optimization process. Possible values: 0 (default): no output; 1: show iteration number and value of objective function; 2: 1 + show value of (in)equalities; 3: 2 + show value of controls.

Details

The function can be used to determine the uncertain interval of two continuous distributions. The uncertain interval is defined as an interval below and above the intersection of the two distributions, within which the test scores have a sensitivity and specificity below a desired value (default .55).
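One way to read this definition: within the interval, sensitivity is the proportion of d1 scores that lie above the intersection (and analogously for specificity with d0 below it), so it stays close to chance level. A sketch under that assumption; this is an illustration, not the package's internal objective function:

```r
# Within-interval sensitivity for d1 ~ N(1, 1), intersection at 0.5,
# and a symmetric interval (0.5 - w, 0.5 + w) around it.
ui.se <- function(lo, up, cut, F1) (F1(up) - F1(cut)) / (F1(up) - F1(lo))
F1 <- function(q) pnorm(q, mean = 1, sd = 1)
ui.se(0.5 - 0.6, 0.5 + 0.6, 0.5, F1)  # about 0.57, close to chance
```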

Important: the test scores of d1 (the distribution with parameters parameters.d1) should have higher values than those of d0. If not, use -(test scores).

Important: this is a highly complex function, which is less user friendly and more error prone than the other functions. It is included to show that the technique also works with continuous distributions other than bi-normal. For bi-normal distributions, always use ui.binormal.

Only a single intersection is assumed (or a second intersection where the overlap is negligible).
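Two normal densities with unequal standard deviations intersect twice, so this assumption is worth checking. A sketch for the bi-normal pair used in the standard-procedure example, N(0, 1) vs N(1.6, 2), where the second intersection lies in a region of negligible overlap:

```r
# Roots of the density difference locate both intersections.
f <- function(x) dnorm(x, 0, 1) - dnorm(x, 1.6, 2)
uniroot(f, c(0, 2))$root    # main intersection, about 1.19
uniroot(f, c(-3, -1))$root  # second intersection, about -2.26
```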

The function uses an optimization algorithm from the nlopt library (https://nlopt.readthedocs.io/en/latest/NLopt_Algorithms/): the sequential quadratic programming (SQP) algorithm for nonlinearly constrained gradient-based optimization (supporting both inequality and equality constraints), based on the implementation by Dieter Kraft (1988; 1994).

N.B. When a normal distribution is expected, the functions nlopt.ui and ui.binormal are recommended.

Value

List of values:

$status:

Integer value with the status of the optimization (0 is success).

$message:

More informative message with the status of the optimization.

$results:

Vector with the following values:

$solution:

Vector with the lower and upper threshold of the uncertain interval.

References

Dieter Kraft, "A software package for sequential quadratic programming", Technical Report DFVLR-FB 88-28, Institut für Dynamik der Flugsysteme, Oberpfaffenhofen, July 1988.

Dieter Kraft, "Algorithm 733: TOMP–Fortran modules for optimal control calculations," ACM Transactions on Mathematical Software, vol. 20, no. 3, pp. 262-281 (1994).

Landsheer, J. A. (2018). The Clinical Relevance of Methods for Handling Inconclusive Medical Test Results: Quantification of Uncertainty in Medical Decision-Making and Screening. Diagnostics, 8(2), 32. https://doi.org/10.3390/diagnostics8020032

Examples

# A simple test model:
nlopt.ui.general(UI.Se = .55, UI.Sp = .55,
                 distribution = "norm",
                 parameters.d0 = c(mean = 0, sd = 1),
                 parameters.d1 = c(mean = 1, sd = 1),
                 overlap.interval=c(-2,3))
# Standard procedure when using a continuous distribution:
nlopt.ui.general(parameters.d0 = c(mean = 0, sd = 1),
                 parameters.d1 = c(mean = 1.6, sd = 2))

# library(MASS)
# library(car)
# gamma distributed data
set.seed(4)
d0 = rgamma(100, shape=2, rate=.5)
d1 = rgamma(100, shape=7.5, rate=1)
# 1. obtain parameters
parameters.d0=MASS::fitdistr(d0, 'gamma')$estimate
parameters.d1=MASS::fitdistr(d1, 'gamma')$estimate
# 2. test whether the supposed distribution (gamma) fits
car::qqPlot(d0, distribution='gamma', shape=parameters.d0['shape'])
car::qqPlot(d1, distribution='gamma', shape=parameters.d1['shape'])
# 3. draw curves and determine overlap
curve(dgamma(x, shape=parameters.d0['shape'], rate=parameters.d0['rate']), from=0, to=16)
curve(dgamma(x, shape=parameters.d1['shape'], rate=parameters.d1['rate']), from=0, to=16, add=TRUE)
overlap.interval=c(1, 15) # ignore intersection at 0; observe large overlap
# 4. get empirical AUC
simple_auc(d0, d1)
# about .65 --> Poor
# .90-1 = excellent (A)
# .80-.90 = good (B)
# .70-.80 = fair (C)
# .60-.70 = poor (D)
# .50-.60 = fail (F)
# 5. Get uncertain interval
(res=nlopt.ui.general (UI.Se = .57,
                       UI.Sp = .57,
                       distribution = 'gamma',
                       parameters.d0 = parameters.d0,
                       parameters.d1 = parameters.d1,
                       overlap.interval = overlap.interval,
                       intersection = NULL,
                       start = NULL,
                       print.level = 0))
abline(v=c(res$intersection, res$solution))
# 6. Assess improvement when diagnosing outside the uncertain interval
sel.d0 = d0 < res$solution[1] |  d0 > res$solution[2]
sel.d1 = d1 < res$solution[1] |  d1 > res$solution[2]
(percentage.selected.d0 = sum(sel.d0) / length(d0))
(percentage.selected.d1 = sum(sel.d1) / length(d1))
simple_auc(d0[sel.d0], d1[sel.d1])
# AUC for selected scores outside the uncertain interval
simple_auc(d0[!sel.d0], d1[!sel.d1])
# AUC for deselected scores; worst are deselected
# weibull distributed data
set.seed(4)
d0 = rweibull(100, shape=3, scale=50)
d1 = rweibull(100, shape=3, scale=70)
# 1. obtain parameters
parameters.d0=MASS::fitdistr(d0, 'weibull')$estimate
parameters.d1=MASS::fitdistr(d1, 'weibull')$estimate
# 2. test whether the supposed distribution (weibull) fits
car::qqPlot(d0, distribution='weibull', shape=parameters.d0['shape'])
car::qqPlot(d1, distribution='weibull', shape=parameters.d1['shape'])
# 3. draw curves and determine overlap
curve(dweibull(x, shape=parameters.d0['shape'],
      scale=parameters.d0['scale']), from=0, to=150)
curve(dweibull(x, shape=parameters.d1['shape'],
      scale=parameters.d1['scale']), from=0, to=150, add=TRUE)
overlap.interval=c(1, 100) # ignore intersection at 0; observe overlap
# 4. get empirical AUC
simple_auc(d0, d1)
# about .65 --> Poor
# .90-1 = excellent (A)
# .80-.90 = good (B)
# .70-.80 = fair (C)
# .60-.70 = poor (D)
# .50-.60 = fail (F)
# 5. Get uncertain interval
(res=nlopt.ui.general (UI.Se = .55,
                       UI.Sp = .55,
                       distribution = 'weibull',
                       parameters.d0 = parameters.d0,
                       parameters.d1 = parameters.d1,
                       overlap.interval = overlap.interval,
                       intersection = NULL,
                       start = NULL,
                       print.level = 0))
abline(v=c(res$intersection, res$solution))
# 6. Assess improvement when diagnosing outside the uncertain interval
sel.d0 = d0 < res$solution[1] |  d0 > res$solution[2]
sel.d1 = d1 < res$solution[1] |  d1 > res$solution[2]
(percentage.selected.d0 = sum(sel.d0) / length(d0))
(percentage.selected.d1 = sum(sel.d1) / length(d1))
simple_auc(d0[sel.d0], d1[sel.d1])
# AUC for selected scores outside the uncertain interval
simple_auc(d0[!sel.d0], d1[!sel.d1])
# AUC for deselected scores; these scores are almost indistinguishable

UncertainInterval documentation built on March 3, 2021, 1:10 a.m.