thres2: Threshold point estimation and confidence intervals...

View source: R/ThresholdROC-2states.R

thres2    R Documentation

Threshold point estimation and confidence intervals (two-state setting)

Description

This function calculates the threshold estimate and its corresponding confidence interval in a two-state setting.

Usage

thres2(k1, k2, rho,
  costs = matrix(c(0, 0, 1, (1 - rho)/rho), 2, 2, byrow = TRUE),
  R=NULL,
  method = c("equal", "unequal", "empirical", "smooth", "parametric"),
  dist1 = NULL, dist2 = NULL, ci = TRUE, ci.method = c("delta", "boot"),
  B = 1000, alpha = 0.05, extra.info = FALSE, na.rm = FALSE, q1=0.05, q2=0.95)

Arguments

k1

vector containing the healthy sample values.

k2

vector containing the diseased sample values.

rho

disease prevalence.

costs

cost matrix. Costs should be entered as a 2x2 matrix, where the first row corresponds to the true positive and true negative costs and the second row to the false positive and false negative costs. Default cost values are a combination of costs that yields R=1, which is equivalent to the Youden index method (for details about this concept, see References). It must be set to NULL if the user prefers to set R (see next argument).

R

desired value of R, used only when the cost matrix costs is set to NULL (the algorithm then chooses a suitable combination of costs that leads to R). Default, NULL (which leads to R=1 using the default costs).

method

method used in the estimation. The user can specify just the initial letters. Default, "equal". See Details for more information about the methods available.

dist1

distribution to be assumed for the healthy population. See Details.

dist2

distribution to be assumed for the diseased population. See Details.

ci

should a confidence interval be calculated? Default, TRUE. The user can set it to FALSE to suppress the calculation of any confidence interval (in that case, arguments ci.method, B and alpha are ignored).

ci.method

method used to calculate the confidence interval. The user can specify just the initial letters. Default, "delta". See Details for more information about the methods available.

B

number of bootstrap resamples when ci.method = "boot". Otherwise, ignored. Default, 1000.

alpha

significance level for the confidence interval. Default, 0.05.

extra.info

when using method="empirical", if set to TRUE the function returns extra information about the calculation of the threshold. Ignored when method is not "empirical". Default, FALSE.

na.rm

a logical value indicating whether NA values in k1 and k2 should be stripped before the computation proceeds. Default, FALSE.

q1

probability of the left distribution in order to determine a low quantile when method="parametric" (ignored otherwise). Default, 0.05.

q2

probability of the right distribution in order to determine a high quantile when method="parametric" (ignored otherwise). Default, 0.95.
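
As a rough illustration of how the costs and R arguments relate, the sketch below computes R as the product of the non-disease odds, (1 - rho)/rho, and a cost ratio, following the verbal definition of the R term given under Value. The helper name r_from_costs and the exact form of the cost ratio, (C_FP - C_TN)/(C_FN - C_TP), are assumptions made here for illustration and are not part of the package interface; Skaltsa et al. (2010) give the authoritative definition.

```r
# Hypothetical helper (not exported by ThresholdROC): R term implied by a
# prevalence and a 2x2 cost matrix laid out as described for `costs`:
#   row 1 = (true positive, true negative)
#   row 2 = (false positive, false negative)
r_from_costs <- function(rho, costs) {
  c_tp <- costs[1, 1]; c_tn <- costs[1, 2]
  c_fp <- costs[2, 1]; c_fn <- costs[2, 2]
  # Assumed cost ratio: net cost of a false positive over that of a false negative
  ((1 - rho) / rho) * (c_fp - c_tn) / (c_fn - c_tp)
}

rho <- 0.2
default_costs <- matrix(c(0, 0, 1, (1 - rho) / rho), 2, 2, byrow = TRUE)
r_default <- r_from_costs(rho, default_costs)  # the default costs give R = 1

# Equal unit costs for both kinds of error leave only the non-disease odds
unit_costs <- matrix(c(0, 0, 1, 1), 2, 2, byrow = TRUE)
r_unit <- r_from_costs(rho, unit_costs)
```

Under this reading, the default cost matrix cancels the non-disease odds exactly, which is why it yields R=1 for any prevalence.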

Details

For parameter method the user can choose between "equal" (assumes binormality and equal variances), "unequal" (assumes binormality and unequal variances), "empirical" (leaves out any distributional assumption), "smooth" (leaves out any distributional assumption, but uses a kernel to estimate the densities) or "parametric" (based on the distribution assumed for the two populations).

Parameters dist1 and dist2 can be chosen from the following 2-parameter distributions: "beta", "cauchy", "chisq" (chi-squared), "gamma", "lnorm" (lognormal), "logis" (logistic), "norm" (normal) and "weibull". Note that dist1 and dist2 are only needed when method = "parametric".

For parameter ci.method the user can choose between "delta" (delta method is used to estimate the threshold standard error assuming a binormal underlying model) or "boot" (the confidence interval is calculated by bootstrap).
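
To make the "equal" option more concrete, the sketch below works out a threshold under a binormal model with a common variance, assuming the optimal point T satisfies f2(T)/f1(T) = R, where f1 and f2 are the healthy and diseased densities. Solving that equation for two normal densities with a shared variance s2 and means m1 and m2 gives T = (m1 + m2)/2 + s2 * log(R)/(m2 - m1). The function name thres_equal_sketch and the use of a pooled sample variance are assumptions for illustration; the package's own estimator may differ in detail.

```r
# Illustrative only: closed-form threshold for two normal populations that
# share a variance, assuming the optimum satisfies f2(T) / f1(T) = R.
thres_equal_sketch <- function(k1, k2, R = 1) {
  m1 <- mean(k1); m2 <- mean(k2)
  n1 <- length(k1); n2 <- length(k2)
  # Pooled variance estimate (an assumption of this sketch)
  s2 <- ((n1 - 1) * var(k1) + (n2 - 1) * var(k2)) / (n1 + n2 - 2)
  (m1 + m2) / 2 + s2 * log(R) / (m2 - m1)
}

set.seed(1234)
k1 <- rnorm(100, 0, 1)  # non-diseased
k2 <- rnorm(100, 2, 1)  # diseased
t_r1 <- thres_equal_sketch(k1, k2, R = 1)  # midpoint of the two sample means
t_r2 <- thres_equal_sketch(k1, k2, R = 2)  # larger when mean(k2) > mean(k1)
```

With R = 1 the logarithm vanishes and the threshold is the midpoint of the two sample means; values of R above 1 move it towards the diseased sample.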

Value

An object of class thres2, which is a list with two components:

T

a list of at least seven components:

thres threshold estimate.

prev disease prevalence provided by the user.

costs cost matrix provided by the user.

R R term, the product of the non-disease odds and the cost ratio (for further details about this concept, see References).

method method used in the estimation.

k1 vector containing the healthy sample values provided by the user.

k2 vector containing the diseased sample values provided by the user.

When method = "empirical", T also contains:

sens sensitivity obtained.

spec specificity obtained.

cost the minimum cost associated with T$thres.

tot.thres vector of possible thresholds. Only if extra.info = TRUE.

tot.cost vector of empirical costs. Only if extra.info = TRUE.

tot.spec.c complement of the vector of empirical specificities (1-spec). Only if extra.info = TRUE.

tot.sens vector of empirical sensitivities. Only if extra.info = TRUE.

When method = "parametric", T also contains:

dist1 distribution assumed for the healthy population.

dist2 distribution assumed for the diseased population.

pars1 a numeric vector containing the estimation of the parameters of dist1.

pars2 a numeric vector containing the estimation of the parameters of dist2.

CI

When ci.method = "delta", a list of five components:

lower the lower limit of the confidence interval.

upper the upper limit of the confidence interval.

se the standard error used in the calculation of the confidence interval.

alpha significance level provided by the user.

ci.method method used for the confidence intervals calculation.

When ci.method = "boot", a list of eight components:

low.norm the lower limit of the bootstrap confidence interval based on the normal distribution.

up.norm the upper limit of the bootstrap confidence interval based on the normal distribution.

se the bootstrap standard error used in the calculation of the confidence interval based on the normal distribution.

low.perc the lower limit of the bootstrap confidence interval based on percentiles.

up.perc the upper limit of the bootstrap confidence interval based on percentiles.

alpha significance level provided by the user.

B number of bootstrap resamples used.

ci.method method used for the confidence intervals calculation.

When ci = FALSE, NULL.
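
The two bootstrap intervals listed above can be illustrated with a generic nonparametric bootstrap in the spirit of Efron and Tibshirani (1993). The sketch resamples each group with replacement, recomputes a threshold, and then forms a normal-based interval (estimate plus or minus a normal quantile times the bootstrap standard error) and a percentile interval. The function boot_ci_sketch and the simple midpoint estimator are hypothetical stand-ins chosen to keep the example self-contained; they are not the package's implementation.

```r
# Illustrative only: normal-based and percentile bootstrap intervals for a
# threshold estimator `est` (here the midpoint of the two sample means).
boot_ci_sketch <- function(k1, k2, est, B = 1000, alpha = 0.05) {
  # Resample each group independently and recompute the estimate B times
  boot <- replicate(B, est(sample(k1, replace = TRUE),
                           sample(k2, replace = TRUE)))
  se <- sd(boot)                 # bootstrap standard error
  z <- qnorm(1 - alpha / 2)      # normal quantile for a two-sided interval
  point <- est(k1, k2)           # estimate on the original samples
  perc <- unname(quantile(boot, c(alpha / 2, 1 - alpha / 2)))
  list(low.norm = point - z * se, up.norm = point + z * se, se = se,
       low.perc = perc[1], up.perc = perc[2], alpha = alpha, B = B)
}

# Hypothetical estimator used only to exercise the sketch
midpoint <- function(k1, k2) (mean(k1) + mean(k2)) / 2

set.seed(1234)
k1 <- rnorm(100, 0, 1)  # non-diseased
k2 <- rnorm(100, 2, 1)  # diseased
ci <- boot_ci_sketch(k1, k2, midpoint, B = 500)
```

The element names of the returned list mirror the components documented for ci.method = "boot" so the two can be compared side by side.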

Note

It is assumed that k1 is the sample with lower values. If that is not the case, k1 and k2 (and the corresponding parameters) are exchanged.

References

Efron B, Tibshirani RJ. (1993). An introduction to the bootstrap, Chapman & Hall.

Skaltsa K, Jover L, Carrasco JL. (2010). Estimation of the diagnostic threshold accounting for decision costs and sampling uncertainty. Biometrical Journal 52(5):676-697.

See Also

thresTH2, plot.thres2, lines.thres2

Examples

# example 1
n1 <- 100
n2 <- 100
set.seed(1234)
par1.1 <- 0
par1.2 <- 1
par2.1 <- 2
par2.2 <- 1
rho <- 0.2
k1 <- rnorm(n1, par1.1, par1.2) # non-diseased
k2 <- rnorm(n2, par2.1, par2.2) # diseased

thres2(k1, k2, rho, method="eq", ci.method="d")
thres2(k1, k2, rho, method="uneq", ci.method="d")
# specify R instead of (default) costs
thres2(k1, k2, rho, costs=NULL, R=2, method="uneq", ci.method="d")
## Not run: 
thres2(k1, k2, rho, method="empirical", ci.method="b")

# example 2
set.seed(1234)
k1 <- rnorm(50, 10, 3)
k2 <- rlnorm(55)
rho <- 0.3
thres2(k1, k2, rho, method="param", ci.method="boot", dist1="norm", dist2="lnorm")

## End(Not run)

# suppress confidence intervals calculation
thres2(k1, k2, rho, method="equal", ci=FALSE)
thres2(k1, k2, rho, method="empirical", ci=FALSE)

# example 3
n1 <- 100
n2 <- 100
set.seed(1234)
par1.1 <- 0
par1.2 <- 1
par2.1 <- 2
par2.2 <- 1
rho <- 0.2
k1 <- rnorm(n1, par1.1, par1.2) # non-diseased
k2 <- rnorm(n2, par2.1, par2.2) # diseased
## Not run: 
thres2(k1, k2, rho, method="smooth", ci.method="b")

## End(Not run)
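
The components described under Value can be pulled out of the returned object by name. The short sketch below assumes the ThresholdROC package is installed and that the element names match those documented above (T$thres, CI$lower, CI$upper); the object name fit is arbitrary.

```r
# Guarded so the snippet does nothing when ThresholdROC is not installed.
if (requireNamespace("ThresholdROC", quietly = TRUE)) {
  set.seed(1234)
  k1 <- rnorm(100, 0, 1)  # non-diseased
  k2 <- rnorm(100, 2, 1)  # diseased
  fit <- ThresholdROC::thres2(k1, k2, rho = 0.2,
                              method = "equal", ci.method = "delta")
  print(fit$T$thres)                    # threshold estimate
  print(c(fit$CI$lower, fit$CI$upper))  # delta-method confidence limits
}
```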


ThresholdROC documentation built on Aug. 30, 2023, 1:08 a.m.