AROC.sp: Semiparametric frequentist inference of the...
In AROC: Covariate-Adjusted Receiver Operating Characteristic Curve Inference

AROC.sp

R Documentation

Semiparametric frequentist inference of the covariate-adjusted ROC curve (AROC).

Description

Estimates the covariate-adjusted ROC curve (AROC) using the semiparametric approach proposed by Janes and Pepe (2009).

Usage

AROC.sp(formula.healthy, group, tag.healthy, data, 
	est.surv.h = c("normal", "empirical"), p = seq(0, 1, l = 101), B = 1000)

Arguments

`formula.healthy`	A `formula` object specifying the location regression model to be fitted in the healthy population (see Details).
`group`	A character string with the name of the variable that distinguishes healthy from diseased individuals.
`tag.healthy`	The value codifying the healthy individuals in the variable `group`.
`data`	Data frame representing the data and containing all needed variables.
`est.surv.h`	A character string indicating how the conditional distribution function of the diagnostic test in the healthy population is estimated. Options are "normal" and "empirical" (see Details). The default is "normal".
`p`	Set of false positive fractions (FPF) at which to estimate the covariate-adjusted ROC curve.
`B`	An integer value specifying the number of bootstrap resamples used to construct the confidence intervals. The default is 1000.

Details

Estimates the covariate-adjusted ROC curve (AROC) defined as

AROC\left(t\right) = Pr\{1 - F_{\bar{D}}(Y_D | \mathbf{X}_{D}) \leq t\},

where F_{\bar{D}}(\cdot|\mathbf{X}_{\bar{D}}) denotes the distribution function of Y_{\bar{D}} conditional on the vector of covariates \mathbf{X}_{\bar{D}}. In particular, the method implemented in this function estimates the outer probability empirically (see Janes and Pepe, 2009), and F_{\bar{D}}(\cdot|\mathbf{X}_{\bar{D}}) is estimated assuming a semiparametric location regression model for Y_{\bar{D}}, i.e.,

Y_{\bar{D}} = \mathbf{X}_{\bar{D}}^{T}\boldsymbol{\beta}_{\bar{D}} + \sigma_{\bar{D}}\varepsilon_{\bar{D}},

such that, for a random sample \{(\mathbf{x}_{\bar{D}i})\}_{i=1}^{n_{\bar{D}}} from the healthy population, we have

F_{\bar{D}}(y | \mathbf{X}_{\bar{D}}=\mathbf{x}_{\bar{D}i}) = F_{\bar{D}}\left(\frac{y-\mathbf{x}_{\bar{D}i}^{T}\boldsymbol{\beta}_{\bar{D}}}{\sigma_{\bar{D}}}\right),

where F_{\bar{D}} is the distribution function of \varepsilon_{\bar{D}}. Depending on the assumption made about the distribution of \varepsilon_{\bar{D}}, the estimators are referred to as: (a) "normal", where a Gaussian error is assumed, i.e., F_{\bar{D}}(y) = \Phi(y); and (b) "empirical", where no assumption is made about the distribution (in this case, the distribution function F_{\bar{D}} is estimated empirically from the standardised residuals).
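
The estimation steps described above can be sketched in base R. This is only an illustrative sketch on simulated data, not the package's internal code: the helper name `aroc_sketch`, the simulated variables, the use of the residual standard error from `lm` as the estimate of \sigma_{\bar{D}}, and the trapezoidal rule for the area are all assumptions made for the example.

```r
# Illustrative sketch only; `aroc_sketch` is a hypothetical helper,
# not a function of the AROC package.
set.seed(1)
n_h <- 200; n_d <- 150
healthy  <- data.frame(x = runif(n_h, 40, 80))
diseased <- data.frame(x = runif(n_d, 40, 80))
healthy$y  <- 0.5 + 0.02 * healthy$x  + rnorm(n_h, sd = 0.5)
diseased$y <- 1.5 + 0.02 * diseased$x + rnorm(n_d, sd = 0.5)

aroc_sketch <- function(healthy, diseased, est = c("normal", "empirical"),
                        p = seq(0, 1, length.out = 101)) {
  est <- match.arg(est)
  # Location regression model fitted in the healthy population only
  fit <- lm(y ~ x, data = healthy)
  sigma_h <- summary(fit)$sigma  # assumed estimate of sigma
  # Standardise the diseased test results with the healthy-population fit
  z_d <- (diseased$y - predict(fit, newdata = diseased)) / sigma_h
  # Placement values 1 - F(y_D | x_D) under the chosen error distribution
  pv <- if (est == "normal") {
    1 - pnorm(z_d)
  } else {
    z_h <- residuals(fit) / sigma_h  # standardised healthy residuals
    1 - ecdf(z_h)(z_d)
  }
  # Outer probability estimated empirically: AROC(t) = Pr(PV <= t)
  roc <- ecdf(pv)(p)
  # Area under the curve by the trapezoidal rule (an assumption of this sketch)
  auc <- sum(diff(p) * (head(roc, -1) + tail(roc, -1)) / 2)
  list(p = p, ROC = roc, AUC = auc)
}

res_normal    <- aroc_sketch(healthy, diseased, est = "normal")
res_empirical <- aroc_sketch(healthy, diseased, est = "empirical")
```

Both variants share the regression fit and differ only in how the distribution of the standardised residuals is evaluated, which mirrors the "normal" and "empirical" options of `est.surv.h`.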

Value

As a result, the function provides a list with the following components:

`call`	The matched call.
`p`	Set of false positive fractions (FPF) at which the covariate-adjusted ROC curve has been estimated.
`ROC`	Estimated covariate-adjusted ROC curve (AROC), and 95% pointwise confidence intervals (if required).
`AUC`	Estimated area under the covariate-adjusted ROC curve (AAUC), and 95% confidence interval (if required).
`fit.h`	Object of class `lm` with the fitted regression model in the healthy population.
`est.surv.h`	The value of the argument `est.surv.h` used in the call.

References

Janes, H., and Pepe, M.S. (2009). Adjusting for covariate effects on classification accuracy using the covariate-adjusted receiver operating characteristic curve. Biometrika, 96(2), 371-382.

Examples

library(AROC)
data(psa)
# Select the last measurement
newpsa <- psa[!duplicated(psa$id, fromLast = TRUE),]

# Log-transform the biomarker
newpsa$l_marker1 <- log(newpsa$marker1)

m3 <- AROC.sp(formula.healthy = l_marker1 ~ age,
              group = "status", tag.healthy = 0, data = newpsa,
              p = seq(0, 1, length.out = 101), B = 500)

summary(m3)

plot(m3)

AROC documentation built on March 18, 2022, 5:22 p.m.