maxTPR: Maximizing the TPR for a Specified FPR


View source: R/maxTPR_pkg.R

Often in the risk prediction setting, there is interest in combining several predictors (e.g., biomarkers) into a single tool for prognosis, diagnosis or screening. One way to accomplish this is by targeting a measure of predictive capacity. In many cases, there is interest in the true positive rate (TPR; sensitivity) for a clinically meaningful false positive rate (FPR; 1-specificity). This function estimates a linear combination of predictors by maximizing a smooth approximation to the empirical TPR (sTPR) while constraining a smooth approximation to the empirical FPR (sFPR). Furthermore, since the TPR and FPR are determined both by the linear combination and the threshold (i.e., TPR is the proportion of diseased individuals whose linear combination value exceeds some threshold), this function estimates the combination and the threshold simultaneously. Estimates from robust logistic regression, specifically the method of Bianco and Yohai (implemented via the aucm package), are used as starting values for the linear combination.
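To make the smoothing concrete, the sketch below compares empirical rates with smooth counterparts for a fixed combination and threshold. It assumes the indicator that a score exceeds the threshold is replaced by the standard normal CDF evaluated at (score - threshold)/h; that choice of smoothing function and the helper names `empirical_rate` and `smooth_rate` are illustrative assumptions, not the package's internals. The bandwidth h is taken as the standard deviation of the combination divided by the square root of the sample size.

```r
# Illustrative helpers (hypothetical names, not exported by maxTPR).
# Empirical rate: proportion of scores strictly above the threshold.
empirical_rate <- function(score, delta) mean(score > delta)
# Smooth rate: the indicator is replaced by a normal CDF (an assumption here).
smooth_rate <- function(score, delta, h) mean(pnorm((score - delta) / h))

set.seed(4)
n <- 400
x1 <- rnorm(n)
x2 <- rnorm(n)
d <- rbinom(n, 1, plogis(x1 + x2))

# A fixed unit-norm combination and an arbitrary threshold, for illustration only.
beta <- c(1, 1) / sqrt(2)
score <- as.vector(cbind(x1, x2) %*% beta)
delta <- 0.5

# Bandwidth: sd of the combination divided by n^0.5.
h <- sd(score) / n^0.5

rates <- c(
  TPR  = empirical_rate(score[d == 1], delta),
  sTPR = smooth_rate(score[d == 1], delta, h),
  FPR  = empirical_rate(score[d == 0], delta),
  sFPR = smooth_rate(score[d == 0], delta, h)
)
print(round(rates, 3))
```

With a small bandwidth the smooth and empirical rates should sit close together, which is what makes the smooth versions usable as differentiable stand-ins during optimization.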

maxTPR(data, tval, initialval = "rGLM", alpha = 0.5, approxh = 0.5, tolval = 1e-4, stepsz = 1e-5, multiplier = 2)

`data`	An object of class ‘data.frame’ where the first column contains the outcome (disease) indicator (1 for diseased, 0 for non-diseased), and the subsequent columns are the predictors. Note that missing observations are allowed, but they will be automatically removed. All columns of `data` must be numeric. The columns of `data` will be (re)named "D" for the first column and "V1", "V2", ... for the subsequent (predictor) columns.
`tval`	The acceptable FPR value. The method constrains the smooth approximation to the FPR to be less than or equal to `tval`; see `alpha` below.
`initialval`	Starting values of the predictor combination for the smooth TPR maximization algorithm. Default value is `"rGLM"`, which means that estimates from robust logistic regression, specifically the method of Bianco and Yohai (implemented via the `aucm` package), are used as starting values. If any other value of `initialval` is given, or if robust logistic regression fails to converge, estimates from standard logistic regression are used as starting values.
`alpha`	To improve performance, a small buffer may be added to `tval`. The parameter `alpha` controls the size of this buffer relative to the number of controls (individuals without the disease). The default is `alpha = 0.5`, giving a buffer of 0.5/(number of controls), so the sFPR is constrained to be less than or equal to `tval` + 0.5/(number of controls).
`approxh`	The tuning parameter for the smooth approximations is the ratio of the standard deviation of the linear combination (based on the starting values) to n^{approxh}, where n is the sample size. Larger values of `approxh` shrink this tuning parameter and so give a closer approximation to the TPR and FPR, though estimation may become unstable if `approxh` is too large. Default 0.5.
`tolval`	Controls the tolerance on feasibility and optimality for the optimization procedure (performed by `solnp` in the `Rsolnp` package). Default 1e-4.
`stepsz`	Controls the step size for the optimization procedure (performed by `solnp` in the `Rsolnp` package). Default 1e-5.
`multiplier`	Used to provide an initial value for the threshold to the optimization procedure. Using the starting values for the linear combination (by default, from robust logistic regression), a reasonable choice for this initial value is the threshold such that sFPR = `tval`. This value can be found by using the `uniroot` function, which requires a range over which to search. The `multiplier` parameter controls the size of this range; if the range is not wide enough, the error ‘f() values at end points not of opposite sign’ will be seen, and `multiplier` should be increased. The size of `multiplier` will not generally have a large impact on results, though narrower (but valid) ranges may offer slightly better precision in the results from `uniroot`. Default 2.
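The roles of `alpha`, `approxh`, and `multiplier` can be sketched as follows. The code computes the buffered FPR bound and the bandwidth as described above, then uses `uniroot` to find the threshold at which sFPR equals `tval`. The normal-CDF smoothing, the way the search interval is built from `multiplier`, the Euclidean norm, and the use of standard logistic regression as a stand-in for the starting combination are all assumptions made for illustration; the package's internal choices may differ.

```r
set.seed(4)
n <- 400
x1 <- rnorm(n)
x2 <- rnorm(n)
d <- rbinom(n, 1, plogis(x1 + x2))

tval <- 0.2
alpha <- 0.5
approxh <- 0.5
multiplier <- 2

# Starting combination: standard logistic regression (a stand-in for rlogit),
# dropping the intercept and scaling to unit Euclidean norm (assumed norm).
fit <- glm(d ~ x1 + x2, family = binomial)
beta0 <- coef(fit)[-1]
beta0 <- beta0 / sqrt(sum(beta0^2))
score <- as.vector(cbind(x1, x2) %*% beta0)

# Buffered constraint: sFPR <= tval + alpha / (number of controls).
n_controls <- sum(d == 0)
fpr_bound <- tval + alpha / n_controls

# Bandwidth: sd of the combination divided by n^approxh.
h <- sd(score) / n^approxh

# Smooth FPR as a function of the threshold (normal-CDF smoothing is assumed).
sFPR <- function(delta) mean(pnorm((score[d == 0] - delta) / h))

# Hypothetical search interval: symmetric, scaled by `multiplier`.
half_width <- multiplier * max(abs(score))
root <- uniroot(function(delta) sFPR(delta) - tval,
                interval = c(-half_width, half_width))
delta0 <- root$root

print(c(fpr_bound = fpr_bound, h = h, delta0 = delta0))
```

If the interval were too narrow for sFPR to cross `tval`, `uniroot` would stop with the end-point sign error quoted above, which is the situation in which widening the range helps.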

A list with the following components:

`sTPRrslt`	The results from the smooth TPR maximization procedure, including 'delta' (the threshold estimated by the maximization procedure), 'deltaRE' (the threshold estimated based on quantiles of the combination estimated by the maximization procedure), the estimated combination coefficients, and an indicator of convergence for the optimization procedure.
`rGLMrslt`	The results from the robust logistic regression model (fit using `rlogit`), including 'delta' (the threshold estimated based on quantiles of the combination estimated by robust logistic regression), the estimated combination coefficients, and an indicator of convergence for `rlogit`. Note that if `rlogit` fails to converge, these results will be identical to `GLMrslt`, since in this case, the estimates from standard logistic regression are used in place of those from robust logistic regression. Since the smooth TPR maximization procedure involves constraining the norm of the combination coefficients to be 1 for identifiability, this constraint was also applied to the robust logistic regression results.
`GLMrslt`	The results from the (standard) logistic regression model, including 'delta' (the threshold estimated based on quantiles of the combination estimated by (standard) logistic regression), the estimated combination coefficients, and an indicator of convergence for `glm`. Since the smooth TPR maximization procedure involves constraining the norm of the combination coefficients to be 1 for identifiability, this constraint was also applied to the logistic regression results.
`Nobs`	The number of observations remaining after observations with missing values were removed.

For all three methods, the combination coefficients are reported in the same order as the predictor columns of `data`.
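As a rough illustration of the unit-norm scaling and the quantile-based threshold described above, the sketch below fits a standard logistic regression, rescales the predictor coefficients, and takes the threshold as the (1 - `tval`) quantile of the combination among controls. The Euclidean norm, the exclusion of the intercept, and this particular quantile rule are assumptions; the documentation only states that the threshold is based on quantiles of the combination.

```r
set.seed(4)
x1 <- rnorm(400)
x2 <- rnorm(400)
y <- rbinom(400, 1, plogis(x1 + x2))
tval <- 0.2

# Standard logistic regression; keep only the predictor coefficients.
fit <- glm(y ~ x1 + x2, family = binomial)
beta <- coef(fit)[-1]

# Rescale so the coefficient vector has Euclidean norm 1 (assumed norm).
beta <- beta / sqrt(sum(beta^2))

# Linear combination and a quantile-based threshold among controls (assumed rule).
score <- as.vector(cbind(x1, x2) %*% beta)
delta <- unname(quantile(score[y == 0], probs = 1 - tval))

# Empirical operating point at that threshold; the FPR should be close to tval.
emp_FPR <- mean(score[y == 0] > delta)
emp_TPR <- mean(score[y == 1] > delta)
print(c(delta = delta, FPR = emp_FPR, TPR = emp_TPR))
```

Rescaling the coefficients by a positive constant does not change the ranking of individuals, so the TPR at a given FPR is unaffected; the constraint only fixes the scale so that coefficients are comparable across methods.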

Meisner, A., Carone, M., Pepe, M., and Kerr, K.F. (2017). Combining biomarkers by maximizing the true positive rate for a fixed false positive rate. UW Biostatistics Working Paper Series, Working Paper 420.

Bianco, A.M. and Yohai, V.J. (1996). Robust estimation in the logistic regression model. In Robust statistics, data analysis, and computer intensive methods (ed. H. Rieder), pp. 17-34. Springer.

See also: `rlogit` (in the `aucm` package), `solnp` (in the `Rsolnp` package)

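library(maxTPR)
set.seed(4)
# Two independent standard normal predictors
x1 <- rnorm(400)
x2 <- rnorm(400)
# Binary outcome from a logistic model with linear predictor x1 + x2
y <- rbinom(400, 1, exp(x1 + x2) / (1 + exp(x1 + x2)))
# The outcome must be the first column, followed by the predictors
data <- data.frame(y, x1, x2)
# Maximize the smooth TPR for an acceptable FPR of 0.2
maxTPR(data, 0.2)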
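Real data rarely arrive in exactly this layout. The sketch below shows one way to arrange a data frame as the `data` argument expects: outcome first, numeric columns only, with incomplete rows dropped and columns renamed to "D", "V1", "V2", and so on. The helper `prepare_maxTPR_data` and the column names `marker_a`, `marker_b`, and `status` are hypothetical; `maxTPR` is documented to remove missing observations and rename columns on its own, so this only mirrors that behavior for inspection beforehand.

```r
# Hypothetical helper (not part of maxTPR): reorder, validate, and clean a data frame.
prepare_maxTPR_data <- function(df, outcome) {
  predictors <- setdiff(names(df), outcome)
  # Put the outcome indicator first, predictors after it.
  out <- df[, c(outcome, predictors), drop = FALSE]
  # All columns must be numeric.
  stopifnot(all(vapply(out, is.numeric, logical(1))))
  # Drop rows with any missing value.
  out <- out[complete.cases(out), , drop = FALSE]
  # Mirror the documented renaming: "D", then "V1", "V2", ...
  names(out) <- c("D", paste0("V", seq_along(predictors)))
  out
}

# Toy data with made-up column names and two incomplete rows.
raw <- data.frame(
  marker_a = c(0.3, NA, 1.2, -0.7, 0.9),
  marker_b = c(1.1, 0.4, -0.2, 0.8, NA),
  status   = c(1, 0, 1, 0, 1)
)
prepared <- prepare_maxTPR_data(raw, outcome = "status")
print(prepared)
```

Checking the row count of the cleaned frame against the `Nobs` component of the returned list is a quick way to confirm how many observations were actually used.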

maxTPR documentation built on May 1, 2019, 8:41 p.m.
