Type Package
Title Targeted Gold Standard Testing
Version 1.0
Date "`r Sys.Date()`"
Authors Yizhen Xu, Tao Liu
Maintainer Yizhen Xu (yizhen_xu@alumni.brown.edu)
Description This package implements the optimal allocation of gold standard testing under constrained availability.
License GPL
URL https://github.com/yizhenxu/TGST
Depends R (>= 3.2.0)
LazyData true
Create a TGST Object
Create a TGST object, usually used as an input for optimal rule search and ROC analysis.
TGST( Z, S, phi, method="nonpar")
- Z A vector of true disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).
- S Risk score.
- phi Proportion of patients taking the gold standard test (for example, 0.1 for 10%).
- method Method for searching for the optimal tripartite rule, options are "nonpar" (default) and "semipar".
An object of class TGST. The class contains 6 slots: phi (percentage of gold standard tests), Z (true failure status), S (risk score), Rules (all possible tripartite rules), Nonparametric (logical indicator of the approach), and FNR.FPR (misclassification rates).
Yizhen Xu (yizhen_xu@brown.edu), Tao Liu, Joseph Hogan
T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
TGST( Z, S, phi, method="nonpar")
Check exponential tilt model assumption
This function provides a graphical assessment of the suitability of the exponential tilt model for the risk score when finding optimal tripartite rules by the semiparametric approach. Here $g_z$ denotes the density of the risk score $S$ among patients with $Z=z$. $$g_1(s) = \exp(\beta_0^* + \beta_1 s)\, g_0(s)$$
Check.exp.tilt( Z, S)
- Z True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).
- S Risk score.
Returns the plot of empirical density for risk score S, joint empirical density for (S,Z=1) and (S,Z=0), and the density under the exponential tilt model assumption for (S,Z=1) and (S,Z=0).
Yizhen Xu (yizhen_xu@alumni.brown.edu), Tao Liu, Joseph Hogan
T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
Check.exp.tilt( Z, S)
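Under the exponential tilt model, the log odds of $Z=1$ given $S=s$ are linear in $s$ with slope $\beta_1$, so the tilt parameters can be estimated by ordinary logistic regression. The sketch below illustrates this on synthetic normal scores (an assumption made so the true values are known: $g_0 = N(0,1)$ and $g_1 = N(1,1)$ give $\beta_0^* = -0.5$, $\beta_1 = 1$). It is not necessarily how Check.exp.tilt estimates the model internally.

```r
# Illustration only: synthetic scores for which the tilt model holds exactly.
# g0 = N(0, 1), g1 = N(1, 1)  =>  g1(s) / g0(s) = exp(-0.5 + s)
set.seed(42)                       # arbitrary seed (assumption)
n0 <- 15000                        # number of Z = 0 observations
n1 <- 5000                         # number of Z = 1 observations
S <- c(rnorm(n0, mean = 0), rnorm(n1, mean = 1))
Z <- rep(c(0, 1), times = c(n0, n1))

# logit P(Z = 1 | S = s) = log(n1 / n0) + beta0_star + beta1 * s
fit <- glm(Z ~ S, family = binomial)
beta1      <- unname(coef(fit)["S"])
beta0_star <- unname(coef(fit)["(Intercept)"]) - log(n1 / n0)

# Estimates should be close to the true values -0.5 and 1
c(beta0_star = beta0_star, beta1 = beta1)
```

The intercept correction `log(n1 / n0)` removes the prevalence term, leaving the tilt intercept $\beta_0^*$.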
Cross Validation
This function computes, by K-fold cross-validation, the average misdiagnosis rates for viral failure and the optimal risk under the min-$\lambda$ rules.
CV.TGST(Obj, lambda, K=10)
- Obj An object of class TGST.
- lambda A user-specified weight that reflects relative loss for the two types of misdiagnoses, taking value in [0,1]. $Loss=\lambda I(FN)+(1-\lambda) I(FP)$.
- K Number of folds in cross validation. The default is 10.
Cross-validated results on the misclassification rates (FNR, FPR), $\lambda$-risk, total misclassification rate, and AUC.
Yizhen Xu (yizhen_xu@alumni.brown.edu), Tao Liu, Joseph Hogan
T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
lambda = 0.5
Obj = TGST(Z, S, phi, method="nonpar")
CV.TGST(Obj, lambda, K=10)
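The K-fold procedure can be sketched in base R as follows. This is a hypothetical illustration, not the code behind CV.TGST: the helpers `risk` and `fit_rule`, the simulated data, and the classification convention (scores below l classified as Z=0, scores above u classified as Z=1, scores in [l,u] sent to an error-free gold standard test) are all assumptions.

```r
# Hypothetical K-fold sketch; not the package's CV.TGST().
set.seed(1)                                   # arbitrary seed (assumption)
n <- 300
Z <- rbinom(n, 1, 0.25)
S <- ceiling(rgamma(n, shape = c(2.3, 9.2)[Z + 1], scale = c(80, 62)[Z + 1]))
phi <- 0.1; lambda <- 0.5; K <- 5

# lambda-weighted risk of the rule [l, u] under the assumed convention
risk <- function(Z, S, l, u, lambda) {
  lambda * mean(S[Z == 1] < l) + (1 - lambda) * mean(S[Z == 0] > u)
}

# Exhaustive search over observed scores for the lowest-risk feasible rule
fit_rule <- function(Z, S, phi, lambda) {
  cuts <- sort(unique(S))
  best <- c(l = NA, u = NA, risk = Inf)
  for (l in cuts) {
    for (u in cuts[cuts >= l]) {
      if (mean(S >= l & S <= u) > phi) break  # testing budget exceeded
      r <- risk(Z, S, l, u, lambda)
      if (r < best["risk"]) best <- c(l = l, u = u, risk = r)
    }
  }
  best
}

# Assign each observation to one of K folds of equal size
fold <- sample(rep(seq_len(K), length.out = n))

# Fit the rule on K - 1 folds, evaluate its risk on the held-out fold
cv_risk <- vapply(seq_len(K), function(k) {
  train <- fold != k
  rule  <- fit_rule(Z[train], S[train], phi, lambda)
  risk(Z[!train], S[!train], rule["l"], rule["u"], lambda)
}, numeric(1))

mean(cv_risk)   # cross-validated lambda-risk
```

Averaging the held-out risks gives an estimate that is less optimistic than the risk computed on the data used to choose the rule.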
Optimal Tripartite Rule
This function gives you the optimal tripartite rule that minimizes the min-$\lambda$ risk, using the approach selected by the user. It takes the risk score and true disease status from a training data set and returns the optimal tripartite rule under the specified proportion of patients able to take the gold standard test.
OptimalRule(Obj, lambda)
- Obj An object of class TGST.
- lambda A user-specified weight that reflects relative loss for the two types of misdiagnoses, taking value in [0,1]. $Loss=\lambda I(FN)+(1-\lambda) I(FP)$.
Optimal tripartite rule.
Yizhen Xu (yizhen_xu@alumni.brown.edu), Tao Liu, Joseph Hogan
T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
lambda = 0.5
Obj = TGST(Z, S, phi, method="nonpar")
OptimalRule(Obj, lambda)
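To make the objective concrete, the following base R sketch searches by brute force for the rule [l,u] that minimizes $\lambda \cdot FNR + (1-\lambda) \cdot FPR$ subject to the testing constraint. The helpers `lambda_risk` and `best_rule` are hypothetical names, the data are simulated, and the convention that scores below l are classified as Z=0 and scores above u as Z=1 is an assumption. The package's own search may differ.

```r
# Hypothetical brute-force search; not the package's OptimalRule().
lambda_risk <- function(Z, S, l, u, lambda) {
  fnr <- mean(S[Z == 1] < l)   # failures classified as non-failure
  fpr <- mean(S[Z == 0] > u)   # non-failures classified as failure
  lambda * fnr + (1 - lambda) * fpr
}

best_rule <- function(Z, S, phi, lambda) {
  cuts <- sort(unique(S))                      # candidate cutoffs
  best <- c(l = NA, u = NA, risk = Inf)
  for (l in cuts) {
    for (u in cuts[cuts >= l]) {
      # stop widening once more than phi of patients would be tested
      if (mean(S >= l & S <= u) > phi) break
      r <- lambda_risk(Z, S, l, u, lambda)
      if (r < best["risk"]) best <- c(l = l, u = u, risk = r)
    }
  }
  best
}

set.seed(7)                                    # arbitrary seed (assumption)
n <- 400
Z <- rbinom(n, 1, 0.25)
S <- ceiling(rgamma(n, shape = c(2.3, 9.2)[Z + 1], scale = c(80, 62)[Z + 1]))
phi <- 0.1; lambda <- 0.5

opt <- best_rule(Z, S, phi, lambda)
opt
```

A larger `lambda` penalizes false negatives more heavily, which tends to push the lower cutoff l downward.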
ROC Analysis
This function performs ROC analysis for tripartite rules. If 'plot=TRUE', the ROC curve is returned.
ROCAnalysis(Obj, plot=TRUE)
- Obj An object of class TGST.
- plot Logical parameter indicating if ROC curve should be plotted. Default is 'plot=TRUE'. If false, then only AUC is calculated.
AUC (the area under ROC curve) and ROC curve.
Yizhen Xu (yizhen_xu@alumni.brown.edu), Tao Liu, Joseph Hogan
T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
lambda = 0.5
Obj = TGST(Z, S, phi, method="nonpar")
ROCAnalysis(Obj, plot=TRUE)
Nonparametric Rules Set
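This function gives you all possible cutoffs [l,u] for tripartite rules, by applying a nonparametric search to the given data. Each rule satisfies the constraint that the proportion of patients whose risk score falls in [l,u] does not exceed $\phi$: $$P(S \in [l,u]) \le \phi$$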
nonpar.rules( Z, S, phi)
- Z True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).
- S Risk score.
- phi Proportion of patients taking the viral load test (for example, 0.1 for 10%).
Matrix with 2 columns. Each row is a possible tripartite rule, with output on lower and upper cutoff.
Yizhen Xu (yizhen_xu@brown.edu), Tao Liu, Joseph Hogan
T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
nonpar.rules( Z, S, phi)
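The constraint $P(S \in [l,u]) \le \phi$ can be illustrated with a small base R sketch. The function `enumerate_rules` is a hypothetical helper, not the package's nonpar.rules; as a simplification it keeps only the widest feasible interval for each observed lower cutoff.

```r
# Hypothetical enumeration; not the package's nonpar.rules().
enumerate_rules <- function(S, phi) {
  cuts <- sort(unique(S))                # candidate cutoffs: observed scores
  out <- NULL
  for (l in cuts) {
    upper <- cuts[cuts >= l]
    # empirical proportion of patients with l <= S <= u, for each u
    frac <- vapply(upper, function(u) mean(S >= l & S <= u), numeric(1))
    ok <- upper[frac <= phi]
    if (length(ok) > 0) out <- rbind(out, c(l = l, u = max(ok)))
  }
  out                                    # one row per rule: columns l and u
}

S <- 1:10
phi <- 0.35
rules <- enumerate_rules(S, phi)
rules   # first row is l = 1, u = 3: three of ten scores fall in [1, 3]
```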
Nonparametric FNR FPR of the rules
This function gives you the nonparametric FNRs and FPRs associated with a given set of tripartite rules.
nonpar.fnr.fpr(Z, S, l, u)
- Z True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).
- S Risk score.
- l Lower cutoff of tripartite rule.
- u Upper cutoff of tripartite rule.
Matrix with 2 columns. Each row is a set of nonparametric (FNR, FPR) on an associated tripartite rule.
Yizhen Xu (yizhen_xu@brown.edu), Tao Liu, Joseph Hogan
T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
rules = nonpar.rules( Z, S, phi)
nonpar.fnr.fpr(Z,S,rules[1,1],rules[1,2])
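As a rough illustration of what empirical misclassification rates for one rule look like, the sketch below computes them directly in base R. The helper `emp_fnr_fpr` is hypothetical, and it assumes that scores below l are classified as Z=0, scores above u as Z=1, and scores in [l,u] are resolved without error by the gold standard test. The package's definitions may differ in detail.

```r
# Hypothetical helper; not the package's nonpar.fnr.fpr().
emp_fnr_fpr <- function(Z, S, l, u) {
  fnr <- mean(S[Z == 1] < l)   # share of true failures classified as Z = 0
  fpr <- mean(S[Z == 0] > u)   # share of true non-failures classified as Z = 1
  c(FNR = fnr, FPR = fpr)
}

# Toy data: four non-failures followed by four failures
Z <- c(0, 0, 0, 0, 1, 1, 1, 1)
S <- c(1, 2, 3, 6, 2, 5, 7, 8)

emp_fnr_fpr(Z, S, l = 3, u = 5)
# FNR = 0.25 (one failure has S = 2 < 3); FPR = 0.25 (one non-failure has S = 6 > 5)
```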
Semiparametric FNR FPR of the rules
This function gives you the semiparametric FNR and FPR associated with a set of given tripartite rules.
semipar.fnr.fpr(Z, S, l, u)
- Z True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).
- S Risk score.
- l Lower cutoff of tripartite rule.
- u Upper cutoff of tripartite rule.
Matrix with 2 columns. Each row is a set of semiparametric (FNR, FPR) on an associated tripartite rule.
Yizhen Xu (yizhen_xu@brown.edu), Tao Liu, Joseph Hogan
T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
rules = nonpar.rules( Z, S, phi)
semipar.fnr.fpr(Z,S,rules[1,1],rules[1,2])
Calculate AUC
This function gives you the AUC associated with the rules set.
cal.AUC(Z, S, l, u)
- Z True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).
- S Risk score.
- l Lower cutoff of tripartite rule.
- u Upper cutoff of tripartite rule.
AUC.
Yizhen Xu (yizhen_xu@brown.edu), Tao Liu, Joseph Hogan
T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
rules = nonpar.rules( Z, S, phi)
cal.AUC(Z,S,rules[,1],rules[,2])
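One common way to turn a set of (FPR, 1 - FNR) points into an area under the curve is the trapezoid rule. The sketch below shows that calculation in base R; `trapezoid_auc` is a hypothetical helper, and whether cal.AUC uses this exact construction is an assumption.

```r
# Hypothetical helper; not the package's cal.AUC().
trapezoid_auc <- function(fpr, tpr) {
  o <- order(fpr, tpr)
  x <- c(0, fpr[o], 1)         # anchor the curve at (0, 0) and (1, 1)
  y <- c(0, tpr[o], 1)
  # sum of trapezoid areas between consecutive points
  sum(diff(x) * (y[-length(y)] + y[-1]) / 2)
}

auc <- trapezoid_auc(fpr = c(0.1, 0.3, 0.6), tpr = c(0.5, 0.8, 0.9))
auc   # 0.025 + 0.13 + 0.255 + 0.38 = 0.79
```

Here `tpr` is the true positive rate, that is, 1 - FNR for each rule.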
Simulated data for package illustration
A simulated dataset containing true disease status and risk score. See details for simulation setting.
A data frame with 8000 simulated observations on the following 2 variables.
- Z True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).
- S Risk score. Higher risk score indicates larger tendency of diseased / treatment failure.
We first simulate the true failure status $Z$ assuming $Z\sim Bernoulli(p)$ with $p=0.25$; then, conditional on $Z$, we simulate $\{S \mid Z=z\}=\mathrm{ceiling}(W)$ with $W\sim Gamma(\eta_z,\kappa_z)$, where $\eta_z$ and $\kappa_z$ are the shape and scale parameters, $(\eta_0,\kappa_0)=(2.3,80)$ and $(\eta_1,\kappa_1)=(9.2,62)$.
Yizhen Xu (yizhen_xu@brown.edu), Tao Liu, Joseph Hogan
T. Liu, J. Hogan, L. Wang, S. Zhang, R. Kantor (2013) Journal of the American Statistical Association Vol.108, No.504
data(Simdata)
summary(Simdata)
plot(Simdata)
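The simulation setting described above can be reproduced approximately in base R. This is a sketch under stated assumptions: the seed is arbitrary and the result is a fresh draw, not the packaged Simdata.

```r
# Hypothetical re-creation of the simulation setting; not the packaged Simdata.
set.seed(2013)                 # arbitrary seed (assumption)
n <- 8000
p <- 0.25                      # P(Z = 1)
shape <- c(2.3, 9.2)           # eta_0, eta_1
scale <- c(80, 62)             # kappa_0, kappa_1

Z <- rbinom(n, size = 1, prob = p)
# Draw W from the gamma distribution indexed by each observation's Z
W <- rgamma(n, shape = shape[Z + 1], scale = scale[Z + 1])
S <- ceiling(W)

sim <- data.frame(Z = Z, S = S)
aggregate(S ~ Z, data = sim, FUN = mean)   # mean risk score by disease status
```

The gamma means are $\eta_z \kappa_z$, roughly 184 for $Z=0$ and 570 for $Z=1$, so the group means of `S` should differ markedly.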