CVonestep | R Documentation |
Method to compute a confidence upper bound of true coverage or select a threshold (via grid search) for APAC prediction sets based on cross-fit one-step corrected estimators
CVonestep(
  A,
  X,
  Y,
  scores,
  candidate.tau,
  error.bound = 0.05,
  conf.level = 0.95,
  nfolds = 5,
  g.control = list(SL.library = c("SL.glm", "SL.gam", "SL.randomForest")),
  Q.control = list(SL.library = c("SL.glm", "SL.gam", "SL.randomForest")),
  g.trunc = 0.01,
  select.tau = ifelse(length(candidate.tau) == 1, FALSE, TRUE)
)
A |
vector of population indicators: 1 for the source population, 0 for the target population |
X |
data frame of covariates with each row being one observation |
Y |
vector of dependent variable/outcome. For data from the target population ( |
scores |
either a function assigning scores of |
candidate.tau |
a numeric vector of candidate thresholds, default to |
error.bound |
desired bound on the prediction set coverage error between 0 and 1, default 0.05 |
conf.level |
desired level of confidence of low coverage error between 0.5 and 1, default to 0.95 |
nfolds |
number of folds for sample splitting, default to 5 |
g.control |
a named list containing options passed to |
Q.control |
a named list containing options passed to |
g.trunc |
truncation level of propensity score |
select.tau |
whether to select threshold tau (otherwise just report estimates and confidence upper bounds of coverage error for all |
If select.tau==FALSE, then a list with the following components:
tau
Input tau
error.CI.upper
The (approximate) confidence upper bound of coverage error corresponding to the input tau
error.est
The point estimate of coverage error corresponding to the input tau
Otherwise a list with the following components:
tau
Selected threshold tau, the maximal tau with (approximate) confidence upper bound of coverage error lower than error.bound
error.CI.upper
The (approximate) confidence upper bound of coverage error corresponding to the selected tau
error.est
The point estimate of coverage error corresponding to the selected tau
feasible.tau
The set of feasible thresholds tau defined by (approximate) confidence upper bounds of coverage errors being lower than error.bound
feasible.tau.error.CI.upper
The (approximate) confidence upper bounds of coverage errors corresponding to feasible.tau
feasible.tau.error.est
The point estimates of coverage errors corresponding to feasible.tau
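The selection rule described above (keep the thresholds whose confidence upper bound is below error.bound, then take the largest) can be sketched in base R. This is only an illustration of the rule as stated in this section, not the package's internal code: the numbers are made up, and returning NA when no threshold is feasible is an assumption, since this page does not say what CVonestep does in that case.

```r
# Hypothetical inputs: made-up candidate thresholds and confidence upper bounds
candidate.tau  <- c(0.1, 0.2, 0.3, 0.4)
error.CI.upper <- c(0.01, 0.03, 0.06, 0.12)
error.bound    <- 0.05

# Feasible thresholds: confidence upper bound strictly lower than error.bound
feasible     <- error.CI.upper < error.bound
feasible.tau <- candidate.tau[feasible]

# Selected threshold: the maximal feasible tau
# (NA when nothing is feasible is an assumption made for this sketch)
selected.tau <- if (any(feasible)) max(feasible.tau) else NA_real_
```

With these illustrative numbers, only the first two thresholds are feasible, so the largest of them is the one selected.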
When extremely small/large thresholds are included in candidate.tau, it is common to receive warnings/errors from the machine learning algorithms used by SuperLearner::SuperLearner. In such cases, almost all Y are included in (for small thresholds) or excluded from (for large thresholds) the corresponding prediction sets, so the algorithms are fitted to a nearly constant outcome and may complain. This is usually not an issue because the resulting predictions are still quite accurate. We also strongly encourage the user to specify a learner that can deal with such cases (e.g., SL.glm) in Q.control.
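To see which candidate thresholds are likely to trigger such warnings, one can check in advance how often the observed Y falls inside the prediction set at each threshold. The sketch below uses only base R and assumes that the prediction set at threshold tau contains the outcomes whose score is at least tau. That definition is not stated on this page, but it matches the behaviour described above (small thresholds include almost all Y). The simulated data mirror the example on this page, simplified to a single covariate distribution.

```r
set.seed(1)
n <- 100
expit <- function(x) 1 / (1 + exp(-x))

# Simulated covariate, binary outcome, and scores of the observed outcomes
X <- rnorm(n)
Y <- rbinom(n, 1, expit(1 + X))
scores <- dbinom(Y, 1, expit(0.08 + 1.1 * X))

candidate.tau <- seq(0, 0.5, length.out = 10)

# Assumed inclusion rule: Y is in the prediction set when its score >= tau
included <- sapply(candidate.tau, function(tau) mean(scores >= tau))

# Proportions close to 1 (or 0) flag thresholds with a nearly constant outcome
print(data.frame(tau = candidate.tau, prop.included = included))
```

Thresholds whose inclusion proportion is very close to 0 or 1 are the ones for which a robust learner such as SL.glm in Q.control is most useful.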
n <- 100
expit <- function(x) 1 / (1 + exp(-x))
A <- rbinom(n, 1, .5)
X <- data.frame(X = rnorm(n, sd = ifelse(A == 1, 1, .5)))
Y <- rbinom(n, 1, expit(1 + X$X))
scores <- dbinom(Y, 1, expit(.08 + 1.1 * X$X))
candidate.tau <- seq(0, .5, length.out = 10)
CVonestep(A, X, Y, scores, candidate.tau, nfolds = 2,
          g.control = list(SL.library = "SL.glm"),
          Q.control = list(SL.library = "SL.glm"))