CVonestep: Confidence upper bound of true coverage error or threshold selection based on cross-fit one-step corrected estimator (via grid search)

View source: R/CVonestep.R


Confidence upper bound of true coverage error or threshold selection based on cross-fit one-step corrected estimator (via grid search)

Description

Method to compute a confidence upper bound of the true coverage error, or to select a threshold (via grid search), for APAC prediction sets based on cross-fit one-step corrected estimators

Usage

CVonestep(
  A,
  X,
  Y,
  scores,
  candidate.tau,
  error.bound = 0.05,
  conf.level = 0.95,
  nfolds = 5,
  g.control = list(SL.library = c("SL.glm", "SL.gam", "SL.randomForest")),
  Q.control = list(SL.library = c("SL.glm", "SL.gam", "SL.randomForest")),
  g.trunc = 0.01,
  select.tau = ifelse(length(candidate.tau) == 1, FALSE, TRUE)
)

Arguments

A

vector of population indicators: 1 for the source population, 0 for the target population

X

data frame of covariates with each row being one observation

Y

vector of dependent variable/outcome. For data from the target population (A=0), set the corresponding entries of Y to NA

scores

either a function that assigns scores to Y given X, trained on an independent dataset from the source population, or a vector of this function evaluated at the observed (X,Y), taking NA for observations from the target population. If it is a function, it must take inputs (x,y), where x is one row of X (a data frame with one row) and y is a nonmissing value of Y, and return a scalar

candidate.tau

a numeric vector of candidate thresholds, defaulting to c(scores,Inf) (after scores is evaluated at the observations, if scores is a function)

error.bound

desired bound on the prediction set coverage error, between 0 and 1; defaults to 0.05

conf.level

desired confidence level for the bound on the coverage error, between 0.5 and 1; defaults to 0.95

nfolds

number of folds for sample splitting; defaults to 5

g.control

a named list containing options passed to SuperLearner::SuperLearner to estimate the propensity score g. Must not specify Y, X, newX or family. Defaults to list(SL.library=c("SL.glm","SL.gam","SL.randomForest"))

Q.control

a named list containing options passed to SuperLearner::SuperLearner to estimate the conditional coverage error Q. Must not specify Y, X, newX or family. Defaults to list(SL.library=c("SL.glm","SL.gam","SL.randomForest"))

g.trunc

truncation level of the propensity score g away from zero; defaults to 0.01

select.tau

whether to select the threshold tau (otherwise, estimates and confidence upper bounds of the coverage error are reported for all candidate.tau); defaults to TRUE if length(candidate.tau)>1 and FALSE if length(candidate.tau)==1
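As a sketch of the required signature when scores is supplied as a function: the quadratic score below is purely hypothetical (not part of the package), but it shows the (x,y) interface and how the corresponding vector form, with NA for target-population observations, can be produced.

```r
# Hypothetical score function: squared error of y against a constant fitted
# mean of 0.5 (x is accepted but ignored here, for simplicity).
# x must be a one-row data frame; y must be a nonmissing value of Y.
scores.fun <- function(x, y) {
  (y - 0.5)^2  # returns a scalar
}

# Evaluating it row by row gives the vector form of the scores argument,
# keeping NA for target-population observations (where Y is NA):
X <- data.frame(X = c(0.1, -0.3, 1.2))
Y <- c(1, NA, 0)
scores <- mapply(
  function(i, y) if (is.na(y)) NA else scores.fun(X[i, , drop = FALSE], y),
  seq_len(nrow(X)), Y
)
```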

Value

If select.tau==FALSE, then a list with the following components:

tau

The input thresholds candidate.tau

error.CI.upper

The (approximate) confidence upper bound of coverage error corresponding to the input tau

error.est

The point estimate of coverage error corresponding to the input tau

Otherwise a list with the following components:

tau

Selected threshold tau: the maximal tau whose (approximate) confidence upper bound of coverage error is lower than error.bound

error.CI.upper

The (approximate) confidence upper bound of coverage error corresponding to the selected tau

error.est

The point estimate of coverage error corresponding to the selected tau

feasible.tau

The set of feasible thresholds tau, defined as those whose (approximate) confidence upper bounds of coverage error are lower than error.bound

feasible.tau.error.CI.upper

The (approximate) confidence upper bounds of coverage errors corresponding to feasible.tau

feasible.tau.error.est

The point estimates of coverage errors corresponding to feasible.tau
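When select.tau is TRUE, the selection rule described above amounts to the following sketch (a simplified illustration with made-up numbers, not the package's internal code):

```r
# Hypothetical grid of thresholds and their confidence upper bounds
# on the coverage error
candidate.tau <- c(0.1, 0.2, 0.3, 0.4)
error.CI.upper <- c(0.01, 0.03, 0.04, 0.08)
error.bound <- 0.05

# Feasible thresholds: confidence upper bound below the desired error bound
feasible <- error.CI.upper < error.bound
feasible.tau <- candidate.tau[feasible]

# Selected threshold: the maximal feasible tau
tau <- max(feasible.tau)
```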

Warnings/Errors due to extreme candidate thresholds

When extremely small or large thresholds are included in candidate.tau, it is common to receive warnings or errors from the machine learning algorithms used by SuperLearner::SuperLearner: in such cases, almost all Y are included in (for small thresholds) or excluded from (for large thresholds) the corresponding prediction sets, so the outcome being regressed is nearly constant. This is usually not an issue because the resulting predictions are still quite accurate. We also strongly encourage the user to specify a learner that can handle such cases (e.g., SL.glm) in Q.control.
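For instance, one way to keep a simple learner available alongside more flexible ones is to include SL.glm in the library passed via Q.control (the particular library mix below is only a suggestion):

```r
# Include SL.glm so that a simple, robust learner is available even when
# the outcome is nearly constant at extreme thresholds
Q.control <- list(SL.library = c("SL.glm", "SL.gam", "SL.randomForest"))
```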

Examples

n <- 100
expit <- function(x) 1/(1 + exp(-x))
A <- rbinom(n, 1, 0.5)                    # 1 = source, 0 = target
X <- data.frame(X = rnorm(n, sd = ifelse(A == 1, 1, 0.5)))
Y <- rbinom(n, 1, expit(1 + X$X))
Y[A == 0] <- NA                           # outcome unobserved in target population
scores <- dbinom(Y, 1, expit(0.08 + 1.1 * X$X))  # NA for target observations
candidate.tau <- seq(0, 0.5, length.out = 10)
CVonestep(A, X, Y, scores, candidate.tau, nfolds = 2,
          g.control = list(SL.library = "SL.glm"),
          Q.control = list(SL.library = "SL.glm"))

QIU-Hongxiang-David/APACpredset documentation built on Aug. 11, 2022, 12:53 p.m.