RS: Confidence upper bound of true coverage error or threshold...
In QIU-Hongxiang-David/APACpredset: Asymptotically Probably Approximately Correct prediction sets

RS	R Documentation

Confidence upper bound of true coverage error or threshold selection based on Rejection Sampling & Binomial Proportion confidence upper bound

Description

Method to compute a confidence upper bound of the true coverage error, or to select a threshold, for APAC prediction sets, based on Rejection Sampling & Binomial Proportion confidence upper bound.
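
The two ingredients named above can be illustrated with a generic base-R sketch. It is not the package's implementation: the likelihood ratio is assumed to be known exactly, and `lr`, `B`, and `miss` are hypothetical names and simulated values introduced only for this illustration.

```r
set.seed(1)
n <- 2000
x <- rnorm(n)                      # covariate drawn from the source population, N(0, 1)
lr <- 2 * exp(-1.5 * x^2)          # density ratio of N(0, 0.5^2) over N(0, 1), assumed known
B <- 2                             # upper bound on the likelihood ratio

# Rejection sampling: keep each source point with probability lr / B,
# so that the kept points behave like draws from the target population.
accept <- runif(n) < lr / B

# Hypothetical miscoverage indicators (1 = outcome falls outside the prediction set).
miss <- rbinom(n, 1, 0.03)
k <- sum(miss[accept])
m <- sum(accept)

# One-sided exact (Clopper-Pearson) upper confidence bound for a binomial proportion.
upper <- binom.test(k, m, alternative = "less", conf.level = 0.95)$conf.int[2]
c(estimate = k / m, upper = upper)
```

A smaller valid `B` accepts more points, which gives a tighter bound. This is in line with the remark under `LR.bound` that smaller valid upper bounds lead to better performance.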

Usage

RS(
  A,
  X,
  Y,
  scores,
  candidate.tau,
  LR.bound = NULL,
  error.bound = 0.05,
  conf.level = 0.95,
  train.prop = 0.5,
  g.control = list(SL.library = c("SL.glm", "SL.gam", "SL.randomForest")),
  Q.control = list(SL.library = c("SL.glm", "SL.gam", "SL.randomForest")),
  g.trunc = 0.01,
  select.tau = ifelse(length(candidate.tau) == 1, FALSE, TRUE)
)

Arguments

`A`	vector of population indicators: 1 for the source population, 0 for the target population
`X`	data frame of covariates with each row being one observation
`Y`	vector of the dependent variable/outcome. For data from the target population (`A=0`), set the corresponding entries of `Y` to `NA`
`scores`	either a function that assigns a score to `Y` given `X`, trained on an independent dataset from the source population, or a vector of this function evaluated at the observed `(X,Y)`, with `NA` for observations from the target population. If it is a function, it must take input `(x,y)`, where `x` is one row of `X` (a data frame with one row) and `y` is a nonmissing value of `Y`, and output a scalar
`candidate.tau`	a numeric vector of candidate thresholds, defaults to `c(scores,Inf)` (after `scores` is evaluated at the observations if `scores` is a function). If `candidate.tau` has length 1, only the point estimate and confidence upper bound of the true coverage error of this threshold are computed
`LR.bound`	known upper bound on the likelihood ratio between the target population and the source population. As long as `LR.bound` is a valid upper bound, smaller values lead to better performance. If `NULL`, an ad hoc choice is used: the maximum of the estimated likelihood ratio at the observations in the testing data. Defaults to `NULL`
`error.bound`	desired bound on the prediction set coverage error, between 0 and 1, defaults to 0.05
`conf.level`	desired confidence level, between 0.5 and 1, that the coverage error is low (at most `error.bound`), defaults to 0.95
`train.prop`	proportion of the data used as training data to estimate nuisance functions, defaults to 0.5
`g.control`	a named list containing options passed to `SuperLearner::SuperLearner` to estimate the propensity score `g`. Must not specify `Y`, `X`, `newX` or `family`. Defaults to `list(SL.library=c("SL.glm","SL.gam","SL.randomForest"))`
`Q.control`	a named list containing options passed to `SuperLearner::SuperLearner` to estimate the conditional coverage error `Q`. Must not specify `Y`, `X`, `newX` or `family`. Defaults to `list(SL.library=c("SL.glm","SL.gam","SL.randomForest"))`
`g.trunc`	level at which the propensity score `g` is truncated away from zero, defaults to 0.01
`select.tau`	whether to select the threshold tau (otherwise just report estimates and confidence upper bounds of the coverage error for all `candidate.tau`), defaults to `TRUE` if `length(candidate.tau)>1` and `FALSE` if `length(candidate.tau)==1`
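
The two accepted forms of `scores` can be sketched as follows. The function `score_fun` and the simulated data are hypothetical and only illustrate the `(x,y)` interface described above, with `NA` entries for target-population rows.

```r
expit <- function(x) 1 / (1 + exp(-x))

# Hypothetical score function: the modelled probability of the observed y given x.
# `x` is a one-row data frame, `y` is a nonmissing outcome; the result is a scalar.
score_fun <- function(x, y) dbinom(y, 1, expit(0.08 + 1.1 * x$X))

set.seed(1)
n <- 6
A <- c(1, 0, 1, 1, 0, 1)                               # 1 = source, 0 = target
X <- data.frame(X = rnorm(n))
Y <- ifelse(A == 1, rbinom(n, 1, expit(1 + X$X)), NA)  # outcome missing in the target

# Vector form: evaluate the function row by row, leaving NA where Y is missing.
score_vec <- vapply(seq_len(n), function(i) {
  if (is.na(Y[i])) NA_real_ else score_fun(X[i, , drop = FALSE], Y[i])
}, numeric(1))
score_vec
```

Either `score_fun` or `score_vec` would then be a candidate value for the `scores` argument.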

Value

If select.tau==FALSE, then a list with the following components:

tau: Input tau
error.CI.upper: The (approximate) confidence upper bound of coverage error corresponding to the input tau
error.est: The point estimate of coverage error corresponding to the input tau

Otherwise a list with the following components:

tau: Selected threshold tau, the maximal tau with (approximate) confidence upper bound of coverage error lower than error.bound
error.CI.upper: The (approximate) confidence upper bound of coverage error corresponding to the selected tau
error.est: The point estimate of coverage error corresponding to the selected tau
feasible.tau: The set of feasible thresholds tau defined by (approximate) confidence upper bounds of coverage errors being lower than error.bound
feasible.tau.error.CI.upper: The (approximate) confidence upper bounds of coverage errors corresponding to feasible.tau
feasible.tau.error.est: The point estimates of coverage errors corresponding to feasible.tau
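
The selection rule described above (the maximal tau whose confidence upper bound is lower than error.bound) can be sketched in base R. The numbers are made up for illustration and are not output of RS; returning `NA` when no threshold is feasible is an assumption of this sketch, not documented behaviour.

```r
# Illustrative values only, not produced by RS.
candidate.tau  <- c(0.1, 0.2, 0.3, 0.4)
error.CI.upper <- c(0.01, 0.03, 0.045, 0.08)
error.bound    <- 0.05

# Feasible thresholds: confidence upper bound of coverage error below error.bound.
feasible     <- error.CI.upper < error.bound
feasible.tau <- candidate.tau[feasible]

# Selected threshold: the largest feasible one (NA if none, by assumption).
selected.tau <- if (any(feasible)) max(feasible.tau) else NA_real_
selected.tau
```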

Warnings/Errors due to extreme candidate thresholds

When extremely small or large thresholds are included in candidate.tau, it is common to receive warnings or errors from the machine learning algorithms used by SuperLearner::SuperLearner. In such cases almost all Y are included in (for small thresholds) or excluded from (for large thresholds) the corresponding prediction sets, so the coverage indicator that these algorithms are fit to is nearly constant. This is usually not an issue because the resulting predictions are still quite accurate.
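
A quick way to see the effect is to tabulate how often an observed outcome would be excluded at each threshold. The sketch below assumes that the prediction set keeps the outcomes whose score is at least tau, which matches the description above but is not spelled out on this page; the scores are simulated and hypothetical.

```r
set.seed(1)
scores <- runif(200)                     # hypothetical scores in [0, 1]
candidate.tau <- c(0.001, 0.5, 0.999)

# Fraction of observed outcomes excluded from the set {y : score >= tau}.
excluded <- sapply(candidate.tau, function(tau) mean(scores < tau))
data.frame(tau = candidate.tau, excluded = excluded)
```

At the two extreme thresholds the exclusion indicator is almost constant, which is the situation in which learners tend to complain.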

Examples

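library(APACpredset)
set.seed(1)  # for reproducibility

n <- 100
expit <- function(x) 1 / (1 + exp(-x))
# A = 1: source population, A = 0: target population
A <- rbinom(n, 1, .5)
# covariate shift: X has a smaller spread in the target population
X <- data.frame(X = rnorm(n, sd = ifelse(A == 1, 1, .5)))
Y <- rbinom(n, 1, expit(1 + X$X))
# scores: modelled probability of the observed Y given X
scores <- dbinom(Y, 1, expit(.08 + 1.1 * X$X))
candidate.tau <- seq(0, .5, length.out = 10)
# density ratio of X (target over source) is 2 * exp(-1.5 * x^2) <= 2, so 4 is a valid bound
LR.bound <- 4
RS(A, X, Y, scores, candidate.tau, LR.bound,
   g.control = list(SL.library = "SL.glm"),
   Q.control = list(SL.library = "SL.glm"))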

QIU-Hongxiang-David/APACpredset documentation built on Aug. 11, 2022, 12:53 p.m.
