| conformal | R Documentation |
conformal is a framework for weighted and unweighted conformal inference for continuous
outcomes. It supports both weighted split conformal inference and weighted CV+,
including weighted Jackknife+ as a special case. For each type, it supports both conformalized
quantile regression (CQR) and standard conformal inference based on conditional mean estimation.
conformal(
  X,
  Y,
  type = c("CQR", "mean"),
  side = c("two", "above", "below"),
  quantiles = NULL,
  outfun = NULL,
  outparams = list(),
  wtfun = NULL,
  useCV = FALSE,
  trainprop = 0.75,
  trainid = NULL,
  nfolds = 10,
  idlist = NULL
)
X |
covariates. |
Y |
outcome vector. |
type |
a string that takes values in {"CQR", "mean"}. |
side |
a string that takes values in {"two", "above", "below"}. See Details. |
quantiles |
a scalar or a vector of length 2 depending on side. See Details. |
outfun |
a function that models the conditional mean/quantiles, or a valid string.
The default is random forest when |
outparams |
a list of other parameters to be passed into outfun. |
wtfun |
NULL for unweighted conformal inference, or a function for weighted conformal inference
when |
useCV |
FALSE for split conformal inference and TRUE for CV+. |
trainprop |
proportion of units for training |
trainid |
indices of training units. The default is NULL, generating random indices. Used only when useCV = FALSE. |
nfolds |
number of folds. The default is 10. Used only when useCV = TRUE. |
idlist |
a list of indices of length |
When side = "two", CQR (two-sided) produces intervals in the form of
[q_{α_{lo}}(x) - η, q_{α_{hi}}(x) + η]
where q_{α_{lo}}(x) and q_{α_{hi}}(x) are estimates of the conditional quantiles of Y given X, and standard conformal inference produces two-sided intervals of the form
[m(x) - η, m(x) + η]
where m(x) is an estimate of the conditional mean/median of Y given X. When side = "above",
intervals are of the form [-Inf, a(x)], and when side = "below", they are of the form [a(x), Inf].
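The two-sided interval [m(x) - η, m(x) + η] can be illustrated with a minimal base-R sketch of unweighted split conformal inference. It is not the package's implementation: the linear model, the toy data, and the variable names are assumptions made for illustration, and η is taken as the usual finite-sample conformal quantile of the absolute residuals on the calibration fold.

```r
# Minimal split conformal sketch (mean-based, two-sided) in base R.
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 2), nrow = n)
Y <- as.numeric(X %*% c(1, 1) + rnorm(n))
Xtest <- matrix(rnorm(5 * 2), nrow = 5)

# Split into training and calibration folds (75% / 25%)
trainid <- sample(n, floor(0.75 * n))
dat <- data.frame(Y = Y, X)
fit <- lm(Y ~ ., data = dat[trainid, ])

# Non-conformity scores |Y - m(X)| on the calibration fold
calib <- dat[-trainid, ]
score <- abs(calib$Y - as.numeric(predict(fit, calib)))

# eta is the ceiling((1 - alpha) * (ncal + 1))-th smallest score
alpha <- 0.1
ncal <- length(score)
k <- ceiling((1 - alpha) * (ncal + 1))
eta <- if (k > ncal) Inf else sort(score)[k]

# Interval [m(x) - eta, m(x) + eta] at each test point
mtest <- as.numeric(predict(fit, data.frame(Xtest)))
interval <- data.frame(lower = mtest - eta, upper = mtest + eta)
```

Every interval in this sketch has the same width, 2η, which is the main practical difference from CQR, where the width varies with x through the quantile estimates.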
quantiles should be given when type = "CQR". When side = "two", quantiles
should be a vector of length 2, giving α_{lo} and α_{hi}. When side = "above"
or side = "below", only one quantile should be given.
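For the two-sided CQR case, the following base-R sketch shows one way η could be obtained from a pair of quantile estimates. It assumes the standard CQR score max{q_{α_{lo}}(X) - Y, Y - q_{α_{hi}}(X)}; the quantile estimator used here (a linear fit shifted by empirical residual quantiles) and the helper name qfun are stand-ins chosen only to keep the example self-contained, not the learners shipped with the package.

```r
# Two-sided CQR sketch in base R with a stand-in quantile estimator.
set.seed(1)
n <- 400
X <- matrix(rnorm(n * 2), nrow = n)
Y <- as.numeric(X %*% c(1, 1) + rnorm(n))
dat <- data.frame(Y = Y, X)
trainid <- sample(n, floor(0.75 * n))
train <- dat[trainid, ]
calib <- dat[-trainid, ]

# quantiles = c(alpha_lo, alpha_hi), as required when side = "two"
quantiles <- c(0.05, 0.95)
fit <- lm(Y ~ ., data = train)
shift <- quantile(residuals(fit), probs = quantiles, names = FALSE)

# Hypothetical helper: two-column matrix of lower/upper quantile estimates
qfun <- function(newdata) {
  m <- as.numeric(predict(fit, newdata))
  cbind(m + shift[1], m + shift[2])
}

# CQR non-conformity scores on the calibration fold
qcal <- qfun(calib)
score <- pmax(qcal[, 1] - calib$Y, calib$Y - qcal[, 2])

# eta is the conformal quantile of the scores (it may be negative)
alpha <- 0.1
ncal <- length(score)
k <- ceiling((1 - alpha) * (ncal + 1))
eta <- if (k > ncal) Inf else sort(score)[k]

# Interval [q_lo(x) - eta, q_hi(x) + eta] at each test point
Xtest <- data.frame(matrix(rnorm(5 * 2), nrow = 5))
qtest <- qfun(Xtest)
interval <- data.frame(lower = qtest[, 1] - eta, upper = qtest[, 2] + eta)
```

Because the score is signed, η shrinks the interval when the initial quantile estimates are too wide and enlarges it when they are too narrow.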
outfun can be a valid string, including
"RF" for random forest that predicts the conditional mean, a wrapper built on the randomForest package.
Used when type = "mean".
"quantRF" for quantile random forest that predicts the conditional quantiles, a wrapper built on the
grf package. Used when type = "CQR".
"Boosting" for gradient boosting that predicts the conditional mean, a wrapper built on the gbm
package. Used when type = "mean".
"quantBoosting" for quantile gradient boosting that predicts the conditional quantiles, a wrapper built on the
gbm package. Used when type = "CQR".
"BART" for Bayesian additive regression trees (BART) that predict the conditional mean, a wrapper built on the bartMachine
package. Used when type = "mean".
"quantBART" for quantile BART that predicts the conditional quantiles, a wrapper built on the
bartMachine package. Used when type = "CQR".
or a function object whose inputs must include, but are not limited to,
Y for outcome in the training data.
X for covariates in the training data.
Xtest for covariates in the testing data.
When type = "CQR", outfun should also include an argument quantiles that is either
a vector of length 2 or a scalar, depending on the argument side.
The output of outfun must be a matrix with two columns giving the conditional quantile
estimates when quantiles is a vector of length 2; otherwise, it must be a vector giving
the conditional quantile estimate or conditional mean estimate.
Other optional arguments can be passed into outfun through outparams.
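As an illustration of this contract, here is a sketch of a user-defined learner for type = "CQR" written in base R only. The name normalQuantLM and the modelling choice (a linear fit with Gaussian residual quantiles) are hypothetical; the point is only the argument list (Y, X, Xtest, quantiles) and the shape of the return value.

```r
# Hypothetical learner: linear regression with Gaussian residual quantiles.
# Returns a two-column matrix when length(quantiles) == 2, else a vector.
normalQuantLM <- function(Y, X, Xtest, quantiles, ...) {
  train <- data.frame(Y = as.numeric(Y), as.data.frame(X))
  fit <- lm(Y ~ ., data = train)
  m <- as.numeric(predict(fit, as.data.frame(Xtest)))
  s <- summary(fit)$sigma  # residual standard error
  if (length(quantiles) == 1) {
    m + s * qnorm(quantiles)
  } else {
    cbind(m + s * qnorm(quantiles[1]), m + s * qnorm(quantiles[2]))
  }
}

# Check the output shapes on toy data
set.seed(1)
Xtoy <- matrix(rnorm(100 * 3), nrow = 100)
Ytoy <- as.numeric(Xtoy %*% rep(1, 3) + rnorm(100))
Xnew <- matrix(rnorm(5 * 3), nrow = 5)
res2 <- normalQuantLM(Ytoy, Xtoy, Xnew, quantiles = c(0.05, 0.95))
res1 <- normalQuantLM(Ytoy, Xtoy, Xnew, quantiles = 0.95)
```

A function of this shape would be supplied as outfun = normalQuantLM, in the same way as the self-defined quantRF in the Examples.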
wtfun is NULL for unweighted conformal inference. For weighted split conformal inference, it is a
function with a required input X that produces a vector of non-negative reals of length nrow(X).
For weighted CV+, it can be a function as in the case useCV = FALSE so that the same function will
apply to each fold, or a list of functions of length nfolds so that wtfun[[k]] is applied to fold k.
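To make the role of the weights concrete, the sketch below computes a weighted conformal cutoff from calibration scores, calibration weights, and the weight at a test point. It follows the usual weighted conformal construction (normalized weights on the calibration scores plus a point mass at +Inf for the test point) and is an assumption about the general technique rather than a copy of the package's internals; the function name weightedEta is hypothetical.

```r
# Hypothetical helper: weighted (1 - alpha) quantile of the calibration scores,
# with the test point's weight placed on +Inf.
weightedEta <- function(score, wt, wtest, alpha) {
  ord <- order(score)
  score <- score[ord]
  p <- wt[ord] / (sum(wt) + wtest)  # normalized calibration weights
  idx <- which(cumsum(p) >= 1 - alpha)
  if (length(idx) == 0) Inf else score[idx[1]]
}

# A weight function with the required input X, returning non-negative reals
wtfun <- function(X) {
  pnorm(X[, 1])
}

set.seed(1)
Xcal <- matrix(rnorm(50 * 2), nrow = 50)
xtest <- matrix(rnorm(2), nrow = 1)
score <- abs(rnorm(50))

# Weighted cutoff for this test point
eta_w <- weightedEta(score, wtfun(Xcal), wtfun(xtest), alpha = 0.1)

# With constant weights the cutoff reduces to the unweighted conformal quantile
eta_u <- weightedEta(score, rep(1, 50), 1, alpha = 0.1)
```

Note that in this construction the cutoff depends on the test point through its weight, so it is computed once per test point rather than once per calibration fold.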
a conformalSplit object when useCV = FALSE with the following attributes:
Yscore: a vector of non-conformity scores on the calibration fold
wt: a vector of weights on the calibration fold
Ymodel: a function with required argument X that produces estimates of the conditional
mean or quantiles of Y given X
wtfun, type, side, quantiles, trainprop, trainid: the same as inputs
or a conformalCV object when useCV = TRUE with the following attributes:
info: a list of length nfolds with each element being a list with attributes
Yscore, wt and Ymodel described above for each fold
wtfun, type, side, quantiles, nfolds, idlist: the same as inputs
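The per-fold structure (one model and one set of scores per fold) mirrors how CV+ combines folds. The base-R sketch below works through unweighted, mean-based CV+ for a single test point, assuming the standard CV+ construction; the linear model and all variable names are illustrative and are not taken from the package.

```r
# Unweighted CV+ sketch (mean-based, two-sided) for one test point.
set.seed(1)
n <- 200
nfolds <- 10
X <- matrix(rnorm(n * 2), nrow = n)
Y <- as.numeric(X %*% c(1, 1) + rnorm(n))
dat <- data.frame(Y = Y, X)
xtest <- data.frame(matrix(rnorm(2), nrow = 1))

# Random partition of 1:n into nfolds index sets (the role played by idlist)
idlist <- split(sample(n), rep(1:nfolds, length.out = n))

lo <- numeric(n)
hi <- numeric(n)
for (k in seq_len(nfolds)) {
  id <- idlist[[k]]
  fit <- lm(Y ~ ., data = dat[-id, ])                # model fitted without fold k
  score <- abs(dat$Y[id] - predict(fit, dat[id, ]))  # held-out residuals
  mtest <- as.numeric(predict(fit, xtest))           # that model at the test point
  lo[id] <- mtest - score
  hi[id] <- mtest + score
}

# Lower and upper endpoints are order statistics of the n candidate endpoints
alpha <- 0.1
klo <- floor(alpha * (n + 1))
khi <- ceiling((1 - alpha) * (n + 1))
lower <- if (klo < 1) -Inf else sort(lo)[klo]
upper <- if (khi > n) Inf else sort(hi)[khi]
```

Setting nfolds equal to n in this sketch gives the Jackknife+ special case, at the cost of fitting n models.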
predict.conformalSplit, predict.conformalCV.
# Generate data from a linear model
set.seed(1)
n <- 1000
d <- 5
X <- matrix(rnorm(n * d), nrow = n)
beta <- rep(1, d)
Y <- X %*% beta + rnorm(n)
# Generate testing data
ntest <- 5
Xtest <- matrix(rnorm(ntest * d), nrow = ntest)
# Run unweighted split CQR with the built-in quantile random forest learner
# grf package needs to be installed
obj <- conformal(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
outfun = "quantRF", wtfun = NULL, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
# Run unweighted standard split conformal inference with the built-in random forest learner
# randomForest package needs to be installed
obj <- conformal(X, Y, type = "mean",
outfun = "RF", wtfun = NULL, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
# Run unweighted CQR-CV+ with the built-in quantile random forest learner
# grf package needs to be installed
obj <- conformal(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
outfun = "quantRF", wtfun = NULL, useCV = TRUE)
predict(obj, Xtest, alpha = 0.1)
# Run unweighted standard CV+ with the built-in random forest learner
# randomForest package needs to be installed
obj <- conformal(X, Y, type = "mean",
outfun = "RF", wtfun = NULL, useCV = TRUE)
predict(obj, Xtest, alpha = 0.1)
# Run weighted split CQR with w(x) = pnorm(x1)
wtfun <- function(X){pnorm(X[, 1])}
obj <- conformal(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
outfun = "quantRF", wtfun = wtfun, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
# Run unweighted split CQR with a self-defined quantile random forest
# Y, X, Xtest, quantiles should be included in the inputs
quantRF <- function(Y, X, Xtest, quantiles, ...){
  fit <- grf::quantile_forest(X, Y, quantiles = quantiles, ...)
  res <- predict(fit, Xtest, quantiles = quantiles)
  if (is.list(res) && !is.data.frame(res)){
    # for the recent update of the grf package that
    # changes the output format
    res <- res$predictions
  }
  if (length(quantiles) == 1){
    res <- as.numeric(res)
  } else {
    res <- as.matrix(res)
  }
  return(res)
}
obj <- conformal(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
outfun = quantRF, wtfun = NULL, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
# Run unweighted standard split conformal inference with a self-defined linear regression
# Y, X, Xtest should be included in the inputs
linearReg <- function(Y, X, Xtest){
  X <- as.data.frame(X)
  Xtest <- as.data.frame(Xtest)
  data <- data.frame(Y = Y, X)
  fit <- lm(Y ~ ., data = data)
  as.numeric(predict(fit, Xtest))
}
obj <- conformal(X, Y, type = "mean",
outfun = linearReg, wtfun = NULL, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
# Run weighted split-CQR with user-defined weights
wtfun <- function(X){
  pnorm(X[, 1])
}
obj <- conformal(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
outfun = "quantRF", wtfun = wtfun, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
# Run weighted CQR-CV+ with user-defined weights
# Use a list of identical functions
set.seed(1)
wtfun_list <- lapply(1:10, function(i){wtfun})
obj1 <- conformal(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
outfun = "quantRF", wtfun = wtfun_list, useCV = TRUE)
predict(obj1, Xtest, alpha = 0.1)
# Use a single function. Equivalent to the above approach
set.seed(1)
obj2 <- conformal(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
outfun = "quantRF", wtfun = wtfun, useCV = TRUE)
predict(obj2, Xtest, alpha = 0.1)