Home

/

GitHub

/

lihualei71/cfcausal

/

conformalCf: Conformal inference for counterfactuals

conformalCf: Conformal inference for counterfactuals
In lihualei71/cfcausal: Conformal inference of counterfactuals and individual treatment effects

View source: R/conformalCf.R

conformalCf

R Documentation

Conformal inference for counterfactuals

Description

conformalCf computes intervals for counterfactuals or outcomes with ignorable missing values in general. It supports both split conformal inference and CV+, including weighted Jackknife+ as a special case. For each type, it supports both conformalized quantile regression (CQR) and standard conformal inference based on conditional mean regression.

Usage

conformalCf(
  X,
  Y,
  estimand = c("unconditional", "nonmissing", "missing"),
  type = c("CQR", "mean"),
  side = c("two", "above", "below"),
  quantiles = NULL,
  outfun = NULL,
  outparams = list(),
  psfun = NULL,
  psparams = list(),
  useCV = FALSE,
  trainprop = 0.75,
  nfolds = 10
)

Arguments

`X`	covariates.
`Y`	outcome vector with missing values encoded as NA. See Details.
`estimand`	a string that takes values in {"unconditional", "nonmissing", "missing"}. See Details.
`type`	a string that takes values in {"CQR", "mean"}.
`side`	a string that takes values in {"two", "above", "below"}. See Details.
`quantiles`	a scalar or a vector of length 2 depending on `side`. Used only when `type = "CQR"`. See Details.
`outfun`	a function that models the conditional mean or quantiles, or a valid string. The default is random forest when `type = "mean"` and quantile random forest when `type = "CQR"`. See Details.
`outparams`	a list of other parameters to be passed into `outfun`.
`psfun`	a function that models the missing mechanism (probability of missing given X), or a valid string. The default is "Boosting". See Details.
`psparams`	a list of other parameters to be passed into `psfun`.
`useCV`	FALSE for split conformal inference and TRUE for CV+.
`trainprop`	proportion of units for training `outfun`. The default if 75%. Used only when `useCV = FALSE`.
`nfolds`	number of folds. The default is 10. Used only when `useCV = TRUE`.

Details

The outcome Y must comprise both observed values and missing values encoded as NA. The missing values are used to estimate the propensity score P(missing | X).

estimand controls the type of coverage to be guaranteed:

(Default) when estimand = "unconditional", the interval has P(Y \in \hat{C}(X))≥ 1 - α.
When estimand = "nonmissing", the interval has P(Y \in \hat{C}(X) | nonmissing) ≥ 1 - α.
When estimand = "missing", the interval has P(Y \in \hat{C}(X) | missing) ≥ 1 - α.

When side = "above", intervals are of form [-Inf, a(x)] and when side = "below" the intervals are of form [a(x), Inf].

outfun can be a valid string, including

"RF" for random forest that predicts the conditional mean, a wrapper built on randomForest package. Used when type = "mean".
"quantRF" for quantile random forest that predicts the conditional quantiles, a wrapper built on grf package. Used when type = "CQR".
"Boosting" for gradient boosting that predicts the conditional mean, a wrapper built on gbm package. Used when type = "mean".
"quantBoosting" for quantile gradient boosting that predicts the conditional quantiles, a wrapper built on gbm package. Used when type = "CQR".
"BART" for gradient boosting that predicts the conditional mean, a wrapper built on bartMachine package. Used when type = "mean".
"quantBART" for quantile gradient boosting that predicts the conditional quantiles, a wrapper built on bartMachine package. Used when type = "CQR".

or a function object whose input must include, but not limited to

Y for outcome in the training data.
X for covariates in the training data.
Xtest for covariates in the testing data.

When type = "CQR", outfun should also include an argument quantiles that is either a vector of length 2 or a scalar, depending on the argument side. The output of outfun must be a matrix with two columns giving the conditional quantile estimates when quantiles is a vector of length 2; otherwise, it must be a vector giving the conditional quantile estimate or conditional mean estimate. Other optional arguments can be passed into outfun through outparams.

psfun can be a valid string, including

"RF" for random forest that predicts the propensity score, a wrapper built on randomForest package. Used when type = "mean".
"Boosting" for gradient boosting that predicts the propensity score, a wrapper built on gbm package. Used when type = "mean".

or a function object whose input must include, but not limited to

Y for treatment assignment, a binary vector, in the training data.
X for covariates in the training data.
Xtest for covariates in the testing data.

The output of psfun must be a vector of predicted probabilities. Other optional arguments can be passed into psfun through psparams.

Value

a conformalSplit object when useCV = FALSE or a conformalCV object

Examples

# Generate data from a linear model
set.seed(1)
n <- 1000
d <- 5
X <- matrix(rnorm(n * d), nrow = n)
beta <- rep(1, 5)
Y <- X %*% beta + rnorm(n)

# Generate missing indicators
missing_prob <- pnorm(X[, 1])
if_missing <- missing_prob < runif(n)
Y[if_missing] <- NA

# Generate testing data
ntest <- 5
Xtest <- matrix(rnorm(ntest * d), nrow = ntest)

# Run weighted split CQR
obj <- conformalCf(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
                   outfun = "quantRF", useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)

# Run weighted standard conformal inference
obj <- conformalCf(X, Y, type = "mean",
                   outfun = "RF", useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)

# Run one-sided weighted split CQR
obj1 <- conformalCf(X, Y, type = "CQR", side = "above",
                    quantiles = 0.95, outfun = "quantRF", useCV = FALSE)
predict(obj1, Xtest, alpha = 0.1)
obj2 <- conformalCf(X, Y, type = "CQR", side = "below",
                    quantiles = 0.05, outfun = "quantRF", useCV = FALSE)
predict(obj2, Xtest, alpha = 0.1)

# Run split CQR with a self-defined quantile random forest
# Y, X, Xtest, quantiles should be included in the inputs
quantRF <- function(Y, X, Xtest, quantiles, ...){
    fit <- grf::quantile_forest(X, Y, quantiles = quantiles, ...)
    res <- predict(fit, Xtest, quantiles = quantiles)
    if (is.list(res) && !is.data.frame(res)){
    # for the recent update of \code{grf} package that
    # changes the output format
        res <- res$predictions 
    }
    if (length(quantiles) == 1){
        res <- as.numeric(res)
    } else {
        res <- as.matrix(res)
    }
    return(res)
}
obj <- conformalCf(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
                   outfun = quantRF, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)

# Run standard split conformal inference with a self-defined linear regression
# Y, X, Xtest should be included in the inputs
linearReg <- function(Y, X, Xtest){
    X <- as.data.frame(X)
    Xtest <- as.data.frame(Xtest)
    data <- data.frame(Y = Y, X)
    fit <- lm(Y ~ ., data = data)
    as.numeric(predict(fit, Xtest))
}
obj <- conformalCf(X, Y, type = "mean",
                   outfun = linearReg, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)

# Run split CQR with a built-in psfun
# Y, X, Xtest, should be included in the inputs
obj <- conformalCf(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
                   outfun = "quantRF", psfun = "RF", useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)

# Run split CQR with a self-defined function to estimate propensity scores
# Y, X, Xtest, should be included in the inputs
logitReg <- function(Y, X, Xtest, ...){
    X <- as.data.frame(X)
    Xtest <- as.data.frame(Xtest)
    data <- data.frame(Y = Y, X)
    fit <- glm(Y ~ ., data = data, family = "binomial", ...)
    as.numeric(predict(fit, Xtest, type = "response"))
}
obj <- conformalCf(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
                   outfun = "quantRF", psfun = logitReg, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)

lihualei71/cfcausal documentation built on Jan. 7, 2023, 1:34 p.m.

lihualei71/cfcausal index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

lihualei71/cfcausal
Conformal inference of counterfactuals and individual treatment effects

conformalCf: Conformal inference for counterfactuals
In lihualei71/cfcausal: Conformal inference of counterfactuals and individual treatment effects

Conformal inference for counterfactuals

Description

Usage

Arguments

Details

Value

See Also

Examples

Related to conformalCf in lihualei71/cfcausal...

R Package Documentation

Browse R Packages

We want your feedback!

lihualei71/cfcausal Conformal inference of counterfactuals and individual treatment effects

conformalCf: Conformal inference for counterfactuals In lihualei71/cfcausal: Conformal inference of counterfactuals and individual treatment effects

Conformal inference for counterfactuals

Description

Usage

Arguments

Details

Value

See Also

Examples

Related to conformalCf in lihualei71/cfcausal...

R Package Documentation

Browse R Packages

We want your feedback!

lihualei71/cfcausal
Conformal inference of counterfactuals and individual treatment effects

conformalCf: Conformal inference for counterfactuals
In lihualei71/cfcausal: Conformal inference of counterfactuals and individual treatment effects