cate: CATE

View source: R/cate.R

cate R Documentation

CATE

Description

This function estimates heterogeneous treatment effects (HTEs), defined as the conditional average treatment effect (CATE) E(Y^1 - Y^0 | V = v0).

Usage

cate(
  data,
  learner,
  x_names,
  y_name,
  a_name,
  v_names,
  v0,
  mu1.x,
  mu0.x,
  pi.x,
  drl.v,
  drl.x,
  nsplits = 5,
  foldid = NULL,
  univariate_reg = FALSE,
  partial_dependence = FALSE,
  partially_linear = FALSE,
  additive_approx = FALSE,
  variable_importance = FALSE,
  vimp_num_splits = 1,
  bw.stage2 = NULL,
  sample.split.cond.dens = FALSE,
  cond.dens = NULL,
  cate.w = NULL,
  cate.not.j = NULL,
  reg.basis.not.j = NULL,
  pl.dfs = NULL
)

Arguments

data

A data frame containing the dataset.

learner

A character string specifying which learner to use (e.g., "dr").

x_names

A character vector specifying the names of the confounding variables.

y_name

A character string specifying the outcome variable.

a_name

A character string specifying the treatment variable.

v_names

A character vector specifying the names of the effect modifiers.

v0

A matrix of evaluation points, i.e., values of V for which the CATE is estimated (E(Y^1 - Y^0 | V = v0)).
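As an illustration, a grid of evaluation points can be built with one row per point and one column per effect modifier. The modifier names v1 and v2 and the grid values below are hypothetical, and the assumption that columns should follow the order of v_names is not confirmed by this page.

```r
# Hypothetical effect modifiers "v1" and "v2"; names and ranges are illustrative.
grid <- expand.grid(v1 = seq(-1, 1, by = 0.5), v2 = c(0, 1))

# One row per evaluation point, one column per effect modifier.
v0 <- as.matrix(grid)
dim(v0)
```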

mu1.x

A function taking arguments (y, a, x, new.x). It trains a model estimating E(Y | A = 1, X) and returns a list of 3 elements: res, model and fit. res is a vector of predictions of the model evaluated at new.x, model is the model object used to estimate E(Y | A = 1, X) and fit is a function with argument new.x that returns the predictions of the model. See examples.

mu0.x

A function taking arguments (y, a, x, new.x). It trains a model estimating E(Y | A = 0, X) and returns a list of 3 elements: res, model and fit. res is a vector of predictions of the model evaluated at new.x, model is the model object used to estimate E(Y | A = 0, X) and fit is a function with argument new.x that returns the predictions of the model. See examples.
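A minimal sketch of functions matching the mu1.x and mu0.x contract described above, using a linear model. The helper make_mu is hypothetical and not part of the package. The sketch assumes that x and new.x can be coerced to data frames with matching columns, and that the function receives both treatment arms and must subset to the relevant one itself.

```r
# Hypothetical helper: builds an outcome-regression function for one arm.
make_mu <- function(arm) {
  function(y, a, x, new.x) {
    x <- as.data.frame(x)
    # Keep only the units in the requested treatment arm.
    train <- cbind(y = y, x)[a == arm, , drop = FALSE]
    model <- lm(y ~ ., data = train)
    fit <- function(new.x) {
      unname(predict(model, newdata = as.data.frame(new.x)))
    }
    list(res = fit(new.x), model = model, fit = fit)
  }
}

mu1.x <- make_mu(1)  # estimates E(Y | A = 1, X)
mu0.x <- make_mu(0)  # estimates E(Y | A = 0, X)

# Toy check on simulated data.
set.seed(1)
x <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
a <- rbinom(200, 1, 0.5)
y <- 1 + x$x1 + a * (1 + x$x2) + rnorm(200)
out <- mu1.x(y, a, x, new.x = x[1:5, ])
length(out$res)
```

Any regression method can be substituted for lm as long as the returned list keeps the three elements res, model and fit.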

pi.x

A function taking arguments (a, x, new.x). It trains a model estimating P(A = 1 | X) and returns a list of 3 elements: res, model and fit. res is a vector of predictions of the model evaluated at new.x, model is the model object used to estimate P(A = 1 | X) and fit is a function with argument new.x that returns the predictions of the model. See examples.
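A minimal sketch of a function matching the pi.x contract described above, using logistic regression. It assumes that x and new.x can be coerced to data frames with matching columns; the simulated data are purely illustrative.

```r
# Propensity-score sketch: logistic regression of treatment on covariates.
pi.x <- function(a, x, new.x) {
  train <- cbind(a = a, as.data.frame(x))
  model <- glm(a ~ ., data = train, family = binomial())
  fit <- function(new.x) {
    unname(predict(model, newdata = as.data.frame(new.x), type = "response"))
  }
  list(res = fit(new.x), model = model, fit = fit)
}

# Toy check on simulated data.
set.seed(1)
x <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
a <- rbinom(200, 1, plogis(0.5 * x$x1))
out <- pi.x(a, x, new.x = x[1:5, ])
range(out$res)
```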

drl.v

A function taking arguments (pseudo, v, new.v). It trains a model estimating E(Y^1 - Y^0 | V) by regressing a pseudo-outcome pseudo onto v and returns a list of 3 elements: res, model and fit. res is a vector of predictions of the model evaluated at new.v, model is the model object used to estimate E(Y^1 - Y^0 | V) (possibly after model selection) and fit is a function with argument new.v that returns the predictions of the model. See examples.

drl.x

A function taking arguments (pseudo, x, new.x). It trains a model estimating E(Y^1 - Y^0 | X) by regressing a pseudo-outcome pseudo onto x and returns a list of 3 elements: res, model and fit. res is a vector of predictions of the model evaluated at new.x, model is the model object used to estimate E(Y^1 - Y^0 | X) (possibly after model selection) and fit is a function with argument new.x that returns the predictions of the model. See examples.
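A minimal sketch of a second-stage regression function in the shape described for drl.v, using a linear model; a drl.x function would follow the same pattern with x in place of v. The toy data use known nuisance functions and the doubly robust pseudo-outcome of Kennedy (2020). That this is exactly what cate() passes as pseudo when learner = "dr" is an assumption, not something this page confirms.

```r
# Second-stage sketch: regress the pseudo-outcome on the effect modifiers.
drl.v <- function(pseudo, v, new.v) {
  train <- cbind(pseudo = pseudo, as.data.frame(v))
  model <- lm(pseudo ~ ., data = train)
  fit <- function(new.v) {
    unname(predict(model, newdata = as.data.frame(new.v)))
  }
  list(res = fit(new.v), model = model, fit = fit)
}

# Toy data with known nuisance functions (illustrative only).
set.seed(1)
n <- 500
v <- data.frame(v1 = runif(n, -1, 1))
a <- rbinom(n, 1, 0.5)
mu0 <- rep(0, n)
mu1 <- 1 + 2 * v$v1               # true CATE is 1 + 2 * v1
mu.a <- ifelse(a == 1, mu1, mu0)
y <- mu.a + rnorm(n)
ps <- rep(0.5, n)                 # true propensity score

# Doubly robust pseudo-outcome as in Kennedy (2020); assumed here.
pseudo <- (a - ps) / (ps * (1 - ps)) * (y - mu.a) + mu1 - mu0

out <- drl.v(pseudo, v, new.v = data.frame(v1 = c(-0.5, 0, 0.5)))
out$res  # the true CATE at these points is 0, 1, 2
```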

nsplits

An integer indicating the number of splits used for cross-validation. Ignored if foldid is specified.

foldid

An optional vector specifying fold assignments for cross-validation.
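One common way to build fold assignments by hand is sketched below. It assumes that foldid is expected to be an integer vector with one entry per row of data and values 1 to K; the exact format the function requires is not stated on this page.

```r
set.seed(1)
n <- 103        # hypothetical number of rows in data
nsplits <- 5

# Balanced fold labels 1..nsplits, randomly permuted across rows.
foldid <- sample(rep(seq_len(nsplits), length.out = n))
table(foldid)
```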

univariate_reg

A logical indicating whether to perform univariate regression for estimating the CATE as a function of each effect modifier separately (default: FALSE).

partial_dependence

A logical indicating whether to compute partial dependence plots (default: FALSE).

partially_linear

A logical indicating whether to compute partially linear approximations via Robinson's transformation (default: FALSE).

additive_approx

A logical indicating whether to compute the CATE assuming an additive structure (default: FALSE).

variable_importance

A logical indicating whether to compute variable importance measures (default: FALSE).

vimp_num_splits

An integer specifying the number of splits for variable importance computation (default: 1).

bw.stage2

A list of length equal to the number of effect modifiers considered, where each element is a vector of candidate bandwidths for the second-stage regression of the pseudo-outcome onto the corresponding effect modifier, which computes either the univariate CATE or the partial dependence measure (default: NULL). It must be provided if univariate_reg or partial_dependence is set to TRUE.
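For instance, with two hypothetical effect modifiers, a list of candidate bandwidths could be constructed as follows. The modifier names and the bandwidth grid are illustrative assumptions; sensible values depend on the scale of each effect modifier.

```r
v_names <- c("v1", "v2")  # hypothetical effect modifiers

# One vector of candidate bandwidths per effect modifier.
bw.stage2 <- lapply(v_names, function(nm) seq(0.1, 1, length.out = 10))
length(bw.stage2) == length(v_names)
```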

sample.split.cond.dens

A logical indicating whether to do sample-splitting for conditional density estimation (default: FALSE).

cond.dens

A function

cate.w

A function

cate.not.j

A function

reg.basis.not.j

A function

pl.dfs

A list of length equal to the number of effect modifiers considered, where each element is a vector of candidate numbers of basis elements for the partially linear approximation computed via Robinson's transformation (default: NULL).

Value

A list containing the estimated CATE at v0 and per-fold estimates of the CATE at v0 for each learner.

References

Kennedy, E. H. (2020). Optimal Doubly Robust Estimation of Heterogeneous Causal Effects. arXiv preprint arXiv:2004.14497.


matteobonvini/drl.cate documentation built on Nov. 10, 2024, 12:20 a.m.