In alexluedtke12/sg: Targeted Learning for Subgroup Analyses

sg.SL

R Documentation

SuperLearner for Estimating the Conditional Average Treatment Effect

Description

This function estimates, conditional on baseline covariates, the average additive effect of assigning each treatment of interest, compared with assigning treatment at random according to the treatment probabilities seen in the observed population.

Usage

sg.SL(W, A, Y, SL.library, Delta = rep(1,length(A)), OR.SL.library = SL.library,
  prop.SL.library = SL.library, missingness.SL.library = SL.library, txs = c(0, 1), g0 = NULL,
  Q0 = NULL, family = binomial(), num.SL.folds = 10, num.SL.rep = 5,
  SL.method = "method.NNLS2", id = NULL, obsWeights = NULL,
  stratifyCV = FALSE, lib.ests = FALSE, ...)

Arguments

`W`	data frame with observations in the rows and baseline covariates used to form the subgroup in columns.
`A`	numeric treatment vector. Treatments of interest specified using the `txs` argument.
`Y`	real-valued outcome.
`SL.library`	SuperLearner library (see documentation for `SuperLearner` in the corresponding package) used to estimate the conditional average treatment effect functions.
`Delta`	Vector of the same length as `Y`. An entry should equal 1 if the corresponding entry in `Y` is observed, and should equal 0 if the corresponding entry in `Y` is to be treated as missing.
`OR.SL.library`	SuperLearner library (see documentation for `SuperLearner` in the corresponding package) used to estimate the outcome regressions.
`prop.SL.library`	SuperLearner library (see documentation for `SuperLearner` in the corresponding package) used to estimate the propensity scores.
`missingness.SL.library`	SuperLearner library (see documentation for `SuperLearner` in the corresponding package) used to estimate the probability of having a missing outcome given treatment and covariates.
`txs`	A vector indicating the two or more treatments of interest in `A` that will be used for the treatment assignment problem. The treatments in `A` may be a superset of those in `txs`.
`g0`	if known (as in a randomized controlled trial), a matrix whose `k`th column contains the probability of receiving the treatment corresponding to entry `k` in `txs`, given covariates. Rows correspond to individuals with (`W`,`A`,`Y`) observed. If `NULL`, `SuperLearner` will be used to estimate these probabilities.
`Q0`	a user-supplied matrix of estimates of the mean of `Y` conditional on `A` and `W`. The matrix should have `n=nrow(W)` rows and `length(txs)` columns, where row `j` and column `k` contain the estimated outcome regression for the covariate level of individual `j` at treatment level `txs[k]`.
`family`	`binomial()` if outcome bounded in [0,1], or `gaussian()` otherwise. See `Details`.
`num.SL.folds`	number of folds to use in SuperLearner.
`num.SL.rep`	number of super-learner fits to average; the final output is the average of `num.SL.rep` fits (repetition reduces reliance on the initial choice of folds).
`SL.method`	method that the SuperLearner function uses to select a convex combination of learners
`id`	optional cluster identification variable
`obsWeights`	observation weights
`stratifyCV`	stratify validation folds by event counts (applied when estimating the outcome regression, the treatment mechanism, and the conditional average treatment effect function). Useful for rare outcomes.
`lib.ests`	Also return the candidate optimal rule estimates in the super-learner library
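
The matrix-valued arguments are easy to get wrong, so here is a minimal base-R sketch of how `Delta`, `g0`, and `Q0` might be assembled for a two-arm trial with 1:1 randomization. The simulated data, the `glm` working model, and the zero placeholder for unobserved outcomes (`Y.filled`) are illustrative assumptions, not behaviour documented by the package. The final `sg.SL` call is left commented out because it requires the package to be installed.

```r
set.seed(1)
n   <- 500
txs <- c(0, 1)
W   <- data.frame(W1 = rnorm(n), W2 = rnorm(n))
A   <- rbinom(n, 1, 1/2)
Y   <- rbinom(n, 1, plogis(A * W$W1))

# Delta: 1 where the outcome is observed, 0 where it is treated as missing.
Y[sample(n, 50)] <- NA                 # assumption: 50 outcomes are unobserved
Delta    <- as.numeric(!is.na(Y))
Y.filled <- ifelse(is.na(Y), 0, Y)     # placeholder value (assumption)

# g0: n x length(txs) matrix; column k holds P(A = txs[k] | W).
# Under 1:1 randomization every entry is 1/2.
g0 <- matrix(1/2, nrow = n, ncol = length(txs))

# Q0: n x length(txs) matrix; column k holds the estimated mean outcome
# at treatment txs[k]. A simple logistic working model is assumed here,
# fit on the rows with an observed outcome.
dat <- data.frame(Y = Y, A = A, W)
fit <- glm(Y ~ A * W1 + W2, family = binomial(), data = dat[Delta == 1, ])
Q0  <- sapply(txs, function(a) {
  predict(fit, newdata = data.frame(A = a, W), type = "response")
})

dim(g0)
dim(Q0)

# Hypothetical call, not run here:
# out <- sg.SL(W, A, Y.filled, SL.library = c("SL.mean", "SL.glm"),
#              Delta = Delta, txs = txs, g0 = g0, Q0 = Q0,
#              family = binomial())
```

Both matrices have one row per individual and one column per entry of `txs`, in the same order as `txs`.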

Details

If the outcome is bounded in [0,1], then this function respects that fact when estimating the outcome regression, but not when estimating the conditional average treatment effect using the double robust loss presented in the paper cited below.
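
To give a feel for what a doubly robust construction looks like, the sketch below builds a generic augmented inverse-probability-weighted (AIPW) pseudo-outcome for the contrast described under Value: treatment 1 versus a treatment drawn from the observed distribution. This is an illustrative assumption and not necessarily the exact loss used in the cited paper or inside `sg.SL`. The names `g1`, `Q1`, and `pseudo` are hypothetical, and the true outcome regression is plugged in for simplicity.

```r
set.seed(2)
n    <- 5000
W1   <- rnorm(n)
A    <- rbinom(n, 1, 1/2)
Qbar <- function(a, w1) plogis(a * w1)
Y    <- rbinom(n, 1, Qbar(A, W1))

g1 <- rep(1/2, n)      # known P(A = 1 | W), as in a randomized trial
Q1 <- Qbar(1, W1)      # outcome regression at A = 1 (truth used here)

# AIPW pseudo-outcome for E[Y | A = 1, W] - E[Y | W]:
# the first two terms target E[Y | A = 1, W]; subtracting Y targets E[Y | W].
pseudo <- (A == 1) / g1 * (Y - Q1) + Q1 - Y

# True contrast under 1:1 randomization, for comparison
truth <- Qbar(1, W1) - (0.5 * Qbar(0, W1) + 0.5 * Qbar(1, W1))

c(mean.pseudo = mean(pseudo), mean.truth = mean(truth))
```

Regressing such a pseudo-outcome on the covariates with squared-error loss gives an estimate of the contrast. Because the inverse-probability weight can be large when a treatment probability is small, the pseudo-outcome need not stay within the range implied by an outcome in [0,1], which is consistent with the caveat above.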

Value

a list containing

`est`	Vector containing an estimate of the conditional average treatment effect function for each individual in the data set (conditional on the covariate strata they belong to). Here the conditional average treatment effect is defined as the difference in conditional mean outcome if receiving the treatment in `txs` versus the expected outcome for a treatment randomly drawn according to the observed distribution (conditional on covariates).
`SL.cate.fun`	A function that takes as input covariates (as a matrix) and returns a matrix of conditional average treatment effects (estimated by SuperLearner) with rows corresponding to the different covariate values in the rows of W and columns corresponding to the different treatments.
`SL`	a list of lists of `SuperLearner` objects used to generate these estimates. Each entry in the outer list corresponds to a treatment in txs. Each entry in the inner list corresponds to one of the num.SL.rep repetitions.

If `lib.ests` is `TRUE`, then this list also contains:

`lib.ests`	a list with entries corresponding to learners in `SL.library`. Each entry is of the same format as `est`.
`lib.cate.fun`	A function that takes as input covariates and returns a list with entries corresponding to the learners in `SL.library`. Each entry is of the same format as the output of `SL.cate.fun`.
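
Assuming the estimates form a matrix with one column per entry of `txs` (as the indexing `cate.est[,2]` in the Examples suggests), and assuming larger outcomes are preferable, a treatment rule can be read off by picking the treatment with the largest estimated effect in each row. The matrix below is made-up toy input, not package output.

```r
txs <- c(0, 1)

# Toy matrix of estimated effects: rows are individuals, columns follow txs
est <- cbind(c(-0.10,  0.20, 0.00),
             c( 0.10, -0.20, 0.00))

# For each row, take the column with the largest estimate;
# ties go to the first treatment listed in txs.
rule <- txs[max.col(est, ties.method = "first")]
rule
```

The same indexing would apply to the matrix returned by `SL.cate.fun` for new covariate values, under the same column-order assumption.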

References

A. R. Luedtke and M. J. van der Laan, “Super-learning of an optimal dynamic treatment rule,” International Journal of Biostatistics (to appear), 2014.

Examples

# load the packages (assumes sg and SuperLearner are installed)
library(SuperLearner)
library(sg)

# SuperLearner library
SL.library = c('SL.mean','SL.glm')

# simulated data
Qbar = function(a,w){plogis(a*w$W1)}
n = 1000
W = data.frame(W1=rnorm(n),W2=rnorm(n),W3=rnorm(n),W4=rnorm(n))
A = rbinom(n,1,1/2)
Y = rbinom(n,1,Qbar(A,W))
txs = c(0,1)

# sg.SL fit
out = sg.SL(W,A,Y,SL.library=SL.library,txs=txs,family=binomial())

# CATE estimate
cate.est = out$est
plot(W$W1,cate.est[,2])

# can also call predict to get predictions
predict(out,data.frame(W1=0,W2=0,W3=0,W4=0))

# compare to the truth
EYw = 0.5*Qbar(0,W)+0.5*Qbar(1,W)
cate.truth1 = Qbar(1,W) - EYw
plot(cate.est[,2],cate.truth1)

alexluedtke12/sg documentation built on May 24, 2023, 6:36 a.m.