tidyhte provides tidy semantics for estimation of heterogeneous treatment effects through the use of Kennedy’s (n.d.) doubly-robust learner.
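To make the idea of a doubly-robust learner concrete, here is a minimal base-R sketch of a doubly-robust pseudo-outcome on simulated data. It is an illustration under stated assumptions, not tidyhte's implementation: the data-generating process, the variable names, and the simple `glm`/`lm` nuisance models are all invented for this example, and the cross-fitting that the learner calls for is omitted for brevity.

```r
# Simulated data (hypothetical): one covariate x, binary treatment a, outcome y.
set.seed(1)
n <- 2000
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * x))          # treatment probability depends on x
y <- 1 + x + a * (1 + 0.5 * x) + rnorm(n)   # true CATE is 1 + 0.5 * x, true ATE is 1
df <- data.frame(x = x, a = a, y = y)

# Nuisance model 1: propensity score P(A = 1 | X).
ps_fit <- glm(a ~ x, family = binomial(), data = df)
pi_hat <- predict(ps_fit, type = "response")

# Nuisance model 2: outcome regression E[Y | A, X], predicted under both arms.
out_fit <- lm(y ~ a * x, data = df)
mu1_hat <- predict(out_fit, newdata = transform(df, a = 1))
mu0_hat <- predict(out_fit, newdata = transform(df, a = 0))
mu_obs <- ifelse(df$a == 1, mu1_hat, mu0_hat)

# Doubly-robust pseudo-outcome: plug-in contrast plus an
# inverse-propensity-weighted residual correction.
df$pseudo <- (df$a - pi_hat) / (pi_hat * (1 - pi_hat)) * (df$y - mu_obs) +
  mu1_hat - mu0_hat

# Averaging the pseudo-outcome estimates the ATE; regressing it on a
# moderator estimates how the effect varies with that moderator.
ate_hat <- mean(df$pseudo)
cate_fit <- lm(pseudo ~ x, data = df)
```

In this sketch `ate_hat` should land near the simulated average effect of 1, and the slope in `cate_fit` near 0.5. A faithful version would fit the nuisance models on held-out splits before forming the pseudo-outcomes.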
The goal of tidyhte is to use a sort of “recipe” design: an analysis is configured once and that configuration is then applied to data. This should make it easy to scale an analysis of HTE from the common single-outcome / single-moderator case to many outcomes and a wide array of moderators. The package is also written to be fairly easy to extend with different models, additional diagnostics, and new ways to output information from a set of HTE estimates.
The best place to start learning how to use tidyhte is the vignettes, which run through example analyses from start to finish: vignette("experimental_analysis") and vignette("observational_analysis"). There is also a writeup summarizing the method and implementation in vignette("methodological_details").
You will be able to install the released version of tidyhte from CRAN with:
install.packages("tidyhte")
However, the CRAN release does not exist yet. In the meantime, install the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("ddimmery/tidyhte")
To set up a simple configuration, it’s straightforward to use the Recipe API:
library(tidyhte)
library(dplyr)

basic_config() %>%
    add_propensity_score_model("SL.glmnet") %>%
    add_outcome_model("SL.glmnet") %>%
    add_moderator("Stratified", x1, x2) %>%
    add_moderator("KernelSmooth", x3) %>%
    add_vimp(sample_splitting = FALSE) -> hte_cfg
The basic_config includes a number of defaults: it starts off the SuperLearner ensembles for both treatment and outcome with linear models ("SL.glm"). With the configuration defined, attach it to a dataset and run the estimation steps in sequence:
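# Attach the configuration and assign splits by unit identifier.
data %>%
    attach_config(hte_cfg) %>%
    make_splits(userid, .num_splits = 12) %>%
    # Fit the nuisance models (outcome and propensity score) on the covariates.
    produce_plugin_estimates(
        outcome_variable,
        treatment_variable,
        covariate1, covariate2, covariate3, covariate4, covariate5, covariate6
    ) %>%
    # Combine the nuisance estimates into doubly-robust pseudo-outcomes.
    construct_pseudo_outcomes(outcome_variable, treatment_variable) -> data

# Estimate the quantities of interest for the chosen moderators.
data %>%
    estimate_QoI(covariate1, covariate2) -> results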
Getting estimated CATEs for a moderator not included previously just requires rerunning the final step with that moderator:
data %>%
    estimate_QoI(covariate3) -> results
Replicating this on a new outcome would be as simple as running the following, with no reconfiguration necessary.
data %>%
    attach_config(hte_cfg) %>%
    produce_plugin_estimates(
        second_outcome_variable,
        treatment_variable,
        covariate1, covariate2, covariate3, covariate4, covariate5, covariate6
    ) %>%
    construct_pseudo_outcomes(second_outcome_variable, treatment_variable) %>%
    estimate_QoI(covariate1, covariate2) -> results
This makes it easy to chain together analyses across many outcomes:
library("foreach")

data %>%
    attach_config(hte_cfg) %>%
    make_splits(userid, .num_splits = 12) -> data

foreach(outcome = list_of_outcomes, .combine = "bind_rows") %do% {
    data %>%
        produce_plugin_estimates(
            outcome,
            treatment_variable,
            covariate1, covariate2, covariate3, covariate4, covariate5, covariate6
        ) %>%
        construct_pseudo_outcomes(outcome, treatment_variable) %>%
        estimate_QoI(covariate1, covariate2) %>%
        mutate(outcome = rlang::as_string(outcome))
}
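The loop above does not show how list_of_outcomes is built. Because each element is passed to rlang::as_string(), one plausible reading is that it is a list of symbols naming outcome columns. The sketch below illustrates that loop-and-bind pattern on a toy data frame using only dplyr and rlang; the data, the column names `y1` and `y2`, and the `summarise()` stand-in for the estimation steps are all hypothetical, and `lapply()` replaces `foreach` only to keep the example dependency-light.

```r
library(dplyr)
library(rlang)

# Toy data with two hypothetical outcome columns.
toy <- tibble(y1 = c(1, 2, 3), y2 = c(10, 20, 30))

# Assumption: outcomes are stored as a list of symbols.
list_of_outcomes <- syms(c("y1", "y2"))

# Run the same (stand-in) analysis for each outcome, label each result
# with the outcome's name, and stack the results into one tibble.
results <- lapply(list_of_outcomes, function(outcome) {
  toy %>%
    summarise(mean_value = mean(!!outcome)) %>%
    mutate(outcome = as_string(outcome))
}) %>%
  bind_rows()
```

Here `results` has one row per outcome, with the `outcome` column recording which symbol produced it. How tidyhte's own functions resolve a loop variable like `outcome` is not covered by this sketch.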
The function estimate_QoI returns results in a tibble format, which makes it easy to manipulate or plot results.
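As an illustration of why a tibble is convenient here, the sketch below adds confidence intervals to a small hand-built table and filters it with ordinary dplyr verbs. The table is a hypothetical stand-in, not actual estimate_QoI output: the column names (`moderator`, `level`, `estimate`, `std_error`) and all values are assumptions made for this example.

```r
library(dplyr)

# Hypothetical stand-in for a results table; names and values are invented.
results <- tibble(
  moderator = c("covariate1", "covariate1", "covariate2", "covariate2"),
  level     = c("low", "high", "low", "high"),
  estimate  = c(0.10, 0.35, -0.05, 0.20),
  std_error = c(0.04, 0.05, 0.06, 0.05)
)

# Add normal-approximation 95% intervals, keep one moderator,
# and sort from the largest estimated effect to the smallest.
summary_tbl <- results %>%
  mutate(
    ci_lower = estimate - qnorm(0.975) * std_error,
    ci_upper = estimate + qnorm(0.975) * std_error
  ) %>%
  filter(moderator == "covariate1") %>%
  arrange(desc(estimate))
```

The same table could be passed directly to a plotting library, with the interval columns used for error bars.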
There are two main ways to get help:
If you have a problem, feel free to open an issue on GitHub. Please try to provide a minimal reproducible example. If that isn’t possible, explain as clearly and simply as you can why that is, along with all of the relevant debugging steps you’ve already taken.
Support for the package will also be provided in the Experimentation Community Discord. You are welcome to come in and get support for your usage in the tidyhte channel. Keep in mind that everyone is volunteering their time to help, so try to come prepared with the debugging steps you’ve already taken.