In randomized, controlled treatment (RCT) studies, some subsets of the population may benefit from an intervention more than others. Some subsets may also be harmed. One can define these outcomes as a local average treatment effect in which the average treatment effect for the subset differs substantially from the average treatment effect for the all study subjects. For example, in a clinical study measuring the effect of a drug on a certain disease, it may be the drug may be especially effective for women of a certain age. Alternatively, such women may experience harmful side effects.
The aim of the subgroupTree
package is to identify subsets of a study subjects that display particularly large positive or large negative average treatment effects. The procedure searches for such subgroups though a greedy tree growing procedure. At each stage, the procedure recursively seeks the single predictor and split leading to the largest observed average treatment effect between the subsets defined by that split.
It is also desirable to conduct statistical tests for any local average treatment effect, typically for a null hypothesis that the average local treatment effect equals 0.0. In this setting, classical inferential procedures break down because the process requires adaptive search and selection. We implement a permutation procedure to provide conservative inference for any detected effects. Currently, this is implemented only for subgroups defined by a split on a single predictor based on a tree stump. Larger trees present computational challenges. Details can be found in Berk et al., Using Recursive Partitioning to Find and Estimate Heterogenous Treatment Effects In Randomized Clinical Trials Journal of Experimental Criminology, forthcoming, 2020.
We will illustrate these ideas through a simple simulation. The data frame
below contains four predictors z1
, x1
, z2
, x2
, a treatment indicator
variable treated
, and a binary response response
that is conditioned on each
of these variables. By construction, the the expected values of response
is
0.9 for treated individuals for which z1 %in% c('A', 'B') & x1 > 0.5
, and
is 0.5 otherwise. In other words, this data set contains a subset that
displays an average treatment effect of 0.4, and we hope to detect this subset
with our procedure.
set.seed(1234) n = 2000 df = data.frame(z1 = sample(c('A', 'B', 'C', 'D'), n, replace = TRUE), x1 = runif(n), z2 = sample(c('a', 'b', 'c', 'd'), n, replace = TRUE), x2 = runif(n), treated = sample(c(FALSE, TRUE), n, replace = TRUE)) high_ate = df$z1 %in% c('A', 'B') & df$x1 > 0.5 df$response = rbinom(n, 1, prob = ifelse(high_ate & df$treated, 0.9, 0.5))
The subgroup_tree
function implements our greedy search procedure and relies
heavily on the rpart
package. In the call below, we specify that we
intend to search for large treatment effects by specifying direction = 'max'
,
and we pass in two parameters that govern the tree-growing process. Specifically,
we search for trees of depth two, and we consider only splits of nodes that
contain 100 or more observations.
library(subgroupTree) max_tree = subgroup_tree(df$response, df$treated, df[,1:4], direction = 'max', maxdepth = 2, minbucket = 100) print(max_tree)
Each line of the output provides information about the structure of the tree
and the average treatment effect computed at each node. We can see from line 7
that the tree detects an average treatment effect of 0.422 among units for
which z1 == B & x1>=0.5326
. Alternatively, it can be
more convenient to view the same output in graphical form.
library(rpart.plot) rpart.plot(max_tree, extra=1)
For illustration, we also demonstrate the effect of changing direction = 'min'
.
In this case, we are directing the tree to search for subgroups with small
treatment effects (i.e. negative, but large in absolute value).
min_tree = subgroup_tree(df$response, df$treated, X = df[,1:4], direction = 'min', maxdepth = 1, minbucket = 100) print(min_tree)
In the tree above, we detect a treatment effect among individuals for which
x1 < 0.055
. Of course, by construction the population treatment effects
in this group is zero. We illustrate our inferential procedure to confirm this.
Each iteration of our permutation procedure shuffles the response and calculates
the largest possible treatment effect t-value that can result splitting the data
into two groups. At the moment, our implementation only handles single splits
of the data for computational tractability.
# sampled t-values perm_t_score = subgroup_perm_test(df$response, df$treated, df[,1:4], direction = 'min', ate = 0.0, minbucket = 100) # t-value for different in subgroup = (df$x1 < 0.055) t_score = with(df, t.test(response[treated & subgroup], response[!treated & subgroup], alternative = 'less')) # permutation p-value mean(perm_t_score <= t_score$statistic)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.