knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
First, load the clusteredinterference
package
library(clusteredinterference)
Now load a quick data example that's included in the package
data("toy_data")
Estimation is implemented with the policyFX()
function:
suppressWarnings(RNGversion("3.5.0")) ## For backwards compatibility set.seed(1113) causal_fx <- policyFX( data = toy_data, formula = Outcome | Treatment ~ Age + Distance + (1 | Cluster_ID) | Cluster_ID, alphas = c(.3, .5), k_samps = 1 )
The policyFX()
function outputs a "policyFX"
object, which works well with a few methods, including:
summary(causal_fx)
data
A data.frame
. At present, tibble
s are coerced back to standard data.frame
s. I also recommend against using factor
s in the columns.
alphas
A numeric vector of probabilities corresponding the the policies of interest. Each must be between 0 and 1.
k_samps
The user must specify the number of sum-sampled vectors for estimating the counterfactual probabilities (nuisance parameters). It is recommended to choose k_samps <=5
. To avoid the sub-sampling approximation and use all possible vectors, set k_samps=0
.
formula
The formula may be the trickiest, and it has plenty of information. It provides:
outcome | treatment ~ predictors and random intercept | clustering specification
Note that the middle section is passed to glmer()
to fit the mixed effects model, so this is how to specify the modeling formula.
Treatment ~ Age + Distance + (1 | Cluster_ID)
See below for the model output.
verbose = FALSE
root_options = NULL
This is for rootSolve::multiroot()
used in the point estimation procedure. E.g., this will be passed in:
root_options = list(atol = 1e-7)
nAGQ=2
This is for lme4::glmer()
. The default in glmer()
is nAGQ=1
, which indicates a Laplace approximation to the log-likelihood. Instead, in this package the default is nAGQ=2
, which indicates that n=2
Adaptive Gaussian Quadrature points will be used. This is slightly slower but is a more accurate calculation. In limited testing, it seems that nAGQ=2
was almost as accurate as higher values, so 2 was chosen as the default. See their documentation for more details.
return_matrices = FALSE
If TRUE
, this will return the "bread" and "meat" matrices in the variance calculations. The default is FALSE
as these matrices can be quite large.
In the event you're only interested in a subset of contrasts, you can pass a customized grid into the function.
my_grid <- makeTargetGrid(alphas = (3:8)/20, small_grid = TRUE) head(my_grid)
This can be particularly useful for plotting, as you can "turn off" the variance estimates
my_grid$estVar <- FALSE
This is available through the dots argument. Note that when supplying a custom target_grid
, it's not necessary to specify the alphas
argument, as that is taken directly from target_grid
.
causal_fx2 <- policyFX( data = toy_data, formula = Outcome | Treatment ~ Age + Distance + (1 | Cluster_ID) | Cluster_ID, # alphas = c(.3, .5), target_grid = my_grid, k_samps = 5, verbose = FALSE, root_options = list(atol=1e-4) ) print(causal_fx, nrows = 9)
The tidy dataframe estimates
can be easily used for plotting:
plotdat <- causal_fx2$estimates[causal_fx2$estimates$estimand_type=="mu",] plot(x = plotdat$alpha1, y = plotdat$estimate, main = "Estimated Population Means")
As mentioned above, the treatment model is specified via the formula
argument. For example, compare:
# Returns the specified formula, coerced to a Formula object causal_fx$formula # causal_fx$model is a glmerMod S4 object causal_fx$model@call lme4::getME(causal_fx$model, c("beta", "theta"))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.