hi_est | R Documentation |
This function estimates the GPS function and estimates the ADRF. The GPS score is based on different treatment models. The treatment is linearly related to Xs.
hi_est(Y, treat, treat_formula, outcome_formula, data, grid_val, treat_mod, link_function, ...)
Y |
is the the name of the outcome variable contained in |
treat |
is the name of the treatment variable contained in
|
treat_formula |
an object of class "formula" (or one that can be
coerced to that class) that regresses |
outcome_formula |
is the formula used for fitting the outcome surface. gps is one of the independent variables to use in the outcome_formula. ie.
or a variation
of this. Use |
data |
is a dataframe containing |
grid_val |
contains the treatment values to be evaluated. |
treat_mod |
a description of the error distribution to be used in the
model for treatment. Options include: |
link_function |
For |
... |
additional arguments to be passed to the outcome lm() function. |
Hirano (2004) (HI) introduced this imputation-type method that includes a GPS component. The idea is to fit a parametric observable (outcome) model, which includes the estimated GPS as a covariate, to impute missing potential outcomes.
The method requires several steps. First, a model is used to relate treatment to the recorded covariates. For example, T_i|\textbf{X}_i \sim \mathcal{N}(\textbf{X}_i^T \boldsymbol{β}, σ^2) and then estimate the \boldsymbol{β} parameters. Next, the GPS for each unit is estimated
\hat{R}_i(t) = \frac{1}{√{2 π \hat{σ}^2} } e^{-\frac{(t - \textbf{X}_i^T \boldsymbol{\hat{β}})^2}{2 \hat{σ}^2}}
These GPS estimates are used in the outcome or observable model. The outcome is modeled as a function of T_i and \hat{R}_i parametrically. For example,
E[Y_i | T_i, R_i] = α_0 + α_1 T_i + α_2 T_i^2 + α_3 \hat{R}_i + α_4 \hat{R}_i^2 + α_5 \hat{R}_i\cdot T_i
After collecting the estimated parameters in the outcome and treatment models, plug-in the treatment values into the model to estimate the missing potential outcomes of each individual at that treatment level. For example, if we plug in T_i = t into the estimated models, then each unit will have a potential outcome estimated at treatment level T_i= t.
\hat{Y}_i(t) = \hat{α}_0 + \hat{α}_1 t + \hat{α}_2 t^2 + \hat{α}_3 \hat{R}_i(t) + \hat{α}_4 \hat{R}_i^2(t) + \hat{α}_5 \hat{R}_i(t) \cdot t
The next step is to aggregate these estimated potential outcomes to get an average treatment effect at dose level T_i = t. The mean outcome at dose-level T_i = t is given by:
\hat{μ}(t) = \frac{1}{N}∑_i^N \hat{α}_0 + \hat{α}_1 t + \hat{α}_2 t^2 + \hat{α}_3 \hat{R}_i(t) + \hat{α}_4 \hat{R^2}_i(t) + \hat{α}_5 \hat{R}_i(t) \cdot t
Different treatment levels are plugged into the previous equation to estimate the missing potential outcomes. If many t values are evaluated, then it is possible to trace out an ADRF.
hi_est
returns an object of class "causaldrf",
a list that contains the following components:
param |
parameter estimates for a hi fit. |
t_mod |
the result of the treatment model fit. |
out_mod |
the result of the outcome model fit. |
call |
the matched call. |
Schafer, J.L., Galagate, D.L. (2015). Causal inference with a continuous treatment and outcome: alternative estimators for parametric dose-response models. Manuscript in preparation.
Hirano, Keisuke, Imbens, Guido W (2004). The propensity score with continuous treatments. Applied Bayesian modeling and causal inference from incomplete-data perspectives.
nw_est
, iw_est
, hi_est
, gam_est
,
add_spl_est
,
bart_est
, etc. for other estimates.
t_mod
, overlap_fun
to prepare the data
for use in the different estimates.
## Example from Schafer (2015). example_data <- sim_data hi_list <- hi_est(Y = Y, treat = T, treat_formula = T ~ B.1 + B.2 + B.3 + B.4 + B.5 + B.6 + B.7 + B.8, outcome_formula = Y ~ T + I(T^2) + gps + I(gps^2) + T * gps, data = example_data, grid_val = seq(8, 16, by = 1), treat_mod = "Normal") sample_index <- sample(1:1000, 100) plot(example_data$T[sample_index], example_data$Y[sample_index], xlab = "T", ylab = "Y", main = "hi estimate") lines(seq(8, 16, by = 1), hi_list$param, lty = 2, lwd = 2, col = "blue") legend('bottomright', "hi estimate", lty=2, lwd = 2, col = "blue", bty='Y', cex=1) rm(example_data, hi_list, sample_index) ## Example from van der Wal, Willem M., and Ronald B. Geskus. (2011) #Simulate data with continuous confounder and outcome, binomial exposure. #Marginal causal effect of exposure on outcome: 10. n <- 1000 simdat <- data.frame(l = rnorm(n, 10, 5)) a.lin <- simdat$l - 10 pa <- exp(a.lin)/(1 + exp(a.lin)) simdat$a <- rbinom(n, 1, prob = pa) simdat$y <- 10*simdat$a + 0.5*simdat$l + rnorm(n, -10, 5) simdat[1:5,] temp_hi <- hi_est(Y = y, treat = a, treat_formula = a ~ l, outcome_formula = y ~ gps, data = simdat, grid_val = c(0, 1), treat_mod = "Binomial", link_function = "logit") temp_hi[[1]] # estimated coefficients
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.