# hi_est: The Hirano and Imbens estimator In causaldrf: Tools for Estimating Causal Dose Response Functions

## Description

This function estimates the GPS function and estimates the ADRF. The GPS score is based on different treatment models. The treatment is linearly related to Xs.

## Usage

 1 2 3 4 5 6 7 8 9 hi_est(Y, treat, treat_formula, outcome_formula, data, grid_val, treat_mod, link_function, ...) 

## Arguments

 Y is the the name of the outcome variable contained in data. treat is the name of the treatment variable contained in data. treat_formula an object of class "formula" (or one that can be coerced to that class) that regresses treat on a linear combination of X: a symbolic description of the model to be fitted. outcome_formula is the formula used for fitting the outcome surface. gps is one of the independent variables to use in the outcome_formula. ie. Y ~ treat+ I(treat^2) + gps + I(gps^2) + treat * gps or a variation of this. Use gps as the name of the variable representing the gps in outcome_formula. data is a dataframe containing Y, treat, and X. grid_val contains the treatment values to be evaluated. treat_mod a description of the error distribution to be used in the model for treatment. Options include: "Normal" for normal model, "LogNormal" for lognormal model, "Sqrt" for square-root transformation to a normal treatment, "Poisson" for Poisson model, "NegBinom" for negative binomial model, "Gamma" for gamma model, "Binomial" for binomial model. link_function For treat_mod = "Gamma" (fitted using glm) alternatives are "log" or "inverse". For treat_mod = "Binomial" (fitted using glm) alternatives are "logit", "probit", "cauchit", "log" and "cloglog". ... additional arguments to be passed to the outcome lm() function.

## Details

Hirano (2004) (HI) introduced this imputation-type method that includes a GPS component. The idea is to fit a parametric observable (outcome) model, which includes the estimated GPS as a covariate, to impute missing potential outcomes.

The method requires several steps. First, a model is used to relate treatment to the recorded covariates. For example, T_i|\textbf{X}_i \sim \mathcal{N}(\textbf{X}_i^T \boldsymbol{β}, σ^2) and then estimate the \boldsymbol{β} parameters. Next, the GPS for each unit is estimated

\hat{R}_i(t) = \frac{1}{√{2 π \hat{σ}^2} } e^{-\frac{(t - \textbf{X}_i^T \boldsymbol{\hat{β}})^2}{2 \hat{σ}^2}}

These GPS estimates are used in the outcome or observable model. The outcome is modeled as a function of T_i and \hat{R}_i parametrically. For example,

E[Y_i | T_i, R_i] = α_0 + α_1 T_i + α_2 T_i^2 + α_3 \hat{R}_i + α_4 \hat{R}_i^2 + α_5 \hat{R}_i\cdot T_i

After collecting the estimated parameters in the outcome and treatment models, plug-in the treatment values into the model to estimate the missing potential outcomes of each individual at that treatment level. For example, if we plug in T_i = t into the estimated models, then each unit will have a potential outcome estimated at treatment level T_i= t.

\hat{Y}_i(t) = \hat{α}_0 + \hat{α}_1 t + \hat{α}_2 t^2 + \hat{α}_3 \hat{R}_i(t) + \hat{α}_4 \hat{R}_i^2(t) + \hat{α}_5 \hat{R}_i(t) \cdot t

The next step is to aggregate these estimated potential outcomes to get an average treatment effect at dose level T_i = t. The mean outcome at dose-level T_i = t is given by:

\hat{μ}(t) = \frac{1}{N}∑_i^N \hat{α}_0 + \hat{α}_1 t + \hat{α}_2 t^2 + \hat{α}_3 \hat{R}_i(t) + \hat{α}_4 \hat{R^2}_i(t) + \hat{α}_5 \hat{R}_i(t) \cdot t

Different treatment levels are plugged into the previous equation to estimate the missing potential outcomes. If many t values are evaluated, then it is possible to trace out an ADRF.

## Value

hi_est returns an object of class "causaldrf", a list that contains the following components:

 param parameter estimates for a hi fit. t_mod the result of the treatment model fit. out_mod the result of the outcome model fit. call the matched call.

## References

Schafer, J.L., Galagate, D.L. (2015). Causal inference with a continuous treatment and outcome: alternative estimators for parametric dose-response models. Manuscript in preparation.

Hirano, Keisuke, Imbens, Guido W (2004). The propensity score with continuous treatments. Applied Bayesian modeling and causal inference from incomplete-data perspectives.

nw_est, iw_est, hi_est, gam_est, add_spl_est, bart_est, etc. for other estimates.

t_mod, overlap_fun to prepare the data for use in the different estimates.

## Examples

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 ## Example from Schafer (2015). example_data <- sim_data hi_list <- hi_est(Y = Y, treat = T, treat_formula = T ~ B.1 + B.2 + B.3 + B.4 + B.5 + B.6 + B.7 + B.8, outcome_formula = Y ~ T + I(T^2) + gps + I(gps^2) + T * gps, data = example_data, grid_val = seq(8, 16, by = 1), treat_mod = "Normal") sample_index <- sample(1:1000, 100) plot(example_data$T[sample_index], example_data$Y[sample_index], xlab = "T", ylab = "Y", main = "hi estimate") lines(seq(8, 16, by = 1), hi_list$param, lty = 2, lwd = 2, col = "blue") legend('bottomright', "hi estimate", lty=2, lwd = 2, col = "blue", bty='Y', cex=1) rm(example_data, hi_list, sample_index) ## Example from van der Wal, Willem M., and Ronald B. Geskus. (2011) #Simulate data with continuous confounder and outcome, binomial exposure. #Marginal causal effect of exposure on outcome: 10. n <- 1000 simdat <- data.frame(l = rnorm(n, 10, 5)) a.lin <- simdat$l - 10 pa <- exp(a.lin)/(1 + exp(a.lin)) simdat$a <- rbinom(n, 1, prob = pa) simdat$y <- 10*simdat$a + 0.5*simdat$l + rnorm(n, -10, 5) simdat[1:5,] temp_hi <- hi_est(Y = y, treat = a, treat_formula = a ~ l, outcome_formula = y ~ gps, data = simdat, grid_val = c(0, 1), treat_mod = "Binomial", link_function = "logit") temp_hi[[1]] # estimated coefficients 

### Example output

          l a          y
1  5.115801 0  -9.614258
2  4.491812 0 -13.284944
3 13.194585 1   2.881462
4 13.327736 1   3.059533
5 19.764903 1  16.227761
[1] -7.442739  7.318356


causaldrf documentation built on May 2, 2019, 5:14 a.m.