Prediction with a residual bias correction estimator

Description

This method combines the regression estimator with a residual bias correction to estimate a parametric average dose-response function (ADRF).

Usage

aipwee_est(Y,
           treat,
           covar_formula = ~ 1,
           covar_lin_formula = ~ 1,
           covar_sq_formula = ~ 1,
           data,
           e_treat_1 = NULL,
           e_treat_2 = NULL,
           e_treat_3 = NULL,
           e_treat_4 = NULL,
           degree = 1,
           wt = NULL,
           method = "same",
           spline_df = NULL,
           spline_const = 1,
           spline_linear = 1,
           spline_quad = 1)

Arguments

Y

is the name of the outcome variable contained in data.

treat

is the name of the treatment variable contained in data.

covar_formula

is the formula describing the covariates used to estimate the constant term: ~ X.1 + ..... It can include higher-order terms or interactions, e.g. ~ X.1 + I(X.1^2) + X.1 * X.2 + ..... Don't forget the tilde before listing the covariates.

covar_lin_formula

is the formula describing the covariates used to estimate the linear term, t: ~ X.1 + ..... It can include higher-order terms or interactions, e.g. ~ X.1 + I(X.1^2) + X.1 * X.2 + ..... Don't forget the tilde before listing the covariates.

covar_sq_formula

is the formula describing the covariates used to estimate the quadratic term, t^2: ~ X.1 + ..... It can include higher-order terms or interactions, e.g. ~ X.1 + I(X.1^2) + X.1 * X.2 + ..... Don't forget the tilde before listing the covariates.

data

is a data frame containing Y, treat, and the covariates X.

e_treat_1

is a vector representing the conditional expectation of treat from t_mod.

e_treat_2

is a vector representing the conditional expectation of treat^2 from t_mod.

e_treat_3

is a vector representing the conditional expectation of treat^3 from t_mod.

e_treat_4

is a vector representing the conditional expectation of treat^4 from t_mod.

degree

is 1 for a linear outcome model and 2 for a quadratic outcome model.

wt

is the weight vector used in lsfit for the outcome regression. Default is wt = NULL.

method

is "same" if the same set of covariates is used to estimate the constant, linear, and/or quadratic term. If method = "different", then different sets of covariates can be used to estimate the constant, linear, and/or quadratic term. covar_lin_formula and covar_sq_formula must be specified if method = "different".

spline_df

degrees of freedom. The default, spline_df = NULL, corresponds to no knots.

spline_const

is the number of spline terms needed to estimate the constant term.

spline_linear

is the number of spline terms needed to estimate the linear term.

spline_quad

is the number of spline terms needed to estimate the quadratic term.
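
The e_treat_1 through e_treat_4 arguments are the conditional expectations of treat, treat^2, treat^3, and treat^4 given the covariates. As a minimal sketch of where such vectors could come from, assume the treatment follows a Normal linear model given the covariates, so that these moments have closed forms. The helper normal_moments and the toy data below are hypothetical, use only base R, and are not taken from the package; in practice these vectors come from t_mod.

```r
# Hypothetical helper: first four raw moments of a Normal(mu, sigma^2) variable.
normal_moments <- function(mu, sigma) {
  data.frame(
    m1 = mu,
    m2 = mu^2 + sigma^2,
    m3 = mu^3 + 3 * mu * sigma^2,
    m4 = mu^4 + 6 * mu^2 * sigma^2 + 3 * sigma^4
  )
}

# Toy data (assumed for illustration): one covariate and a continuous treatment.
set.seed(2)
toy <- data.frame(X = rnorm(200))
toy$treat <- 1 + 0.5 * toy$X + rnorm(200)

# Fit the assumed Normal treatment model, then plug in the fitted mean and
# the residual standard deviation.
fit <- lm(treat ~ X, data = toy)
moments <- normal_moments(fitted(fit), summary(fit)$sigma)
head(moments)
```

Under this assumption, each column of moments plays the role of the corresponding e_treat_k vector. As a quick check of the formulas, normal_moments(1, 2) gives 1, 5, 13, and 73.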

Details

This estimator bears a strong resemblance to general regression estimators in the survey literature, which belong to a more general class of calibration estimators (Deville and Sarndal, 1992). It is doubly robust, meaning that it is consistent if either the outcome model (Y-model) or the treatment model (T-model) is correctly specified (Scharfstein, Rotnitzky and Robins, 1999). If the Y-model is correct, then the first term of the estimator (the regression prediction) is unbiased for ξ and the second term (the residual bias correction) has mean zero even if the T-model is wrong. If the Y-model is incorrect, the first term is biased, but the second term gives a consistent estimate of (minus one times) the bias from the Y-model if the T-model is correct.

This function implements a doubly robust estimator: it fits an outcome regression model and adds a bias correction term. For details see Schafer and Galagate (2015).
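
The two-term structure (a regression prediction plus a residual bias correction) can be illustrated in a much simpler setting than the ADRF: estimating a population mean when the outcome is missing for some units. The following is a minimal sketch under that assumption, not the computation performed by aipwee_est; the simulated data, the helper dr_mean, and all variable names are hypothetical.

```r
set.seed(1)
n <- 20000
X <- rnorm(n)
Y <- 2 + X + X^2 + rnorm(n)      # true mean of Y is 2 + 0 + 1 = 3
R <- rbinom(n, 1, plogis(X))     # Y is treated as observed only when R == 1
dat <- data.frame(X = X, Y = Y, R = R)

# Prediction term plus inverse-probability-weighted residual correction.
# Residuals of unobserved units are never used (they are set to zero).
dr_mean <- function(y, r, pred, prob) {
  mean(pred) + mean(ifelse(r == 1, (y - pred) / prob, 0))
}

# Case A: wrong outcome model (intercept only), correct response model.
pred_wrong <- rep(mean(dat$Y[dat$R == 1]), n)
prob_right <- fitted(glm(R ~ X, family = binomial, data = dat))

# Case B: correct outcome model, wrong response model (constant probability).
fit_right  <- lm(Y ~ X + I(X^2), data = dat, subset = R == 1)
pred_right <- predict(fit_right, newdata = dat)
prob_wrong <- rep(mean(dat$R), n)

naive <- mean(pred_wrong)                               # wrong outcome model alone
dr_a  <- dr_mean(dat$Y, dat$R, pred_wrong, prob_right)  # case A
dr_b  <- dr_mean(dat$Y, dat$R, pred_right, prob_wrong)  # case B
c(naive = naive, dr_a = dr_a, dr_b = dr_b)
```

With a sample this large, dr_a and dr_b should both land close to the true mean of 3, while naive, which relies on the wrong outcome model alone, should not.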

Value

aipwee_est returns an object of class "causaldrf_lsfit", a list that contains the following components:

param

parameter estimates for an aipwee fit.

t_mod

the result of the treatment model fit.

out_mod

the result of the outcome model fit.

call

the matched call.

References

Schafer, J.L., Galagate, D.L. (2015). Causal inference with a continuous treatment and outcome: alternative estimators for parametric dose-response models. Manuscript in preparation.

Schafer, Joseph L and Kang, Joseph (2008). Average causal effects from nonrandomized studies: a practical guide and simulated example. Psychological Methods, 13.4, 279.

Robins, James M and Rotnitzky, Andrea (1995). Semiparametric efficiency in multivariate regression models with missing data. Journal of the American Statistical Association, 90.429, 122–129.

Scharfstein, Daniel O and Rotnitzky, Andrea and Robins, James M (1999). Adjusting for nonignorable drop-out using semiparametric nonresponse models. Journal of the American Statistical Association, 94.448, 1096–1120.

Deville, Jean-Claude and Sarndal, Carl-Erik (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87.418, 376–380.

See Also

iptw_est, ismw_est, reg_est, wtrg_est, etc. for other estimates.

t_mod, overlap_fun to prepare the data for use in the different estimates.

Examples

## Example from Schafer and Galagate (2015).

example_data <- sim_data


t_mod_list <- t_mod(treat = T,
              treat_formula = T ~ B.1 + B.2 + B.3 + B.4 + B.5 + B.6 + B.7 + B.8,
              data = example_data,
              treat_mod = "Normal")

cond_exp_data <- t_mod_list$T_data
full_data <- cbind(example_data, cond_exp_data)

aipwee_list <- aipwee_est(Y = Y,
                         treat = T,
                         covar_formula = ~ B.1 + B.2 + B.3 + B.4 + B.5 + B.6 + B.7 + B.8,
                         covar_lin_formula = ~ 1,
                         covar_sq_formula = ~ 1,
                         data = example_data,
                         e_treat_1 = full_data$est_treat,
                         e_treat_2 = full_data$est_treat_sq,
                         e_treat_3 = full_data$est_treat_cube,
                         e_treat_4 = full_data$est_treat_quartic,
                         degree = 1,
                         wt = NULL,
                         method = "same",
                         spline_df = NULL,
                         spline_const = 1,
                         spline_linear = 1,
                         spline_quad = 1)

sample_index <- sample(1:1000, 100)

plot(example_data$T[sample_index],
      example_data$Y[sample_index],
      xlab = "T",
      ylab = "Y",
      main = "aipwee estimate")

abline(aipwee_list$param[1],
        aipwee_list$param[2],
        lty = 2,
        lwd = 2,
        col = "blue")

legend("bottomright",
        "aipwee estimate",
        lty = 2,
        lwd = 2,
        col = "blue",
        bty = "o",   # "o" draws a box around the legend; "n" omits it
        cex = 1)

rm(example_data, t_mod_list, cond_exp_data, full_data, aipwee_list, sample_index)