prop_spline_est: The propensity-spline prediction estimator
In causaldrf: Estimating Causal Dose Response Functions

Description

This method estimates the linear or quadratic parameters of the average dose-response function (ADRF) by fitting a least-squares regression on basis functions built from combinations of the covariates, the propensity spline basis, and the treatment values.

Usage

prop_spline_est(Y,
                treat,
                covar_formula = ~ 1,
                covar_lin_formula = ~ 1,
                covar_sq_formula = ~ 1,
                data,
                e_treat_1 = NULL,
                degree = 1,
                wt = NULL,
                method = "same",
                spline_df = NULL,
                spline_const = 1,
                spline_linear = 1,
                spline_quad = 1)

Arguments

`Y`	is the name of the outcome variable contained in `data`.
`treat`	is the name of the treatment variable contained in `data`.
`covar_formula`	is the formula describing the covariates needed to estimate the constant term: `~ X.1 + ....`. It can include higher-order terms or interactions, e.g. `~ X.1 + I(X.1^2) + X.1 * X.2 + ....`. Don't forget the tilde before listing the covariates.
`covar_lin_formula`	is the formula describing the covariates needed to estimate the linear term, t: `~ X.1 + ....`. It can include higher-order terms or interactions, e.g. `~ X.1 + I(X.1^2) + X.1 * X.2 + ....`. Don't forget the tilde before listing the covariates.
`covar_sq_formula`	is the formula describing the covariates needed to estimate the quadratic term, t^2: `~ X.1 + ....`. It can include higher-order terms or interactions, e.g. `~ X.1 + I(X.1^2) + X.1 * X.2 + ....`. Don't forget the tilde before listing the covariates.
`data`	is a data frame containing `Y`, `treat`, and the covariates `X`.
`e_treat_1`	is a vector representing the conditional expectation of `treat` from `t_mod`. Alternatively, supply generalized propensity score (GPS) estimates here to build the splines from the GPS values.
`degree`	is 1 for a linear and 2 for a quadratic outcome model.
`wt`	is the weight vector used in `lsfit` for the outcome regression. Default is `wt = NULL`.
`method`	is "same" if the same set of covariates is used to estimate the constant, linear, and/or quadratic terms, with no spline terms. If `method = "different"`, different sets of covariates can be used for the constant, linear, and/or quadratic terms. Spline terms require `method = "different"`, in which case `covar_lin_formula` and `covar_sq_formula` must also be specified.
`spline_df`	degrees of freedom. The default, spline_df = NULL, corresponds to no knots.
`spline_const`	is the number of spline terms to include when estimating the constant term.
`spline_linear`	is the number of spline terms to include when estimating the linear term.
`spline_quad`	is the number of spline terms to include when estimating the quadratic term.
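
The formula arguments are one-sided R formulas. As a rough illustration of how such a formula expands into design columns, the sketch below uses base R's `model.matrix` on a toy data frame. The data and the covariate names `X.1` and `X.2` are hypothetical, and whether `prop_spline_est` builds its design matrix in exactly this way is an assumption.

```r
# Toy data; X.1 and X.2 are hypothetical covariate names.
toy <- data.frame(X.1 = c(1, 2, 3, 4),
                  X.2 = c(0.5, 1.5, 0.2, 2.0))

# One-sided formula with a squared term and an interaction.
# X.1 * X.2 is shorthand for X.1 + X.2 + X.1:X.2.
covar_formula <- ~ X.1 + I(X.1^2) + X.1 * X.2

# Expand the formula into a numeric design matrix.
design <- model.matrix(covar_formula, data = toy)
print(colnames(design))

# The default formula ~ 1 yields an intercept column only.
intercept_only <- model.matrix(~ 1, data = toy)
print(dim(intercept_only))
```

Note that `I()` is needed around `X.1^2` because `^` has a special meaning inside formulas.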

Details

This function estimates the ADRF using the method described in Schafer and Galagate (2015), which fits an outcome model using a function of the covariates and spline basis functions derived from the propensity function component.
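
As a rough sketch of the idea (not the package's internal code), the steps can be imitated in base R on simulated data. The variable names, the simulated model, and the choice of `splines::ns` with `df = 3` are assumptions made purely for illustration.

```r
library(splines)  # ships with base R

set.seed(1)
n <- 200
x <- rnorm(n)                      # single hypothetical covariate
treat <- 0.5 * x + rnorm(n)        # continuous treatment depends on x
y <- 1 + 2 * treat + x + rnorm(n)  # outcome; the simulated slope in treat is 2

# Step 1: estimate the conditional expectation of treatment given x.
e_treat <- fitted(lm(treat ~ x))

# Step 2: build a natural cubic spline basis in the estimated expectation.
spline_basis <- ns(e_treat, df = 3)

# Step 3: least-squares fit of the outcome on treatment plus spline terms.
design <- cbind(treat = treat, spline_basis)
fit <- lsfit(design, y)

# Estimated slope in treat after adjusting for the spline terms.
slope <- unname(fit$coef["treat"])
print(slope)
```

In this simulation the spline terms absorb the confounding that runs through `x`, so the slope estimate should land near the simulated value of 2.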

Value

prop_spline_est returns an object of class "causaldrf_lsfit", a list that contains the following components:

`param`	the estimated parameters.
`out_mod`	the result of the outcome model fit using lsfit.
`call`	the matched call.
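
Since `out_mod` is described as an `lsfit` result, the usual base R tools for `lsfit` objects should apply to it. The sketch below uses a stand-in fit on simulated data rather than an actual `prop_spline_est` result; the data and the object name `out_mod` are used here for illustration only.

```r
set.seed(2)
x <- cbind(treat = rnorm(50))          # one-column design matrix
y <- 1 + 2 * x[, "treat"] + rnorm(50)  # simulated outcome

out_mod <- lsfit(x, y)  # stand-in for the out_mod component

print(names(out_mod))   # components returned by lsfit
print(out_mod$coef)     # intercept and slope estimates
print(head(out_mod$residuals))

# Coefficient table with standard errors, similar to summary(lm(...)).
ls.print(out_mod)
```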

References

Schafer, J.L., Galagate, D.L. (2015). Causal inference with a continuous treatment and outcome: alternative estimators for parametric dose-response models. Manuscript in preparation.

Little, R., An, H. (2004). Robust likelihood-based analysis of multivariate data with missing values. Statistica Sinica, 14, 949–968.

Schafer, J.L., Kang, J. (2008). Average causal effects from nonrandomized studies: a practical guide and simulated example. Psychological Methods, 13(4), 279.

Examples

## Example from Schafer and Galagate (2015).
library(causaldrf)

example_data <- sim_data

t_mod_list <- t_mod(treat = T,
              treat_formula = T ~ B.1 + B.2 + B.3 + B.4 + B.5 + B.6 + B.7 + B.8,
              data = example_data,
              treat_mod = "Normal")

cond_exp_data <- t_mod_list$T_data
full_data <- cbind(example_data, cond_exp_data)

prop_spline_list <- prop_spline_est(Y = Y,
                            treat = T,
                            covar_formula = ~ B.1 + B.2 + B.3 + B.4 + B.5 + B.6 + B.7 + B.8,
                            covar_lin_formula = ~ 1,
                            covar_sq_formula = ~ 1,
                            data = example_data,
                            e_treat_1 = full_data$est_treat,
                            degree = 1,
                            wt = NULL,
                            method = "different",
                            spline_df = 5,
                            spline_const = 4,
                            spline_linear = 4,
                            spline_quad = 4)

## Plot a random subset of 100 observations.
sample_index <- sample(seq_len(nrow(example_data)), 100)

plot(example_data$T[sample_index],
      example_data$Y[sample_index],
      xlab = "T",
      ylab = "Y",
      main = "propensity spline estimate")

abline(prop_spline_list$param[1],
        prop_spline_list$param[2],
        lty = 2,
        col = "blue",
        lwd = 2)

legend('bottomright',
        "propensity spline estimate",
        lty = 2,
        bty = 'o',
        cex = 1,
        col = "blue",
        lwd = 2)

rm(example_data, t_mod_list, cond_exp_data, full_data, prop_spline_list, sample_index)

causaldrf documentation built on Sept. 30, 2022, 1:07 a.m.