prop_spline_est: The propensity-spline prediction estimator

prop_spline_est    R Documentation

Description

This method estimates the linear or quadratic parameters of the average dose-response function (ADRF) by fitting a least-squares regression on basis functions built from combinations of the covariates, the propensity spline basis, and the treatment values.

Usage

prop_spline_est(Y,
                treat,
                covar_formula = ~ 1,
                covar_lin_formula = ~ 1,
                covar_sq_formula = ~ 1,
                data,
                e_treat_1 = NULL,
                degree = 1,
                wt = NULL,
                method = "same",
                spline_df = NULL,
                spline_const = 1,
                spline_linear = 1,
                spline_quad = 1)

Arguments

Y

is the name of the outcome variable contained in data.

treat

is the name of the treatment variable contained in data.

covar_formula

is the formula describing the covariates used to estimate the constant term: ~ X.1 + ... It can include higher-order terms or interactions, e.g. ~ X.1 + I(X.1^2) + X.1 * X.2 + ... Don't forget the tilde before listing the covariates.

covar_lin_formula

is the formula describing the covariates used to estimate the linear term, t: ~ X.1 + ... It can include higher-order terms or interactions, e.g. ~ X.1 + I(X.1^2) + X.1 * X.2 + ... Don't forget the tilde before listing the covariates.

covar_sq_formula

is the formula describing the covariates used to estimate the quadratic term, t^2: ~ X.1 + ... It can include higher-order terms or interactions, e.g. ~ X.1 + I(X.1^2) + X.1 * X.2 + ... Don't forget the tilde before listing the covariates.

data

is a data frame containing Y, treat, and the covariates X.

e_treat_1

is a vector containing the conditional expectation of treat, as estimated by t_mod. Alternatively, supply generalized propensity score (GPS) estimates here to build the splines from the GPS values.

degree

is 1 for a linear and 2 for a quadratic outcome model.

wt

is the weight used in lsfit for the outcome regression. The default is wt = NULL.

method

is "same" if the same set of covariates is used to estimate the constant, linear, and/or quadratic terms, with no spline terms. If method = "different", different sets of covariates can be used to estimate the constant, linear, and/or quadratic terms. Spline terms require method = "different", and covar_lin_formula and covar_sq_formula must be specified in that case.

spline_df

is the degrees of freedom of the spline basis. The default, spline_df = NULL, corresponds to no knots.

spline_const

is the number of spline terms to include when estimating the constant term.

spline_linear

is the number of spline terms to include when estimating the linear term.

spline_quad

is the number of spline terms to include when estimating the quadratic term.
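The formula and spline arguments above can be illustrated with base R alone. The following is a minimal sketch, not the package's internal code: it shows how a one-sided formula expands into design-matrix columns and how a degrees-of-freedom setting controls the width of a spline basis. The toy data frame, the e_treat stand-in, and the choice of a natural cubic spline basis (splines::ns) are assumptions made for illustration.

```r
library(splines)  # ships with a standard R installation

set.seed(1)
# Hypothetical toy data; X.1 and X.2 mirror the placeholder names used above.
toy <- data.frame(X.1 = rnorm(20), X.2 = rnorm(20))

# A one-sided formula expands into design-matrix columns.
# X.1 * X.2 is shorthand for X.1 + X.2 + X.1:X.2.
covar_mat <- model.matrix(~ X.1 + I(X.1^2) + X.1 * X.2, data = toy)
colnames(covar_mat)

# Stand-in for e_treat_1 (a fitted conditional mean of the treatment).
e_treat <- 0.5 * toy$X.1 + rnorm(20, sd = 0.1)

# Assumption: a natural cubic spline basis. df sets the number of columns;
# leaving df unset (NULL) uses no interior knots.
basis_df5  <- ns(e_treat, df = 5)
basis_null <- ns(e_treat)
ncol(basis_df5)
ncol(basis_null)
```

Under this assumed basis, spline_const, spline_linear, and spline_quad would correspond to how many of these basis columns enter each part of the outcome model.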

Details

This function estimates the ADRF by the method described in Schafer and Galagate (2015), which fits an outcome model using a function of the covariates and spline basis functions derived from the propensity function component.
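To make the idea concrete, here is a rough base-R sketch of a propensity-spline fit for a linear ADRF (degree = 1). It is an illustration under stated assumptions, not the code in R/prop_spline_est.R: the simulated data, the natural cubic spline basis from splines::ns, the use of spline terms only (no extra covariates) in the outcome model, and the final averaging step are all choices made here for the example. The simulation is set up so that the true ADRF has intercept 2 and slope 1.5.

```r
library(splines)  # part of a standard R installation

set.seed(42)
n <- 2000
x <- rnorm(n)                                    # one hypothetical covariate
treat <- 1 + 0.8 * x + rnorm(n)                  # continuous treatment
y <- 2 + x + (1.5 + 0.5 * x) * treat + rnorm(n)  # simulated ADRF: 2 + 1.5 * t

# Step 1: stand-in for e_treat_1, the fitted conditional mean of treat.
e_treat <- fitted(lm(treat ~ x))

# Step 2: spline basis of the propensity component (assumed natural spline).
s <- ns(e_treat, df = 3)

# Step 3: basis functions for the constant part and for the part
# multiplying the treatment, then a least-squares fit with lsfit.
const_basis <- cbind(1, s)
lin_basis <- const_basis * treat   # each column scaled by the treatment
X <- cbind(const_basis, lin_basis)
colnames(X) <- c(paste0("const_", 0:3), paste0("lin_", 0:3))
fit <- lsfit(X, y, intercept = FALSE)

# Step 4: average the unit-level intercepts and slopes over the sample
# to obtain the two parameters of the linear ADRF.
k <- ncol(const_basis)
beta <- fit$coef
adrf_intercept <- mean(const_basis %*% beta[1:k])
adrf_slope <- mean(const_basis %*% beta[(k + 1):(2 * k)])
c(intercept = adrf_intercept, slope = adrf_slope)
```

For a quadratic model (degree = 2), the same pattern would add a third block of columns scaled by treat^2 and average its coefficients in the same way.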

Value

prop_spline_est returns an object of class "causaldrf_lsfit", a list that contains the following components:

param

the estimated parameters.

out_mod

the result of the outcome model fit using lsfit.

call

the matched call.
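Because out_mod is described as the result of an lsfit call, the usual base-R tools for lsfit output should apply to it. The sketch below uses a stand-in lsfit fit on made-up data (the variable names a and b are hypothetical) to show how coefficients and standard errors can be pulled out of such an object. It does not call prop_spline_est itself.

```r
set.seed(7)
# Hypothetical regressors and outcome, used only to produce an lsfit object.
x <- cbind(a = rnorm(50), b = rnorm(50))
y <- 1 + 2 * x[, "a"] - x[, "b"] + rnorm(50, sd = 0.5)

fit <- lsfit(x, y)        # stand-in for the out_mod component

fit$coef                  # least-squares coefficient estimates
diag_out <- ls.diag(fit)  # regression diagnostics for an lsfit result
diag_out$std.err          # standard errors of the coefficients
diag_out$std.dev          # residual standard deviation
```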

References

Schafer, J.L., Galagate, D.L. (2015). Causal inference with a continuous treatment and outcome: alternative estimators for parametric dose-response models. Manuscript in preparation.

Little, R., An, H. (2004). Robust likelihood-based analysis of multivariate data with missing values. Statistica Sinica, 14, 949–968.

Schafer, J.L., Kang, J. (2008). Average causal effects from nonrandomized studies: a practical guide and simulated example. Psychological Methods, 13(4), 279.

See Also

iptw_est, ismw_est, reg_est, aipwee_est, wtrg_est, etc. for other estimates.

t_mod, overlap_fun to prepare the data for use in the different estimates.

Examples

## Example from Schafer (2015).

example_data <- sim_data

t_mod_list <- t_mod(treat = T,
              treat_formula = T ~ B.1 + B.2 + B.3 + B.4 + B.5 + B.6 + B.7 + B.8,
              data = example_data,
              treat_mod = "Normal")

cond_exp_data <- t_mod_list$T_data
full_data <- cbind(example_data, cond_exp_data)

prop_spline_list <- prop_spline_est(Y = Y,
                            treat = T,
                            covar_formula = ~ B.1 + B.2 + B.3 + B.4 + B.5 + B.6 + B.7 + B.8,
                            covar_lin_formula = ~ 1,
                            covar_sq_formula = ~ 1,
                            data = example_data,
                            e_treat_1 = full_data$est_treat,
                            degree = 1,
                            wt = NULL,
                            method = "different",
                            spline_df = 5,
                            spline_const = 4,
                            spline_linear = 4,
                            spline_quad = 4)

sample_index <- sample(1:1000, 100)

plot(example_data$T[sample_index],
      example_data$Y[sample_index],
      xlab = "T",
      ylab = "Y",
      main = "propensity spline estimate")

abline(prop_spline_list$param[1],
        prop_spline_list$param[2],
        lty = 2,
        col = "blue",
        lwd = 2)

legend('bottomright',
        "propensity spline estimate",
        lty = 2,
        bty = 'o',
        cex = 1,
        col = "blue",
        lwd = 2)

rm(example_data, t_mod_list, cond_exp_data, full_data, prop_spline_list, sample_index)

causaldrf documentation built on Sept. 30, 2022, 1:07 a.m.