lmw_est: Estimate a treatment effect from a linear model

View source: R/lmw_est.R

lmw_est    R Documentation

Estimate a treatment effect from a linear model

Description

lmw_est() fits the outcome regression corresponding to the model used to compute the weights in the supplied lmw object and returns the model coefficients and their covariance matrix. Use summary.lmw_est() to compute and view the treatment effect and potential outcome mean estimates and their standard errors.

Usage

lmw_est(x, ...)

## S3 method for class 'lmw'
lmw_est(x, outcome, data = NULL, robust = TRUE, cluster = NULL, ...)

## S3 method for class 'lmw_aipw'
lmw_est(x, outcome, data = NULL, robust = TRUE, cluster = NULL, ...)

## S3 method for class 'lmw_iv'
lmw_est(x, outcome, data = NULL, robust = TRUE, cluster = NULL, ...)

Arguments

x

an lmw or lmw_iv object; the output of a call to lmw() or lmw_iv().

...

other arguments passed to sandwich::vcovHC() or sandwich::vcovCL().

outcome

the name of the outcome variable. Can be supplied as a string containing the name of the outcome variable or as the outcome variable itself. If not supplied, the outcome variable in the formula supplied to lmw() or lmw_iv(), if any, will be used.

data

an optional data frame containing the outcome variable named in outcome and the cluster variable(s) when cluster is supplied as a formula.

robust

whether to compute the robust covariance matrix for the model coefficients. Allowable values include those allowed for the type argument of sandwich::vcovHC(), or of sandwich::vcovCL() when cluster is specified. Can also be specified as TRUE (the default), which means "HC3" (or "HC1" when cluster is specified), or FALSE, which means "const" (i.e., the standard non-robust covariance). If cluster is specified and robust is FALSE, robust will be set to TRUE. When AIPW is used, robust is ignored; the HC0 robust covariance matrix is used.

cluster

the clustering variable(s) for computing a cluster-robust covariance matrix. See sandwich::vcovCL(). If supplied as a formula, the clustering variables must be present in the original dataset used to compute the weights or in data. When AIPW is used, cluster is ignored.
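
As a rough illustration of what the values of robust refer to, the following base-R sketch computes the "const", "HC0", and "HC3" covariance matrices by hand for an ordinary lm() fit. It uses neither lmw nor sandwich; the simulated data and all object names are made up for illustration, and lmw_est() itself obtains its covariance matrix from sandwich::vcovHC() or sandwich::vcovCL().

```r
# Simulated heteroskedastic data (hypothetical, for illustration only)
set.seed(7)
n <- 100
x <- rnorm(n)
y <- 1 + x + rnorm(n, sd = exp(0.5 * x))

fit <- lm(y ~ x)
X <- model.matrix(fit)
e <- residuals(fit)
h <- hatvalues(fit)

# (X'X)^{-1}, the "bread" of the sandwich
bread <- solve(crossprod(X))

# "const": conventional covariance, sigma^2 (X'X)^{-1}
vc_const <- bread * sum(e^2) / df.residual(fit)

# "HC0": meat is X' diag(e^2) X
vc_hc0 <- bread %*% crossprod(X * e) %*% bread

# "HC3": squared residuals inflated by 1 / (1 - h)^2
vc_hc3 <- bread %*% crossprod(X * e / (1 - h)) %*% bread

# Standard errors under each choice
sqrt(diag(vc_const))
sqrt(diag(vc_hc0))
sqrt(diag(vc_hc3))
```

Because each squared residual is divided by (1 - h)^2, the HC3 standard errors are never smaller than the HC0 ones.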

Details

lmw_est() uses lm.fit() or lm.wfit() to fit the outcome regression model (and first stage model for lmw_iv objects) and returns the output of these functions augmented with other components related to the estimation of the weights. Unlike with lm.fit() and lm.wfit(), the covariance matrix of the parameter estimates is also included in the output.

For lmw objects, the model fit is that supplied to the formula input to lmw() except that it is fit in a dataset appropriately centered to ensure the estimand corresponds with the one requested. When method = "MRI" in the call to lmw(), the model is fit as an interaction between the treatment and all the (centered) terms in the model formula. The results will be similar to those from using lm() on this model and supplied data except that the covariates are centered beforehand. The product of the sampling weights and base weights supplied to lmw(), if any, will be supplied to lm.wfit() to fit the model using weighted least squares.
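
The effect of centering in an interaction (MRI-style) model can be seen without lmw. The sketch below uses simulated data and base R only; the variable names are hypothetical and this is not the package's internal code. It checks that, after centering the covariate at the mean of the target population, the coefficient on the treatment equals the average difference in predicted potential outcomes in that population.

```r
# Simulated data (hypothetical): binary treatment a, covariate x
set.seed(123)
n <- 200
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * x))
y <- 1 + 2 * a + x + 0.5 * a * x + rnorm(n)

# Uncentered interaction model, used for g-computation
fit_u <- lm(y ~ a * x)
p1 <- predict(fit_u, newdata = data.frame(a = 1, x = x))
p0 <- predict(fit_u, newdata = data.frame(a = 0, x = x))
ate_gcomp <- mean(p1 - p0)
att_gcomp <- mean((p1 - p0)[a == 1])

# ATE: center the covariate at its full-sample mean
xc <- x - mean(x)
ate_coef <- unname(coef(lm(y ~ a * xc))["a"])

# ATT: center the covariate at its mean among the treated
xt <- x - mean(x[a == 1])
att_coef <- unname(coef(lm(y ~ a * xt))["a"])

c(ate_coef = ate_coef, ate_gcomp = ate_gcomp)
c(att_coef = att_coef, att_gcomp = att_gcomp)
```

The two numbers in each pair agree up to numerical error because the centered and uncentered fits are reparameterizations of the same model.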

For lmw_aipw objects, the model is fit as above except that base weights are not included in the model fitting and are instead used to compute additional augmentation terms that are added to the estimated potential outcome means from the outcome regression. The variance-covariance matrix is computed using M-estimation; this corresponds to the HC0 robust covariance matrix for the model parameters with the base weights treated as fixed, which yields conservative standard errors for the ATE. Inference is only approximate for the ATT and ATC.
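
The idea of adding augmentation terms to regression-based potential outcome means can be sketched in base R as follows. This is a generic AIPW illustration on simulated data, not the code used by lmw: the variable names are hypothetical, and normalizing the base weights within each treatment group is an assumption made here that may differ from the package's scaling.

```r
# Simulated data (hypothetical): true ATE is 1.5
set.seed(42)
n <- 500
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.4 * x))
y <- 1 + 1.5 * a + x + rnorm(n)

# Base weights: inverse probability weights from a logistic model
ps <- fitted(glm(a ~ x, family = binomial))
w <- ifelse(a == 1, 1 / ps, 1 / (1 - ps))

# Outcome regression fit WITHOUT the base weights
fit <- lm(y ~ a * x)
m1 <- predict(fit, newdata = data.frame(a = 1, x = x))
m0 <- predict(fit, newdata = data.frame(a = 0, x = x))

# Model-predicted potential outcome means
mu <- c(mean(m0), mean(m1))

# Augmentation terms: weighted means of the residuals in each group
r <- y - ifelse(a == 1, m1, m0)
aug <- c(weighted.mean(r[a == 0], w[a == 0]),
         weighted.mean(r[a == 1], w[a == 1]))

# AIPW potential outcome means and ATE
po_aipw <- mu + aug
ate_aipw <- po_aipw[2] - po_aipw[1]
ate_aipw
```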

For lmw_iv objects, the first stage model is constructed by removing the treatment from the supplied model formula, adding the instrumental variable as a main effect, and using the treatment variable as the outcome. For the second stage (reduced form) model, the fitted values of the treatment from the first stage model are used in place of the treatment in the outcome model. The results are similar to those from using ivreg::ivreg(), and the coefficient estimates will be the same except for the intercept due to the centering of covariates.
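
The two-stage procedure described above can be sketched with lm.fit() on simulated data. The variable names and data-generating process are hypothetical, and the sketch only mirrors the description in this section; it is not taken from the lmw source.

```r
# Simulated data (hypothetical): instrument z, treatment d, covariate x
set.seed(1)
n <- 500
x <- rnorm(n)
z <- rbinom(n, 1, 0.5)
u <- rnorm(n)  # unobserved confounder
d <- as.numeric(0.8 * z + 0.3 * x + u + rnorm(n) > 0.5)
y <- 1 + 2 * d + x + u + rnorm(n)

xc <- x - mean(x)  # centered covariate

# First stage: treatment regressed on the instrument and covariates
X1 <- cbind(`(Intercept)` = 1, z = z, xc = xc)
stage1 <- lm.fit(X1, d)
d_hat <- stage1$fitted.values

# Second stage: fitted treatment values replace the treatment
X2 <- cbind(`(Intercept)` = 1, d = d_hat, xc = xc)
stage2 <- lm.fit(X2, y)
b <- stage2$coefficients

# Residuals computed with the TRUE treatment substituted back in
X2_true <- cbind(`(Intercept)` = 1, d = d, xc = xc)
res_iv <- drop(y - X2_true %*% b)

b
```

With a single instrument, these coefficients coincide with the closed-form IV estimator, which solves (Z'X) b = Z'y with Z = X1 and X = X2_true.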

Although some coefficients in the model may be interpretable as treatment effect estimates, summary.lmw_est() should be used to view and extract the treatment effect and potential outcome mean estimates, standard errors, and other model statistics. The output of lmw_est() should rarely be used except to be supplied to summary().

Value

An lmw_est object with the following components:

coefficients, residuals, fitted.values, effects, weights, rank, df.residual, qr

for lmw objects, the output of the lm.fit() or lm.wfit() call used to fit the outcome model. For lmw_iv objects, the output of the lm.fit() or lm.wfit() call used to fit the second stage model, with residuals corresponding to the residuals computed when substituting the true treatment variable in place of the fitted treatment values in the model.

model.matrix

the model matrix (supplied to the x argument of lm.fit).

vcov

the estimated covariance matrix of the parameter estimates as produced by sandwich::vcovHC() or sandwich::vcovCL().

lmw.weights

the implied regression weights computed by lmw_est().

call

the call to lmw_est().

estimand

the requested estimand.

focal

the focal treatment level when estimand is "ATT" or "ATC".

method

the method used to estimate the weights ("URI" or "MRI").

robust

the type of standard error used.

outcome

the name of the outcome variable.

treat_levels

the levels of the treatment.

When AIPW is used, the object will be of class lmw_est_aipw, which inherits from lmw_est, and contains the additional components:

coef_aipw

the model-predicted potential outcome means (mu) and the augmentation terms (aug).

vcov_aipw

the covariance matrix of the quantities in coef_aipw.

When weights are included in the estimation (i.e., base.weights or s.weights supplied to lmw() or lmw_iv()), any units with weights equal to zero will be removed from the data prior to model fitting.

Methods exist for lmw_est objects for model.matrix(), vcov(), hatvalues(), sandwich::bread(), and sandwich::estfun(), all of which are used internally to compute the parameter estimate covariance matrix. The first two simply extract the corresponding component from the lmw_est object and the last three imitate the corresponding methods for lm objects (or ivreg objects for lmw_iv inputs). Other regression-related functions, such as coef(), residuals(), and fitted(), use the default methods and should work correctly with lmw_est objects.

Note that when fixed effects are supplied through the fixef argument to lmw() or lmw_iv(), standard error estimates computed using functions outside lmw may not be accurate due to issues relating to degrees of freedom. In particular, this affects conventional and HC1-robust standard errors. Otherwise, sandwich::vcovHC() can be used to compute standard errors (setting type = "const" for conventional standard errors), though sandwich::vcovCL() may not work as expected and should not be used. To calculate cluster-robust standard errors, supply an argument to cluster in lmw_est().

Note

lmw_est() uses non-standard evaluation to interpret its outcome argument. For programmers who wish to use lmw_est() inside other functions, an effective way to pass the name of an arbitrary outcome (e.g., y passed as a string) is to use do.call(), for example:

fun <- function(model, outcome, data) {
  do.call("lmw_est", list(model, outcome, data))
}

When using lmw_est() inside lapply() or purrr::map() to loop over outcomes, this syntax must be used as well.
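
To see why do.call() helps with non-standard evaluation in general, consider the toy function below. get_outcome() is a hypothetical stand-in written for this illustration only; it is not lmw's implementation, and lmw_est() may process its outcome argument differently. The point is that do.call() evaluates its arguments before constructing the call, so the function receives the string itself rather than the name of a wrapper's argument.

```r
# Hypothetical function that interprets `outcome` via NSE
get_outcome <- function(outcome, data) {
  expr <- substitute(outcome)
  # A literal string is treated as a column name
  if (is.character(expr)) return(data[[expr]])
  # Otherwise the expression is evaluated in the data
  eval(expr, data, parent.frame())
}

df <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6))

# Direct call from a wrapper: the symbol `nm` is captured and
# resolves to the string, not to the column
wrap_direct <- function(nm, data) get_outcome(nm, data)
direct <- wrap_direct("y1", df)

# do.call() passes the evaluated string, so the column is found
wrap_docall <- function(nm, data) do.call("get_outcome", list(nm, data))
via_docall <- wrap_docall("y1", df)

# The same pattern inside lapply() to loop over outcomes
res <- lapply(c("y1", "y2"), function(nm) {
  do.call("get_outcome", list(nm, df))
})
```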

See Also

summary.lmw_est() for viewing and extracting the treatment effect and potential outcome mean estimates, standard errors, and other model statistics; lmw() or lmw_iv() for estimating the weights that correspond to the model estimated by lmw_est(); lm() and lm.fit() for fitting the corresponding model; ivreg() in the ivreg package for fitting 2SLS models; influence.lmw_est() for influence measures.

Examples

data("lalonde")

# MRI regression for ATT
lmw.out1 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                  estimand = "ATT", method = "MRI",
                  treat = "treat")

lmw.fit1 <- lmw_est(lmw.out1, outcome = "re78")
lmw.fit1

summary(lmw.fit1)


# MRI regression for ATT after propensity score matching
m.out <- MatchIt::matchit(treat ~ age + education + race +
                            married + nodegree + re74 + re75,
                          data = lalonde, method = "nearest",
                          estimand = "ATT")
lmw.out2 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                method = "MRI", treat = "treat", obj = m.out)

## Using a cluster-robust SE with subclass (pair membership)
## as the cluster variable
lmw.fit2 <- lmw_est(lmw.out2, outcome = "re78", cluster = ~subclass)
lmw.fit2

summary(lmw.fit2)

# AIPW for ATE with MRI regression after propensity score
# weighting
ps <- glm(treat ~ age + education + race + married + nodegree +
            re74 + re75, data = lalonde,
            family = binomial)$fitted
ipw <- ifelse(lalonde$treat == 1, 1/ps, 1/(1-ps))

lmw.out3 <- lmw(re78 ~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                method = "MRI", treat = "treat",
                base.weights = ipw, dr.method = "AIPW")
lmw.fit3 <- lmw_est(lmw.out3)
lmw.fit3

summary(lmw.fit3)

# MRI for multi-category treatment ATE
lmw.out4 <- lmw(~ treat_multi + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATE", method = "MRI",
                treat = "treat_multi")
lmw.fit4 <- lmw_est(lmw.out4, outcome = "re78")
lmw.fit4

summary(lmw.fit4)

ngreifer/lmw documentation built on Feb. 14, 2024, 10:53 p.m.