```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE,
  fig.width = 6.5,
  fig.height = 2.75
)
```
In this document, we demonstrate some common uses of `Zelig` [@imaiCommonFrameworkStatistical2008a] and how the same tasks can be performed using `clarify`. We'll include examples for computing predictions at representative values (i.e., `setx()` and `sim()` in `Zelig`), the rare-events logit model, estimating the average treatment effect on the treated (ATT) after matching, and combining estimates after multiple imputation.
The usual workflow in `Zelig` is to fit a model using `zelig()`, specify quantities of interest to simulate using `setx()` on the `zelig()` output, and then simulate those quantities using `sim()`. `clarify` uses a similar approach, except that the model is fit outside `clarify` using functions in a different R package. In addition, `clarify`'s `sim_apply()` allows for the computation of any arbitrary quantity of interest. Unlike `Zelig`, `clarify` follows the recommendations of @raineyCarefulConsiderationCLARIFY2023 to use the estimates computed from the original model coefficients rather than the average of the simulated draws. We'll demonstrate how to replicate a standard `Zelig` analysis using `clarify` step by step. Because simulation-based inference involves randomness and some of the algorithms may not perfectly align, one shouldn't expect results to be identical, though in most cases they should be similar.
## library("Zelig") library("clarify") set.seed(100)
Note that both `Zelig` and `clarify` have a function called `sim()`, so we will always make it clear which package's `sim()` is being used.
## Predictions at representative values

Here we'll use the `lalonde` dataset in {MatchIt} and fit a linear model for `re78` as a function of the treatment `treat` and covariates.
data("lalonde", package = "MatchIt")
We'll be interested in the predicted values of the outcome for a typical unit at each level of treatment and their first difference.
### Zelig workflow

In `Zelig`, we fit the model using `zelig()`:
```r
fit <- zelig(re78 ~ treat + age + educ + married + race + nodegree + re74 + re75,
             data = lalonde, model = "ls", cite = FALSE)
```
Next, we use `setx()` and `setx1()` to set our values of `treat`:
```r
fit <- setx(fit, treat = 0)
fit <- setx1(fit, treat = 1)
```
Next, we simulate the values using `Zelig::sim()`:
```r
fit <- Zelig::sim(fit)
```
Finally, we can print and plot the predicted values and first differences:
```r
fit

plot(fit)
```
### clarify workflow

In `clarify`, we fit the model using functions outside `clarify`, like `stats::lm()` or `fixest::feols()`.
```r
fit <- lm(re78 ~ treat + age + educ + married + race + nodegree + re74 + re75,
          data = lalonde)
```
Next, we simulate the model coefficients using `clarify::sim()`:
```r
s <- clarify::sim(fit)
```
Next, we use `sim_setx()` to set our values of the predictors:
```r
est <- sim_setx(s, x = list(treat = 0), x1 = list(treat = 1),
                verbose = FALSE)
```
Finally, we can summarize and plot the predicted values:
```r
summary(est)

plot(est)
```
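`sim_setx()` is one of several wrappers around `sim_apply()`, which, as noted above, can compute any arbitrary quantity of interest from the simulated coefficients. As a minimal sketch (this example is ours and not part of the original workflow), the same `s` object can be used with a user-written function to compute average adjusted predictions under each treatment level and their difference:

```r
# Sketch: sim_apply() with a user-supplied quantity of interest. The function
# receives the model fit with its coefficients set to each simulated draw.
qoi <- function(fit) {
  d0 <- d1 <- lalonde
  d0$treat <- 0
  d1$treat <- 1
  p0 <- mean(predict(fit, newdata = d0))
  p1 <- mean(predict(fit, newdata = d1))
  c("E[Y(0)]" = p0, "E[Y(1)]" = p1, "FD" = p1 - p0)
}

est2 <- sim_apply(s, FUN = qoi, verbose = FALSE)
summary(est2)
```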
## Rare events logistic regression

`Zelig` uses a special method for logistic regression with rare events as described in @kingLogisticRegressionRare2001. This is the primary implementation of the method in R. However, newer methods have been developed that perform similarly to or better than the method of King and Zeng [@puhrFirthLogisticRegression2017] and are implemented in R packages that are compatible with `clarify`, such as `logistf` and `brglm2`.
Here, we'll use the `lalonde` dataset with a constructed rare outcome variable to demonstrate how to perform a rare events logistic regression in `Zelig` and in `clarify`.
data("lalonde", package = "MatchIt") #Rare outcome: 1978 earnings over $20k; ~6% prevalence lalonde$re78_20k <- lalonde$re78 >= 20000
### Zelig workflow

In `Zelig`, we fit a rare events logistic model using `zelig()` with `model = "relogit"`.
```r
fit <- zelig(re78_20k ~ treat + age + educ + married + race + nodegree + re74 + re75,
             data = lalonde, model = "relogit", cite = FALSE)

fit
```
We can compute predicted values at representative values using `setx()` and `Zelig::sim()` as above.
```r
fit <- setx(fit, treat = 0)
fit <- setx1(fit, treat = 1)
fit <- Zelig::sim(fit)

fit

plot(fit)
```
### clarify workflow

Here, we'll use `logistf::logistf()` with `flic = TRUE`, which performs a variation on Firth's logistic regression with a correction for bias in the intercept [@puhrFirthLogisticRegression2017].
```r
fit <- logistf::logistf(re78_20k ~ treat + age + educ + married + race + nodegree + re74 + re75,
                        data = lalonde, flic = TRUE)

summary(fit)
```
We can compute predictions at representative values using `clarify::sim()` and `sim_setx()`.
```r
s <- clarify::sim(fit)

est <- sim_setx(s, x = list(treat = 0), x1 = list(treat = 1),
                verbose = FALSE)

summary(est)

plot(est)
```
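The same workflow applies to the other compatible packages mentioned above. As a sketch (ours, not part of the original analysis, and assuming `brglm2` is installed), a bias-reduced logistic regression can be fit through `glm()` with `method = "brglmFit"` and passed to `clarify::sim()` like any other `glm` fit:

```r
# Sketch: bias-reduced logistic regression via brglm2 as an alternative to logistf;
# the clarify steps afterward are unchanged
library(brglm2)

fit_br <- glm(re78_20k ~ treat + age + educ + married + race + nodegree + re74 + re75,
              data = lalonde, family = binomial("logit"), method = "brglmFit")

s_br <- clarify::sim(fit_br)
est_br <- sim_setx(s_br, x = list(treat = 0), x1 = list(treat = 1),
                   verbose = FALSE)
summary(est_br)
```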
## The ATT after matching

Here we'll use the `lalonde` dataset, perform propensity score matching, and then fit a linear model for `re78` as a function of the treatment `treat`, the covariates, and their interaction. From this model, we'll compute the ATT of `treat` using `Zelig` and `clarify`.
data("lalonde", package = "MatchIt") m.out <- MatchIt::matchit(treat ~ age + educ + married + race + nodegree + re74 + re75, data = lalonde, method = "nearest")
### Zelig workflow

In `Zelig`, we fit the model using `zelig()` directly on the `matchit` object:
```r
fit <- zelig(re78 ~ treat * (age + educ + married + race + nodegree + re74 + re75),
             data = m.out, model = "ls", cite = FALSE)
```
Next, we use `ATT()` to request the ATT of `treat` and simulate the values:
```r
fit <- ATT(fit, "treat")

fit

plot(fit)
```
### clarify workflow

In `clarify`, we need to extract the matched dataset and fit a model outside `clarify` using another package.
```r
m.data <- MatchIt::match.data(m.out)

fit <- lm(re78 ~ treat * (age + educ + married + race + nodegree + re74 + re75),
          data = m.data)
```
Next, we simulate the model coefficients using `clarify::sim()`. Because we performed pair matching, we will request a cluster-robust standard error, clustering on pair membership (`subclass`):
```r
s <- clarify::sim(fit, vcov = ~subclass)
```
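Supplying a formula to `vcov` requests a cluster-robust covariance for the coefficients. Equivalently (a sketch of ours, assuming the `sandwich` package is installed), a pre-computed covariance matrix can be supplied instead:

```r
# Sketch: pass a pre-computed cluster-robust covariance matrix instead of a formula
s2 <- clarify::sim(fit, vcov = sandwich::vcovCL(fit, cluster = ~subclass))
```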
Next, we use `sim_ame()` to request the average marginal effect of `treat` within the subset of treated units:
```r
est <- sim_ame(s, var = "treat", subset = treat == 1,
               contrast = "diff", verbose = FALSE)
```
Finally, we can summarize and plot the ATT:
```r
summary(est)

plot(est)
```
## Combining estimates after multiple imputation

Here we'll use the `africa` dataset in {Amelia} to demonstrate combining estimates after multiple imputation. This analysis is also demonstrated using `clarify` at the end of `vignette("clarify")`.
```r
library(Amelia)

data("africa", package = "Amelia")
```
First, we multiply impute the data with `amelia()`, using the specification given in the {Amelia} documentation.
```r
# Multiple imputation
a.out <- amelia(x = africa, m = 10, cs = "country", ts = "year",
                logs = "gdp_pc", p2s = 0)
```
### Zelig workflow

With `Zelig`, we can supply the `amelia` object directly to the `data` argument of `zelig()` to fit a model in each imputed dataset:
```r
fit <- zelig(gdp_pc ~ infl * trade, data = a.out,
             model = "ls", cite = FALSE)
```
Summarizing the coefficient estimates combined across the imputed datasets can be done using `summary()`:
```r
summary(fit)
```
We can use `Zelig::sim()` and `setx()` to compute predictions at specified values of the predictors:
```r
fit <- setx(fit, infl = 0, trade = 40)
fit <- setx1(fit, infl = 0, trade = 60)
fit <- Zelig::sim(fit)
```
`Zelig` does not allow you to combine predicted values across imputations.

```r
fit

plot(fit)
```
### clarify workflow

Unlike `zelig()`, `clarify` does not combine the coefficients itself; instead, the models should be fit using `Amelia::with()`. To view the combined coefficient estimates, use `Amelia::mi.combine()`.
```r
# Use Amelia functions to model and combine coefficients
fits <- with(a.out, lm(gdp_pc ~ infl * trade))

mi.combine(fits)
```
Derived quantities can be computed using `clarify::misim()` and `sim_apply()` or its wrappers on the `with()` output, which is a list of regression model fits:
```r
# Simulate coefficients, 100 in each of 10 imputations
s <- misim(fits, n = 100)

# Compute predictions at specified values
est <- sim_setx(s, x = list(infl = 0, trade = 40),
                x1 = list(infl = 0, trade = 60),
                verbose = FALSE)

summary(est)

plot(est)
```
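The other `clarify` wrappers accept `misim()` output in the same way. For example (a sketch of ours, not part of the original analysis), the average marginal effect of `infl` on `gdp_pc`, combined across the imputations:

```r
# Sketch: sim_ame() on misim() output; for a continuous predictor this is the
# average of the instantaneous rate of change of the predicted outcome
est_ame <- sim_ame(s, var = "infl", verbose = FALSE)
summary(est_ame)
```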