knitr::opts_chunk$set( cache = TRUE, collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
Conduct randomization inference on R model objects.
This R package is a port of the excellent
-ritest-
Stata routine by Simon Heß. It
doesn't (yet) try to support all of the features in the Stata version and is
currently limited to lm()
and fixest::feols()
models. But it does appear to
be significantly faster, and aims to support a variety of model classes once it
is fully baked.
# install.packages("remotes") remotes::install_github("grantmcdermott/ritest")
A detailed walkthrough of the package is provided in the introductory
vignette. See vignette("ritest")
.
As a quickstart, here follows a basic example using data from a randomized control trial (RCT) that was conducted in Colombia. The dataset is provided with this package.
First, we use the fixest::feols()
function to estimate the parametric model.
library(ritest) ## This package library(fixest) ## For fast (high-dimensional) fixed-effect regressions data("colombia") ## Parametric model using fixest::feols() co_est = feols( dayscorab ~ b_treat + b_dayscorab | b_pair + miss_b_dayscorab + round2 + round3, vcov = ~b_block, data = colombia ) co_est
Our key treatment variable (b_treat
) is deemed to be statistically significant
(p-value of 0.024), even as we cluster the standard errors.
But let's see if this result is robust to randomization inference (RI). We'll
perform 1,000 RI permutations on b_treat
, whilst taking into account the
stratified and clustered experimental design of the underlying RCT.
ritest(co_est, 'b_treat', strata='b_pair', cluster='b_block', reps=1e3, seed=1234)
The simulated p-value is noticeably larger than the parametric one (0.107 vs 0.024), suggesting that our parametric model is overstating the effectiveness of treatment.
For more examples and additional features --- plotting, regression tables, etc. --- please see the introductory vignette.
I generally observe a speed increase of between 25x--50x compared to the Stata version. Again, see the introductory vignette for timed examples.
Apart from the Stata -ritest-
routine,
there are several other packages for conducting randomization inference in R.
For example, the ri
package has been available for nearly a decade. More recently, the successor ri2 package
extends upon the original, with updated syntax and functionality. An advantage
of ri2 is that it integrates with the wider
DeclareDesign suite of R packages for experimental
and empirical research design. This enables researchers to build RI and other
experimental considerations into the incipient design process. On the other
hand, this places some restrictions on conducting RI ex post or in
quasi-experimental settings (e.g. a study that leverages a natural experiment).
For example, you can't pass an existing regression model object to
ri2::conduct_ri()
, which is what ritest was designed for. Your use case
will likely determine which software is optimal for you.
The software code contained within this repository is made available under the MIT license.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.