FastRerandomize Package Tutorial
In fastrerandomize: Hardware-Accelerated Rerandomization for Improved Balance

Introduction

This vignette demonstrates how to use the fastrerandomize package for generating and testing experimental designs in an agricultural study setting based on:

Burchardi, K.B., Gulesci, S., Lerva, B. and Sulaiman, M., 2019. Moral hazard: Experimental Evidence from Tenancy Contracts. The Quarterly Journal of Economics, 134(1), pp.281-347.

We will walk through:

Generating synthetic covariate data
Creating candidate randomizations
Exploring the S3 methods for printing, summarizing, and plotting randomization objects
Conducting a randomization-based inference test

Note: This tutorial assumes you have installed the fastrerandomize package, either from source or via GitHub:

# If you haven't installed or set up the package:
# devtools::install_github("cjerzak/fastrerandomize-software/fastrerandomize")

# (Done once) Optionally build the JAX backend if needed
# fastrerandomize::build_backend()

# note that this vignette can be found like so: 
# vignette(package = "fastrerandomize")

1. Obtain Pre-treatment Covariate Data

For illustration, we use several pre-treatment covariates from the original experiment.

library(fastrerandomize)

# Obtain pre-treatment covariates 
data(QJEData, package = "fastrerandomize")
myCovariates <- c("children","married","hh_size","hh_sexrat")
QJEData <- QJEData[!is.na(rowSums(QJEData[,myCovariates])),]
X <- QJEData[,myCovariates]

# Analysis parameters
n_units   <- nrow(X)
n_treated <- round(nrow(X)/2)

2. Generate Randomizations

We now create our candidate randomizations using the function generate_randomizations() from fastrerandomize. Depending on the scale and complexity of your experiment, you may choose either exact enumeration ("exact") or Monte Carlo ("monte_carlo") randomization.

In this example, we'll use the exact approach, but note that randomization_accept_prob is set very low (0.0001) for demonstration, so that we can see some filtering in action.

RunMainAnalysis <- (!is.null(check_jax_availability()) )
# if FALSE, consider calling build_backend()

if(RunMainAnalysis){
CandRandomizations <- generate_randomizations(
  n_units = n_units,
  n_treated = n_treated,
  X = X,
  randomization_type = "monte_carlo", 
  max_draws = 100000L,
  batch_size = 1000L,
  randomization_accept_prob = 0.0001
)
}

4. S3 Methods: Print, Summary, and Plot

generate_randomizations() returns an S3 object of class fastrerandomize_randomizations. We can use its associated print, summary, and plot methods to inspect the candidate randomizations.

if(RunMainAnalysis){
# 4a. Print the object
print(CandRandomizations)

# 4b. Summary
summary(CandRandomizations)

# 4c. Plot the balance distribution
plot(CandRandomizations)
}

3. Randomization Test

Next, we showcase a randomization-based inference procedure using fastrerandomize.

3a. Setup Simulated Outcomes

We simulate an outcome Y that depends on our covariates X, the chosen randomization, and some true treatment effect τ. Here, we pick the first acceptable randomization from our set as our "observed" assignment, Wobs.

if(RunMainAnalysis){
CoefY <- rnorm(ncol(X))          # Coefficients for outcome model
Wobs  <- CandRandomizations$randomizations[1, ]  # use the 1st acceptable randomization
tau_true <- 1                     # true treatment effect
# Generate Y = X * Coef + W*tau + noise
Yobs <- as.numeric( as.matrix(X) %*% as.matrix(CoefY) + Wobs * tau_true + rnorm(n_units, sd = 0.1))
}

3b. Run the Randomization Test

We pass our observed treatment assignment (obsW), observed outcomes (obsY), and the full matrix of candidate randomizations (candidate_randomizations) to randomization_test(). This test:

Computes the observed difference in means
Permutes the treatment assignment across our candidate randomizations
Estimates the p-value by comparing the observed statistic to its permutation distribution

if(RunMainAnalysis){
randomization_test_results <- randomization_test(
  obsW = Wobs,
  obsY = Yobs,
  candidate_randomizations = CandRandomizations$randomizations,
  findFI = TRUE
)

# Inspect results
print(randomization_test_results)
summary(randomization_test_results)
plot(randomization_test_results)
}

The findFI = TRUE flag further attempts to locate a fiducial interval (FI) for the treatment effect.

Conclusion

This tutorial demonstrated a basic workflow with fastrerandomize:

Generating synthetic or real covariate data
Producing a set of acceptable treatment assignments via randomization
Visualizing and summarizing these assignments
Testing for treatment effects with a randomization-based approach

We encourage you to consult the package documentation for more advanced functionalities, including GPU-accelerated computations via JAX and flexible definitions of balance criteria in addition to what was shown here.