Repeated Cross-fitting

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 6
)

Contents:

Repeated Cross-fitting

The purpose of repeated cross-fitting is to reduce the variability of estimate based on a specific split of data by summarizing estimates using different splits as suggested by Chernozhukov (2018).

Create an AIPW object

library(AIPW)
library(SuperLearner)
library(ggplot2)
set.seed(123)
data("eager_sim_obs")
cov = c("eligibility","loss_num","age", "time_try_pregnant","BMI","meanAP")

AIPW_SL <- AIPW$new(Y= eager_sim_obs$sim_Y,
                    A= eager_sim_obs$sim_A,
                    W= subset(eager_sim_obs,select=cov), 
                    Q.SL.library = c("SL.glm"),
                    g.SL.library = c("SL.glm"),
                    k_split = 2,
                    verbose=TRUE)$
  fit()$
  summary()

Decorate with Repeated class

# Create a new object from the previous AIPW_SL (Repeated class is an extension of the AIPW class)
repeated_aipw_sl <- Repeated$new(aipw_obj = AIPW_SL)
# Fit repetitively
repeated_aipw_sl$repfit(num_reps = 30, stratified = F)
# Summarise the median estimate, median SE, and the SE of median estimate adjusting for `num_reps` repetitions
repeated_aipw_sl$summary_median()
# Check the distributions of estiamtes from `num_reps` repetitions
s <- repeated_aipw_sl$repeated_estimates
ggplot2::ggplot(ggplot2::aes(x=Estimate),data = s) + ggplot2::geom_histogram(bins = 10) + ggplot2::facet_grid(~Estimand, scales = "free")
ggplot2::ggplot(ggplot2::aes(x=SE),data = s) + ggplot2::geom_histogram(bins = 10) + ggplot2::facet_grid(~Estimand, scales = "free")

More num_reps vs More k-split?

There are several considerations:

  1. Computational resources
  2. Sample size
  3. Complexity of the SuperLearner algorithms

References:

Chernozhukov V, Chetverikov V, Demirer M, et al (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal.



Try the AIPW package in your browser

Any scripts or data that you put into this service are public.

AIPW documentation built on April 12, 2025, 1:27 a.m.