keilsim: Function for running simulations comparing Bayesian and...


View source: R/keilsim.r


Simulates causal effects under the setting detailed in Section 4.2 of Keil et al. (2017), "A Bayesian approach to the g-formula". The simulation can be run with varying sample sizes (n), with varying true risk differences (RD), and under either a correctly specified model (misspecified=FALSE) or a misspecified model (misspecified=TRUE).


keilsim(n = 20, RD = 0, N_sims = 1000, mcmc_iter = 10000,
  warmup_iter = 9900, N_gcomp = 1000, boot_iter = 100,
  output_all = FALSE, misspecified = FALSE, parallel = FALSE,
  ncores = NULL)
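As an illustration, a small and quick run might be set up as follows. This is only a sketch: it assumes the package is installed under the name KeilSim and exports keilsim(), the setting values are arbitrary choices, and the structure of the returned object is not documented on this page, so it is only inspected with str().

```r
# Settings for a small run; argument names follow the usage signature above.
# The values are arbitrary illustrations, not recommendations.
settings <- list(
  n = 50, RD = 0.1, N_sims = 5,
  mcmc_iter = 2000, warmup_iter = 1000,   # leaves 1000 retained posterior draws
  N_gcomp = 100, boot_iter = 100,
  output_all = TRUE, misspecified = FALSE,
  parallel = FALSE                         # single core, so set.seed() is respected
)

# Assumption: the package is installed as "KeilSim" and exports keilsim().
# If it is not installed, this block does nothing.
if (requireNamespace("KeilSim", quietly = TRUE)) {
  set.seed(2019)
  res <- do.call(KeilSim::keilsim, settings)
  str(res)  # return structure is not documented here, so just inspect it
}
```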



n: scalar, positive integer greater than 1. The sample size of each simulated dataset.

RD: scalar, numeric value. The true risk difference under which data are simulated.

N_sims: scalar, positive integer. The number of simulated datasets to use.

mcmc_iter: scalar, positive integer. The total number of MCMC draws to take from the posterior when performing Bayesian g-computation. Should leave at least 1000 posterior draws after a sufficient warm-up period.

warmup_iter: scalar, positive integer. The number of warm-up (aka burn-in) draws when performing Bayesian g-computation. Must be < mcmc_iter. The number of draws retained is mcmc_iter - warmup_iter. At least 1000 warm-up draws are recommended.

N_gcomp: scalar, positive integer. The number of Monte Carlo iterations to use when evaluating the integral involved in g-computation. See Details.

boot_iter: number of nonparametric bootstrap resamples to use when performing frequentist g-computation. See Details.

output_all: logical (TRUE/FALSE). If TRUE, outputs the estimates for each simulated dataset. If FALSE, outputs only summary statistics across all simulated datasets.

misspecified: logical (TRUE/FALSE). If TRUE, both frequentist and Bayesian g-computation are performed without adjusting for confounding. If FALSE, both models correctly adjust for confounding.

parallel: logical (TRUE/FALSE). If TRUE, the N_sims simulations are run in parallel. If FALSE, only a single core is used. See Details.

ncores: scalar, positive integer. The number of cores to use if parallel=TRUE. Must be between 1 and the maximum number of cores. Use parallel::detectCores() to find the maximum number of cores available on your machine.
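The constraints stated in these argument descriptions can be checked before starting a long run. The helper below is a hypothetical sketch and is not part of the package; check_keilsim_args() and its return value are names invented for illustration. It also makes visible that the default settings (mcmc_iter = 10000, warmup_iter = 9900) retain only 100 posterior draws.

```r
# Hypothetical helper (not part of KeilSim): checks the documented constraints
# on a subset of keilsim() arguments and reports the number of retained draws.
check_keilsim_args <- function(n = 20, mcmc_iter = 10000, warmup_iter = 9900,
                               parallel = FALSE, ncores = NULL) {
  is_pos_int <- function(x) {
    is.numeric(x) && length(x) == 1 && is.finite(x) && x >= 1 && x == round(x)
  }
  if (!is_pos_int(n) || n <= 1) stop("n must be an integer greater than 1")
  if (!is_pos_int(mcmc_iter) || !is_pos_int(warmup_iter)) {
    stop("mcmc_iter and warmup_iter must be positive integers")
  }
  if (warmup_iter >= mcmc_iter) stop("warmup_iter must be < mcmc_iter")
  if (isTRUE(parallel)) {
    # `parallel::` refers to the base R package, not the logical argument.
    max_cores <- parallel::detectCores()  # may be NA on some platforms
    if (!is_pos_int(ncores)) {
      stop("ncores must be a positive integer when parallel = TRUE")
    }
    if (!is.na(max_cores) && ncores > max_cores) {
      stop("ncores exceeds the number of available cores")
    }
  }
  list(kept_draws = mcmc_iter - warmup_iter)
}

check_keilsim_args()$kept_draws                                    # defaults
check_keilsim_args(mcmc_iter = 2000, warmup_iter = 1000)$kept_draws
```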


keilsim() performs the simulation described in Keil et al. (2017) for the desired simulation settings. In each iteration, we perform a Bayesian g-computation and a frequentist g-computation. Bayesian models are estimated using Stan, which compiles models to C++. R's glm() is used to estimate frequentist models.

Both g-computations use Monte Carlo simulation to evaluate the integral involved in g-computation. The number of Monte Carlo iterations used in this integration is N_gcomp. Since the data generation process involves only two time periods and binary treatments and confounders, only about 100 g-computation iterations are needed to estimate the integral accurately.

For the frequentist method, an interval estimate for the causal effect is calculated using the nonparametric bootstrap. boot_iter resamples with replacement are used, and percentiles of the resulting sampling distribution form the interval.

The function includes an option to run the N_sims simulations in parallel. This is enabled for both Macs and PCs by implementing doParallel; however, **reproducibility is NOT guaranteed when running in parallel**, since set.seed() is not respected. set.seed() is respected only when not running in parallel.
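The frequentist steps described above (glm() fits, Monte Carlo evaluation of the g-formula, percentile bootstrap) can be sketched in base R. This is an illustrative sketch only, not the package's implementation: the data-generating process, its coefficients, and the function names simulate_data(), gcomp_rd(), and boot_ci() are all assumptions made for the example, and the setting is a simplified two-period one with binary A0, L1, A1, and Y.

```r
# Hypothetical data-generating process (NOT the one in Keil et al.):
# A0 -> L1 -> A1 -> Y, all binary; L1 is a time-varying confounder.
simulate_data <- function(n) {
  A0 <- rbinom(n, 1, 0.5)
  L1 <- rbinom(n, 1, plogis(-0.5 + 1.0 * A0))
  A1 <- rbinom(n, 1, plogis(-0.5 + 1.0 * L1))
  Y  <- rbinom(n, 1, plogis(-1 + 0.5 * A0 + 0.5 * A1 + 0.8 * L1))
  data.frame(A0, L1, A1, Y)
}

# Monte Carlo g-computation of the risk difference comparing
# "always treat" (A0 = A1 = 1) with "never treat" (A0 = A1 = 0).
gcomp_rd <- function(dat, n_gcomp = 1000) {
  fit_L <- glm(L1 ~ A0, family = binomial, data = dat)
  fit_Y <- glm(Y ~ A0 + A1 + L1, family = binomial, data = dat)
  risk <- function(a) {
    new <- data.frame(A0 = rep(a, n_gcomp))
    # Draw the confounder under the intervention A0 = a ...
    p_L <- predict(fit_L, newdata = new, type = "response")
    new$L1 <- rbinom(n_gcomp, 1, p_L)
    new$A1 <- a
    # ... then average the predicted outcome risk over those draws.
    mean(predict(fit_Y, newdata = new, type = "response"))
  }
  risk(1) - risk(0)
}

# Percentile interval from nonparametric bootstrap resamples of the rows.
boot_ci <- function(dat, boot_iter = 100, n_gcomp = 1000, level = 0.95) {
  est <- replicate(boot_iter, {
    idx <- sample(nrow(dat), replace = TRUE)
    gcomp_rd(dat[idx, ], n_gcomp)
  })
  alpha <- 1 - level
  quantile(est, c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

set.seed(1)                 # single-core run, so the seed is respected
dat    <- simulate_data(200)
rd_hat <- gcomp_rd(dat)     # point estimate of the risk difference
ci     <- boot_ci(dat)      # lower and upper percentile bounds
```

Dropping L1 from the outcome model in this sketch would correspond in spirit to the misspecified=TRUE setting, where the models do not adjust for confounding.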

stablemarkets/KeilSim documentation built on May 22, 2019, 2:50 p.m.