```{r}
knitr::opts_chunk$set(echo = TRUE)
```
We have a Bayesian hierarchical model to estimate the extent of incorrect classification of maternal deaths (misclassification). Misclassification is broken down into two parameters:

1. Sensitivity: true positive rate
2. Specificity: true negative rate

Our aim is to compare bias across different data settings using simulation study results, and to automate our simulation output/comparison using an R package.

Data settings implemented (illustrated in the sketch below):

1. Full multinomial data (alldata)
2. True maternal (TP + FN) and observed maternal (TP + FP) (nobreakdown)
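To make the two data settings concrete, here is a minimal illustrative sketch (not part of the package, with made-up counts) showing how the full multinomial breakdown collapses into the nobreakdown setting, and how sensitivity and specificity are defined from the underlying counts.

```{r}
# Illustrative only: toy counts of classified deaths (made-up values)
counts <- c(TP = 40, FP = 20, FN = 10, TN = 930)

# Setting 1 (alldata): the full multinomial breakdown is available
counts

# Setting 2 (nobreakdown): only aggregated counts are available
true_maternal     <- counts[["TP"]] + counts[["FN"]]  # TP + FN
observed_maternal <- counts[["TP"]] + counts[["FP"]]  # TP + FP

# Sensitivity (true positive rate) and specificity (true negative rate)
sens <- counts[["TP"]] / (counts[["TP"]] + counts[["FN"]])
spec <- counts[["TN"]] / (counts[["TN"]] + counts[["FP"]])
c(sens = sens, spec = spec)
```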
For the purpose of this example analysis, we will not perform simulation exercises, but work with toy MCMC sample datasets to show the automated output functions.
Below we go through the process of producing simulation output for two run settings. First we specify file paths and load libraries.
```{r}
toy.dir <- paste(getwd(), "/toy_mcmc/", sep = "")
output.toy.dir <- paste(getwd(), "/toy_mcmc/", sep = "")

library(runjags)
library(R2jags)
library(coda)
library(truncnorm)
library(tidyverse)
library(xtable)
library(devtools)
library(testthat)

load_all()
```
```{r}
# Set the true values for the global parameters. These are values I am fixing.
global_settings <- c(
  sensworld = 0.5,        # global sens value
  specworld = 0.9,        # global spec value
  rho.alphabeta = -0.3,   # correlation between sens and spec
  rrho.alphabeta = -0.3,  # correlation between sens and spec in reference year
  rsigma.alpha = 0,       # sd for sens in ref year
  rsigma.beta = 0,        # sd for spec in ref year
  sigma.alpha = 0.2,      # sd for sens in RW
  sigma.beta = 0.15,      # sd for spec in RW
  C = 1,                  # number of countries to simulate
  Nyears = 15,            # number of years to simulate
  refyear = 10,           # set reference year
  vrenv = 5000            # starting vr env; vr.ct is binomial given vrenv to generate year-specific vrenv
)

globalvars <- c("sensworld", "specworld", "rho.alphabeta", "sigma.alpha", "sigma.beta")
nsim <- 2
```
```{r, eval=FALSE}
# Run the simulation and get MCMC samples and their posterior estimates.
# Samples are saved to the specified output folder.
# (Not evaluated here: this example analysis works from the saved toy MCMC samples.)
get_simu_output(runname = runname,
                global_settings = global_settings,
                output.dir = output.toy.dir,
                nsim)
```
\bigskip
We pull in the MCMC samples, compute posterior summaries (median, lower, and upper bounds), and calculate bias for each country-year estimate, for each setting and simulation number, using \textit{summarize_simulation}. We then combine all iteration results with \textit{summarize_runsetting} to produce a single tibble of summaries.
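As a rough, illustrative sketch of what these summaries contain (the draws, the true value, and the column names here are made up and may differ from what \textit{summarize_simulation} actually produces), the posterior summary and bias for a single parameter could be computed as:

```{r}
# Illustrative only: posterior summary and bias for one country-year parameter
set.seed(1)
draws <- rnorm(2000, mean = 0.52, sd = 0.03)  # stand-in for MCMC samples of one parameter
truth <- 0.5                                  # the value fixed in global_settings (sensworld)

tibble::tibble(
  median = median(draws),
  lower  = quantile(draws, 0.025),
  upper  = quantile(draws, 0.975),
  bias   = median(draws) - truth
)
```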
```{r}
settings <- c("setting1", "setting2")

for (sett in settings) {
  for (i in 1:nsim) {
    # Summarize estimates and bias for each simulation
    summarize_simulation(simsetting = sett,
                         runname = sett,
                         simnum = i,
                         globalvars = globalvars,
                         output.dir = paste(output.toy.dir, sett, "/", sep = ""))
  }
  summary_tib <<- summarize_runsetting(output.dir = paste(output.toy.dir, sett, "/", sep = ""),
                                       simsetting = sett,
                                       nsim = nsim)
}

head(summary_tib[100:110, ])
```
Model performance is assessed using (1) bias and (2) coverage (shown in the output table).
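Coverage here is understood as the proportion of simulations in which the truth falls inside the posterior interval. A minimal sketch under that assumption, using made-up summary rows, is:

```{r}
# Illustrative only: mean bias and coverage across simulations for one parameter
sim_summaries <- tibble::tibble(
  simnum = 1:2,
  truth  = 0.5,
  median = c(0.52, 0.49),
  lower  = c(0.46, 0.43),
  upper  = c(0.58, 0.55)
)

sim_summaries %>%
  dplyr::summarise(
    mean_bias = mean(median - truth),
    coverage  = mean(truth >= lower & truth <= upper)
  )
```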
```{r}
# Specify the models you want to compare
runnames <- c("setting1", "setting2")

# Select parameters you want to plot
plotpars <- c("sens.ct[1,10]", "spec.ct[1,10]")

# Get xtable with simulation output
output <- sum_across_sim(runsettings = runnames)
head(output[50:60, ])

# Get comparison plots saved to the figure directory
produce_plots(runnames = runnames,
              plotpars = plotpars,
              fig.dir = output.toy.dir)
```
We use the tibble created by \textit{summarize_runsetting.R} to compare the truth with posterior estimates across simulations using the \textit{plot_nsim.R} function.
We plot the distribution of posterior median estimates against the truth using the \textit{plot_box.R} function.
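The actual figures are produced by the package functions in the chunk below; purely for orientation, a generic ggplot along the same lines (posterior medians as points and boxplots against the truth, with made-up data) might look like:

```{r}
# Illustrative only: posterior medians vs. truth across simulations (made-up data)
# ggplot2 is already attached via tidyverse
toy <- tibble::tibble(
  setting = rep(c("setting1", "setting2"), each = 2),
  simnum  = rep(1:2, times = 2),
  median  = c(0.52, 0.49, 0.55, 0.47),
  truth   = 0.5
)

ggplot(toy, aes(x = setting, y = median)) +
  geom_boxplot() +
  geom_point(aes(colour = factor(simnum))) +
  geom_hline(aes(yintercept = truth), linetype = "dashed") +
  labs(y = "Posterior median", colour = "Simulation")
```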
We show example graphical output from our two toy settings.
```{r}
outext <- output.toy.dir

BuildDF(outext = outext, runnames)

dftot <- readRDS(paste(outext, "DFtot.RDS", sep = ""))
dfsub <- readRDS(paste(outext, "DFsubset.RDS", sep = ""))

print(plot_nsim(df = dfsub))
print(plot_box(df = dfsub, pars = plotpars))
```
Lastly, the \textit{summarize_simulation.R} function is tested using \textit{summarize_simulation_error.R}, located in the test folder. Using the testthat package, we write an error that stops the summarize function if parameters other than the ones the script is built to analyze are specified. The expected output from this "error" function is a table of 6 parameters (15 ct estimates for each of 4 parameters, and 2 estimates for 2 global parameters) with their point estimates and uncertainty intervals when given "workingtib" (drawn from mcmctib_n), and a stop error when given "errortib" (generated with 4 variables). Using the \textit{expect_error} function, we can confirm that an error is returned if the parameters of interest are not in the data inputs to the summary function.
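The actual check lives in the package; a minimal sketch of the kind of guard such a function might use is shown below (the function name, the "parameter" column, and the required parameter names are illustrative assumptions, not the package's implementation).

```{r}
# Illustrative only: a guard that stops when expected parameters are missing.
# The "parameter" column name and the required names are assumptions for this sketch.
check_required_pars <- function(mcmctib, required = c("sens.ct", "spec.ct")) {
  missing_pars <- setdiff(required, unique(mcmctib$parameter))
  if (length(missing_pars) > 0) {
    stop("Input is missing required parameters: ",
         paste(missing_pars, collapse = ", "))
  }
  invisible(TRUE)
}
```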
```{r}
errortib <- readRDS(paste(getwd(), "/tests/", "errortib.RDS", sep = ""))

expect_error(summarize_simulation_error(errortib))

test_that("function gives error when the data is missing parameters", {
  expect_error(summarize_simulation_error(errortib))
})
```