knitr::opts_chunk$set(
  cache = TRUE,
  eval = TRUE,
  message = FALSE,
  echo = TRUE,
  results = 'asis',
  fig.height = 3.5,
  fig.width = 4.5,
  out.width = "100%",
  warning = FALSE,
  fig.align = 'center',
  dev = 'cairo_pdf'
)

batpigday

\begin{description} \item[batpigday]{\emph{noun} The coding equivalent of groundhogday.} \end{description}

the problem


\begin{center} Simulating data is a bitch. \end{center}


Debugging frequently dominates the time of students in the mathematical sciences. These students know how to solve equations, but next to nothing about writing code.


New tools [@wickham_tidyverse:_2017] are emerging daily that help researchers avoid these time-sink pitfalls.

These tools have lowered the programming barrier for researchers, but there is still a learning curve.


We consider a case study in meta-analysis.

\begin{description} \item[meta-analysis]{Statistical methodology for combining the results of several studies.} \end{description}
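As a reminder of the standard random-effects formulation (assuming that model here), $k$ studies with effects $y_i$ and variances $v_i$ are combined as an inverse-variance weighted average,

$$
\hat\theta = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i}, \qquad w_i = \frac{1}{v_i + \tau^2},
$$

where $\tau^2$ is the between-study variance.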

meta-analysis of medians

Conventional meta-analytic tools, such as metafor::rma [@viechtbauer_conducting_2010], require an effect and a variance of that effect.
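A minimal call with made-up effects and variances shows how little that interface needs (the numbers below are purely illustrative):

library(metafor)
# hypothetical effect estimates and their variances
dat <- data.frame(yi = c(0.21, -0.05, 0.33),
                  vi = c(0.04, 0.06, 0.05))
rma(yi = yi, vi = vi, data = dat) # random-effects model by default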


But what if the reported statistics are a median and interquartile range? Existing estimators, such as those of [@wan_estimating_2014], estimate a mean and standard deviation.
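A rough sketch of that style of conversion (simplified constants; the paper's formulas adjust for sample size):

# in the spirit of Wan et al. (2014), simplified; not their exact
# sample-size-adjusted estimators
approx_mean <- function(q1, med, q3) (q1 + med + q3) / 3
approx_sd   <- function(q1, q3) (q3 - q1) / 1.35
approx_mean(q1 = 2, med = 4, q3 = 7)
approx_sd(q1 = 2, q3 = 7)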


To test our proposed estimator for the variance of the sample median, I found myself repeating the same tasks and checks across algorithms.
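For background, the large-sample approximation such estimators build on is

$$
\operatorname{var}(m) \approx \frac{1}{4 n \, [f(\nu)]^2},
$$

for a sample median $m$ of $n$ observations, where $\nu$ is the population median and $f$ the density; the hard part is estimating $f(\nu)$ from reported summary statistics.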


I tried to find a better way of debugging and writing simulations.

This led to:

  1. a packaged analysis [@marwick_packaging_2018], varameta::*, which is built on
  2. the simulation package for meta-analysis data, metasim::*.

(*in development)

\begin{center} \textbf{coding is the easiest part of coding} \end{center}

\columnbreak

escaping batpigday

Generate sample sizes for $k$ studies.

library(tidyverse)
library(metasim)
library(kableExtra)

# table styling
output_table <- function(df) {
  df %>%
    janitor::adorn_rounding(skip_first_col = FALSE) %>%
    kable(align = "c", caption = NULL, booktabs = TRUE) %>%
    kable_styling(
      latex_options = c("striped", "HOLD_position"),
      full_width = TRUE,
      font_size = 25
    )
}
# simulate sample sizes for 2 studies, where most have at most 25 participants
sim_n(k = 2, min_n = 10, max_n = 25) %>% output_table()
# generate simulation dataframe
sim_df() %>% head(2) %>% select(-n) %>% output_table()

Each row of this dataframe represents a set of simulation parameters. Each simulation runs a trial function.

metatrial() %>% output_table()

Each simulation reruns the trial function a given number of times.

metasim() %>% pluck("results") %>%
  select(-coverage_count) %>% output_table()
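The rerun-and-summarise idea is simple enough to sketch generically (this is not metasim's implementation; covered is a hypothetical logical column a trial function might return):

# generic pattern only, not metasim's internals
rerun_trials <- function(trial_fn, trials = 100) {
  purrr::map_dfr(seq_len(trials), ~ trial_fn()) %>%
    dplyr::summarise(coverage = mean(covered))
}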

For the full set of simulations, metasims runs metasim over each row of the dataframe.

sims <- metasims(trials = 100,
                 trial_fn = metatrial,
                 probar = FALSE) %>%
  filter(measure == "lr")

sims %>% 
  select(id, k, rdist, coverage, ci_width, bias) %>% 
  head() %>% 
  output_table()
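The row-wise mapping can be pictured with purrr::pmap_dfr, which calls a function once per row, matching columns to argument names (run_one is a hypothetical stand-in for metasim):

# illustrative only: map a function over the rows of a parameter grid
params <- tibble::tibble(k = c(3, 7), tau_sq = c(0, 0.1))
run_one <- function(k, tau_sq) {
  # hypothetical trial wrapper returning one summary row
  tibble::tibble(k = k, tau_sq = tau_sq, coverage = runif(1))
}
purrr::pmap_dfr(params, run_one)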

\columnbreak

# plot
sims %>%
  ggplot(aes(x = rdist, y = coverage)) +
  geom_point(aes(colour = rdist), alpha = 0.4, position = "jitter") +
  facet_grid(k ~ tau2_true) +
  hrbrthemes::scale_colour_ipsum() +
  theme(
    axis.text.x = element_text(angle = 35, hjust = 1),
    legend.position = "none",
    plot.caption = element_text(hjust = 0)
  ) +
  labs(x = "Distribution", y = "Coverage")
ggsave("eshplot.png") 

knitr::include_graphics("eshplot.png")

\vfill\null

This poster was created with posterdown [@thorne_posterdown:_2019].

\small\printbibliography


