In aosmith16/simpow: Simulation-based power analysis

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

simpow: Simulation-based power analysis

The goal of this package is to provide helper functions for simple power analyses based on linear and linear mixed models (LM's and LMM's). Support may be added for generalized linear mixed models (GLMM's) in the future

Installation

You can install the simpow from GitHub:

devtools::install_github("aosmith16/simpow")

Simulated power analysis for basic LMM

The primary function in simpow at the moment is called lmm_f(), which uses nlme::lme() to fit models under the hood.

Use this function to run a simulation-based power analysis of a LMM with a single random effect ("block") and a single, categorical fixed effect called "treatment".

By default there are two groups for the treatment variable (ntrt = 2). You need to provide the "true" means for those groups as well as a block and observation-level (residual) standard deviation.

Here is the power to detect that at least one of the treatment means is different based on 100 simulations using an alpha of 0.05. I set the treatment means to 1 and 4 and standard deviations of 2 and 4 for the block random effect and residual error term.

You will generally want more than 100 simulations. I use that here to save time. I set the seed to make these results reproducible but most often you will not set the seed when using functions from simpow.

library(simpow)
set.seed(16) 
lmm_f(nsim = 100,
      ntrt = 2,
      trtmeans = c(1, 4),
      nblock = 10,
      sd_block = 2,
      sd_resid = 4)

Add more treatment groups by changing ntrt away from the default.

set.seed(16) 
lmm_f(nsim = 100,
      ntrt = 3,
      trtmeans = c(1, 3, 4),
      nblock = 10,
      sd_block = 2,
      sd_resid = 4)

The default power analysis has a single replicate per treatment per block like you might have in many completely randomized block designs. This can be changed. Here there are 5 replicates for every treatment in every block for a total of 100 observations in each dataset.

set.seed(16) 
lmm_f(nsim = 100,
      ntrt = 2,
      trtmeans = c(1, 4),
      nblock = 10,
      nrep = 5,
      sd_block = 2,
      sd_resid = 4)

Nonconstant variance

If it makes sense to allow different variances per treatment group, use sd_eq = FALSE. You then must provide a separate standard deviation for every treatment group via sd_resid or you will get an error.

These models tend to fit more slowly than models assuming constant variance of the errors so run times will be longer.

set.seed(16) 
lmm_f(nsim = 100,
      ntrt = 2,
      trtmeans = c(1, 4),
      nblock = 10,
      nrep = 5,
      sd_block = 2,
      sd_resid = c(1, 4),
      sd_eq = FALSE)

Simulated datasets

If you want to create the simulate datasets instead of (or in addition to) doing the power analysis, use keep_data = TRUE. Note the use test = "none" to skip the power analysis all together, where the default test = "overall" does a power analysis on the overall F test of the fixed effect.

The dataset can then by extracted from the output object via data.

set.seed(16) 
results = lmm_f(nsim = 2,
                test = "none",
                ntrt = 2,
                trtmeans = c(1, 4),
                nblock = 5,
                sd_block = 2,
                sd_resid = 4,
                keep_data = TRUE)
results$data

Similarly you can keep fitted models via keep_models = TRUE and extracting via models. (Not shown.) Other values you can extract from the returned object is the vector of p-values (p.values).

Simulated power for two-factor LM

When using very few blocks it is not uncommon to treat the blocking variable as fixed instead of random. Instead of a basic LMM we'd want a two-factor linear model (LM) with no interaction. You can fit such a model in simpow using the lm_2f() function.

The main difference from lmm_f() is that you provide the true treatment-block means instead of a block standard deviation.

set.seed(16)
lm_2f(nsim = 10,
      allmeans = c(30, 40, 45, 55, 45, 55),
      ntrt = 2,
      nblock = 3,
      sd_resid = 4)

You can provide vectors for the treatment means and block means separately instead of providing the combined means. However, the overall means between the two vectors must be the same and it may be simpler to provide allmeans as above.

set.seed(16)
lm_2f(nsim = 10,
      ntrt = 2,
      trtmeans = c(40, 50),
      nblock = 3,
      blockmeans = c(35, 50, 50),
      sd_resid = 4)

Power when varying parameters or design elements

You can use the vary_element() to run a power analysis for different values of a parameter or study design element. This allows the user to explore how best to set up their study or what effect they can realistically expect to detect. Note only 1 element may be varied at a time.

However, using this function means doing an entire simulation multiple times and so ultimately this may be very slow when using 1000 simulations in each analysis. For this reason you may want to skip doing extremely fine-scale changes.

Here's an example of allowing for more blocks, holding all other arguments constant for the power analysis. I unrealistically do only 10 simulations to save running time.

set.seed(16)
vary_element(simfun = "lmm_f",
             tovary = "nblock",
             values = c(5, 15),
             nsim = 100,
             trtmeans = c(1, 2),
             sd_block = 2,
             sd_resid = 4)

You can vary values to arguments that already take vectors by passing a list. For example, if you want to vary the treatment effect pass a list of vectors.

set.seed(16)
vary_element(simfun = "lmm_f",
             tovary = "trtmeans",
             values = list(c(1, 4), c(1, 10)),
             nsim = 100,
             nblock = 5,
             sd_block = 2,
             sd_resid = 4)

Change the simfun to explore a different simulation function. For lm_2f() for limited-block-number designs this will probably be most useful for varying the number of replicates per treatment per block, residual standard deviation, or size of the effect of interest.

set.seed(16)
vary_element(simfun = "lm_2f",
             tovary = "nrep",
             values = c(1, 10),
             nsim = 100,
             allmeans = c(30, 40, 45, 55, 45, 55),
             sd_resid = 4)

Note that in the lm_wf() case above, trying to vary the block number would add a complexity var_element() can't handle because you would also need to provide different block means. This is not currently possible.

aosmith16/simpow documentation built on Dec. 19, 2021, 3:41 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Tweet to @rdrrHQ

GitHub issue tracker

ian@mutexlabs.com