knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
The goal of this package is to provide helper functions for simple power analyses based on linear and linear mixed models (LM's and LMM's). Support may be added for generalized linear mixed models (GLMM's) in the future
You can install the simpow from GitHub:
devtools::install_github("aosmith16/simpow")
The primary function in simpow at the moment is called lmm_f()
, which uses nlme::lme()
to fit models under the hood.
Use this function to run a simulation-based power analysis of a LMM with a single random effect ("block") and a single, categorical fixed effect called "treatment".
By default there are two groups for the treatment variable (ntrt = 2
). You need to provide the "true" means for those groups as well as a block and observation-level (residual) standard deviation.
Here is the power to detect that at least one of the treatment means is different based on 100 simulations using an alpha of 0.05. I set the treatment means to 1 and 4 and standard deviations of 2 and 4 for the block random effect and residual error term.
You will generally want more than 100 simulations. I use that here to save time. I set the seed to make these results reproducible but most often you will not set the seed when using functions from simpow.
library(simpow) set.seed(16) lmm_f(nsim = 100, ntrt = 2, trtmeans = c(1, 4), nblock = 10, sd_block = 2, sd_resid = 4)
Add more treatment groups by changing ntrt
away from the default.
set.seed(16) lmm_f(nsim = 100, ntrt = 3, trtmeans = c(1, 3, 4), nblock = 10, sd_block = 2, sd_resid = 4)
The default power analysis has a single replicate per treatment per block like you might have in many completely randomized block designs. This can be changed. Here there are 5 replicates for every treatment in every block for a total of 100 observations in each dataset.
set.seed(16) lmm_f(nsim = 100, ntrt = 2, trtmeans = c(1, 4), nblock = 10, nrep = 5, sd_block = 2, sd_resid = 4)
If it makes sense to allow different variances per treatment group, use sd_eq = FALSE
. You then must provide a separate standard deviation for every treatment group via sd_resid
or you will get an error.
These models tend to fit more slowly than models assuming constant variance of the errors so run times will be longer.
set.seed(16) lmm_f(nsim = 100, ntrt = 2, trtmeans = c(1, 4), nblock = 10, nrep = 5, sd_block = 2, sd_resid = c(1, 4), sd_eq = FALSE)
If you want to create the simulate datasets instead of (or in addition to) doing the power analysis, use keep_data = TRUE
. Note the use test = "none"
to skip the power analysis all together, where the default test = "overall"
does a power analysis on the overall F test of the fixed effect.
The dataset can then by extracted from the output object via data
.
set.seed(16) results = lmm_f(nsim = 2, test = "none", ntrt = 2, trtmeans = c(1, 4), nblock = 5, sd_block = 2, sd_resid = 4, keep_data = TRUE) results$data
Similarly you can keep fitted models via keep_models = TRUE
and extracting via models
. (Not shown.) Other values you can extract from the returned object is the vector of p-values (p.values
).
When using very few blocks it is not uncommon to treat the blocking variable as fixed instead of random. Instead of a basic LMM we'd want a two-factor linear model (LM) with no interaction. You can fit such a model in simpow using the lm_2f()
function.
The main difference from lmm_f()
is that you provide the true treatment-block means instead of a block standard deviation.
set.seed(16) lm_2f(nsim = 10, allmeans = c(30, 40, 45, 55, 45, 55), ntrt = 2, nblock = 3, sd_resid = 4)
You can provide vectors for the treatment means and block means separately instead of providing the combined means. However, the overall means between the two vectors must be the same and it may be simpler to provide allmeans
as above.
set.seed(16) lm_2f(nsim = 10, ntrt = 2, trtmeans = c(40, 50), nblock = 3, blockmeans = c(35, 50, 50), sd_resid = 4)
You can use the vary_element()
to run a power analysis for different values of a parameter or study design element. This allows the user to explore how best to set up their study or what effect they can realistically expect to detect. Note only 1 element may be varied at a time.
However, using this function means doing an entire simulation multiple times and so ultimately this may be very slow when using 1000 simulations in each analysis. For this reason you may want to skip doing extremely fine-scale changes.
Here's an example of allowing for more blocks, holding all other arguments constant for the power analysis. I unrealistically do only 10 simulations to save running time.
set.seed(16) vary_element(simfun = "lmm_f", tovary = "nblock", values = c(5, 15), nsim = 100, trtmeans = c(1, 2), sd_block = 2, sd_resid = 4)
You can vary values to arguments that already take vectors by passing a list. For example, if you want to vary the treatment effect pass a list of vectors.
set.seed(16) vary_element(simfun = "lmm_f", tovary = "trtmeans", values = list(c(1, 4), c(1, 10)), nsim = 100, nblock = 5, sd_block = 2, sd_resid = 4)
Change the simfun
to explore a different simulation function. For lm_2f()
for limited-block-number designs this will probably be most useful for varying the number of replicates per treatment per block, residual standard deviation, or size of the effect of interest.
set.seed(16) vary_element(simfun = "lm_2f", tovary = "nrep", values = c(1, 10), nsim = 100, allmeans = c(30, 40, 45, 55, 45, 55), sd_resid = 4)
Note that in the lm_wf()
case above, trying to vary the block number would add a complexity var_element()
can't handle because you would also need to provide different block means. This is not currently possible.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.