mps.driver: Driver function for Modular Power Simulations
In tobyjohnson/gtx: Genetics ToolboX


Driver function that calls modules (functions) for separate parts of a power simulation.

mps.driver(design, xFun, yFun, tFun, nrep = 1000,
           alpha = 0.05, sided = "two-sided", verbose = FALSE)
mps.driver1(design1, xFun, yFun, tFun, nrep = 1000)
mps.summary(zz, alpha = 0.05, sided = "two-sided")

`design`	A data frame of design points
`xFun`	A function to generate independent data, evaluated once for each design point
`yFun`	A function to generate dependent data, evaluated `nrep` times for each design point
`tFun`	A list of functions to test each realisation of the dependent data
`nrep`	The number of replicates for simulating dependent data for each design point
`alpha`	A vector of test significance levels
`sided`	A vector of test sidedness
`verbose`	Whether to print progress messages
`design1`	One row of a design matrix
`zz`	Output of mps.driver1

The idea of Modular Power Simulations is to provide a generic driver function mps.driver(), to which modules (other functions such as ssm.QT(), ssm.LM(), etc.) are passed to specify the simulation model and the test statistic(s) to be evaluated. The driver function handles rote looping over design points, and processing of the test statistics from replicate simulations to calculate power at different alpha levels and test sidednesses, bias, and other summaries. A modular framework is intended to facilitate code re-use, because individual module functions can be re-used or modified for different applications.

The practical use of mps.driver() is best illustrated by the examples and vignettes. This documentation concentrates on the technical specification.

The modularity is implemented by defining two separate functions that specify the data generation mechanism, and a further list of separate functions that specify test statistic(s). The data generation mechanism is separated into xFun and yFun. xFun will be evaluated only once for each design point, is typically deterministic, and may generate independent variables in regression-like applications. yFun will be evaluated many replicate times for each design point, is typically stochastic, and may generate dependent variables in regression-like applications. The return value from xFun is passed as an argument to yFun; this can be any data structure.

Parameter values for the simulation model may be specified for each design point, or may be specified globally. This is achieved by using free (unbound) variables inside the functions for individual modules. For example, in a function to simulate randomised treatment status:

simpleTreatment <- function() return(rbinom(sampleSize, 1, 0.5))

the variable sampleSize is a free variable. To use this function with mps.driver(), there must either be a column in the design matrix called “sampleSize”, or sampleSize must be defined in the global environment.
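The lookup rule can be tried without mps.driver() at all. The following minimal sketch uses only base R; `designEnv` and `f` are hypothetical names, and the environment stands in for a single design point rather than reproducing what the driver does internally.

```r
simpleTreatment <- function() return(rbinom(sampleSize, 1, 0.5))

# Case 1: sampleSize is defined globally (or wherever the function was created)
sampleSize <- 10
n1 <- length(simpleTreatment())

# Case 2: sampleSize is supplied by an environment standing in for a design point.
# list2env() makes a new environment whose parent is the calling frame,
# so rbinom() and other functions are still found.
designEnv <- list2env(list(sampleSize = 25))
f <- simpleTreatment
environment(f) <- designEnv
n2 <- length(f())

c(global = n1, designPoint = n2)
```

The value bound closest to the function's environment takes precedence, which suggests that a column in the design would take precedence over a global variable of the same name.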

To simulate using the parameter values at a particular design point, the driver function manipulates the environment of the simulation functions, essentially by doing:

design1 <- design[designIdx, , drop = FALSE]
with(as.list(design1), {
  environment(xFun) <- environment()
  environment(yFun) <- environment()
  x1 <- xFun()
  y1 <- yFun(x1)
  ...
})

to simulate one realisation (x1, y1) for the designIdx-th design point.
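A runnable version of this idea, as a sketch only: `toyX`, `toyY` and `simulateOne` are hypothetical names invented for illustration, and the real driver does more (replication, test statistics, summaries).

```r
# Hypothetical toy modules; sampleSize and effectSize are free variables
toyX <- function() rep(c(0, 1), length.out = sampleSize)
toyY <- function(x) effectSize * x + rnorm(length(x))

design <- expand.grid(sampleSize = c(50, 100), effectSize = c(0, 0.5))

# Simulate one realisation for one design point
simulateOne <- function(design, designIdx, xFun, yFun) {
  design1 <- design[designIdx, , drop = FALSE]
  with(as.list(design1), {
    # Rebind the functions so their free variables resolve to this design row
    environment(xFun) <- environment()
    environment(yFun) <- environment()
    x1 <- xFun()
    y1 <- yFun(x1)
    list(x = x1, y = y1)
  })
}

r <- simulateOne(design, 2, toyX, toyY)  # row 2 has sampleSize = 100
length(r$y)
```

Because the assignments happen inside with(), only local copies of the functions are rebound; the originals passed by the caller are left unchanged.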

The return values of xFun and yFun can be any data structures; they are passed as arguments to the function or functions that calculate the test statistics for which power calculations are required. The test statistic(s) are specified using tFun, which should be a named list of functions. Note that if tFun is a single function, it is wrapped in a list using

tFun <- list(test = tFun)

There is a very specific requirement that each element of the list tFun be a function whose return value can be coerced to a double vector of constant length, including at least an element named “pval”. It is strongly recommended to use or extend the generic function test.extract to achieve this, e.g. as in

tFun = list(test1 = function(x, y) return(test.extract(lm(phenotype ~ genotypeA, data = y))))

See ssm.LM, ssm.GLM, ssm.CoxPH for further examples.
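To make the return-value contract concrete, here is a hand-written test function that does not use test.extract. It is a sketch under assumptions: `tTestModule` is a hypothetical name, x is taken to be a 0/1 vector and y a numeric vector of the same length. Using test.extract remains the recommended route.

```r
# Hypothetical test module: Welch t-test comparing y between the two x groups.
# Returns a named double vector of fixed length 3 that includes "pval".
tTestModule <- function(x, y) {
  fit <- t.test(y[x == 1], y[x == 0])
  c(estimate  = unname(fit$estimate[1] - fit$estimate[2]),
    statistic = unname(fit$statistic),
    pval      = fit$p.value)
}

set.seed(1)
x <- rep(c(0, 1), times = 20)
y <- 0.5 * x + rnorm(length(x))
res <- tTestModule(x, y)
names(res)
```

The names and the length of the result are the same for every call, which is what lets results from replicate simulations be stacked and summarised.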

mps.driver() treats each row of the design argument as a separate design point, and generates nrep replicate simulations. Each test function is applied to the same set of replicate simulations, and power is evaluated at significance levels alpha and test sidednesses sided.
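As an illustration of what evaluating power at several significance levels amounts to, the sketch below estimates power as the proportion of replicate p-values at or below each alpha. It is not the mps.summary() implementation: it covers only two-sided p-values, and the simulated one-sample t-tests are a stand-in for real test modules.

```r
set.seed(1)

# Stand-in for nrep replicate simulations of one design point:
# two-sided one-sample t-tests with a true mean of 0.4
nrep <- 200
pvals <- replicate(nrep, t.test(rnorm(50, mean = 0.4))$p.value)

# Estimated power at each significance level
alpha <- c(0.05, 0.01)
power <- vapply(alpha, function(a) mean(pvals <= a), numeric(1))
names(power) <- alpha
power
```

Because the same p-values are reused for every alpha, the estimate at a stricter level can never exceed the estimate at a looser one.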

A data frame of results. There is one row for each combination of design point, test function, and significance/sidedness. That is, the return value will have nrow(design)*length(tFun)*length(alpha) rows.
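The row count can be checked ahead of a long run. This sketch applies the formula above to a 12-point design with one test function and two significance levels; the placeholder function in `tFun` is hypothetical and is never called.

```r
design <- expand.grid(sampleSize = c(1000, 2000),
                      alleleFrequency = c(0.1, 0.3, 0.5),
                      effectSize = c(0.1, 0.2),
                      dominanceCoeff = 0)
tFun <- list(lm = function(x, y) NULL)  # placeholder; only its length matters here
alpha <- c(0.05, 0.01)

# 12 design points * 1 test function * 2 significance levels
expectedRows <- nrow(design) * length(tFun) * length(alpha)
expectedRows  # 24
```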

Toby Johnson Toby.x.Johnson@gsk.com

design <- expand.grid(sampleSize = c(1000, 2000),
                      alleleFrequency = c(0.1, 0.3, 0.5),
                      effectSize = c(0.1, 0.2),
                      dominanceCoeff = 0)
mps.driver(design, ssm.null, ssm.QT, list(lm = ssm.LM),
           nrep = 100)
# run with larger nrep for better results