cps.binary: Power simulations for cluster-randomized trials: Parallel...
In clusterPower: Power Calculations for Cluster-Randomized and Cluster-Randomized Crossover Trials



This function uses Monte Carlo methods (simulations) to estimate power for cluster-randomized trials. Users can modify a variety of parameters to suit the simulations to their desired experimental situation.

Users must specify the desired number of simulations, number of subjects per cluster, number of clusters per arm, and two of the following three parameters: expected probability of the outcome in one group, expected probability of the outcome in the second group, and expected difference in probabilities between groups. Default values are provided for significance level, analytic method, progress updates, and whether the simulated data sets are retained.

cps.binary(
  nsim = NULL,
  nsubjects = NULL,
  nclusters = NULL,
  p1 = NULL,
  p2 = NULL,
  sigma_b_sq = NULL,
  sigma_b_sq2 = NULL,
  alpha = 0.05,
  method = "glmm",
  quiet = FALSE,
  allSimData = FALSE,
  seed = NA,
  nofit = FALSE,
  poorFitOverride = FALSE,
  lowPowerOverride = FALSE,
  timelimitOverride = TRUE,
  irgtt = FALSE
)

`nsim`	Number of datasets to simulate; accepts integer. Required.
`nsubjects`	Number of subjects per cluster; accepts either a scalar (implying equal cluster sizes for the two groups), a vector of length two (equal cluster sizes within arm), or a vector of length `sum(nclusters)` (unequal cluster sizes within arm). Required.
`nclusters`	Number of clusters per treatment group; accepts a single integer (if there are the same number of clusters in each arm) or a vector of 2 integers (if the number of clusters differs between arms). If a vector of cluster sizes longer than 2 is provided in `nsubjects`, `sum(nclusters)` must match the `nsubjects` vector length. Required.
`p1`	Expected probability of outcome in first group.
`p2`	Expected probability of outcome in second group.
`sigma_b_sq`	Between-cluster variance; if sigma_b_sq2 is not specified, between-cluster variances are assumed to be equal in the two arms. Accepts numeric. Required.
`sigma_b_sq2`	Between-cluster variance for clusters in second group. Only required if between-cluster variances differ between treatment arms.
`alpha`	Significance level; default = 0.05.
`method`	Data analysis method, either generalized linear mixed effects model (GLMM) or generalized estimating equations (GEE). Accepts c('glmm', 'gee'); default = 'glmm'. Required.
`quiet`	When set to FALSE, displays simulation progress and estimated completion time; default = FALSE.
`allSimData`	Option to output list of all simulated datasets; default = FALSE.
`seed`	Option to set the seed. Default is NA.
`nofit`	Option to skip model fitting and analysis and only return the simulated data. Default = `FALSE`.
`poorFitOverride`	Option to override `stop()` if more than 25% of fits fail to converge.
`lowPowerOverride`	Option to override `stop()` if the power is less than 0.5 after the first 50 simulations and every ten simulations thereafter. On function execution stop, the actual power is printed in the stop message. Default = FALSE. When TRUE, this check is ignored and the calculated power is returned regardless of value.
`timelimitOverride`	Logical. When FALSE, stops execution if the estimated completion time is more than 2 minutes. Defaults to TRUE.
`irgtt`	Logical. Default = FALSE. Is the experimental design an individually randomized group treatment trial? For details, see ?cps.irgtt.binary.
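
The three accepted shapes of `nsubjects` described above can be illustrated with a small base-R sketch. The helper `expand_sizes()` is hypothetical and is not part of clusterPower; it only shows how a scalar, a length-two vector, and a full-length vector might each be expanded to one size per cluster, under the assumption that arm 1 clusters are listed first.

```r
# Hypothetical helper (not a clusterPower function): expand nsubjects
# into one cluster size per cluster, grouped by arm.
expand_sizes <- function(nsubjects, nclusters) {
  # A single integer means the same number of clusters in each arm
  if (length(nclusters) == 1) nclusters <- rep(nclusters, 2)
  total <- sum(nclusters)

  if (length(nsubjects) == 1) {
    # Scalar: every cluster in both arms has the same size
    sizes <- rep(nsubjects, total)
  } else if (length(nsubjects) == 2) {
    # Length two: one common size per arm
    sizes <- rep(nsubjects, times = nclusters)
  } else if (length(nsubjects) == total) {
    # Full length: one size per cluster, arm 1 first (assumed ordering)
    sizes <- nsubjects
  } else {
    stop("length(nsubjects) must be 1, 2, or sum(nclusters)")
  }
  split(sizes, rep(c("arm1", "arm2"), times = nclusters))
}

equal_sizes   <- expand_sizes(20, 10)
by_arm_sizes  <- expand_sizes(c(15, 25), c(8, 12))
unequal_sizes <- expand_sizes(c(rep(10, 9), 100, rep(20, 10)), 10)
```

In the last call, `unequal_sizes$arm1` holds nine clusters of 10 and one of 100, while `unequal_sizes$arm2` holds ten clusters of 20.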

The data generating model for observation $j$ in cluster $i$ is:

$$y_{ij} \sim \text{Bernoulli}\left(\frac{e^{p_1 + b_i}}{1 + e^{p_1 + b_i}}\right)$$

for the first group or arm, where $b_i \sim N(0, \sigma_b^2)$, while for the second group,

$$y_{ij} \sim \text{Bernoulli}\left(\frac{e^{p_2 + b_i}}{1 + e^{p_2 + b_i}}\right)$$

where $b_i \sim N(0, \sigma_{b_2}^2)$; if $\sigma_{b_2}^2$ is not used, then the second group uses $b_i \sim N(0, \sigma_b^2)$.

All random terms are generated independent of one another.
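
A minimal base-R sketch of this data generating model follows. The function `simulate_arm()` is hypothetical and is not the package's internal code. It assumes that the fixed term in the exponent is on the log-odds scale, so the supplied probability is converted with `qlogis()` before the cluster effect is added; that conversion is an assumption of the sketch, not something stated in the model description.

```r
# Hypothetical helper (not a clusterPower function): simulate one arm
# under a random-intercept logistic model.
simulate_arm <- function(nclusters, nsubjects, p, sigma_b_sq) {
  # One random intercept per cluster, b_i ~ N(0, sigma_b_sq)
  b <- rnorm(nclusters, mean = 0, sd = sqrt(sigma_b_sq))
  # Assumption: the fixed term is the log-odds of the outcome probability
  eta <- qlogis(p) + b
  # Inverse logit, e^eta / (1 + e^eta), gives each cluster's probability
  prob <- plogis(eta)
  clust <- rep(seq_len(nclusters), each = nsubjects)
  # Independent Bernoulli draws for every subject in every cluster
  y <- rbinom(nclusters * nsubjects, size = 1, prob = prob[clust])
  data.frame(clust = clust, y = y)
}

set.seed(123)
arm1 <- simulate_arm(10, 20, p = 0.8, sigma_b_sq = 1)
arm2 <- simulate_arm(10, 20, p = 0.5, sigma_b_sq = 1.2)
arm2$clust <- arm2$clust + 10  # keep cluster labels unique across arms
sim <- rbind(data.frame(trt = 0, arm1), data.frame(trt = 1, arm2))
```

The resulting `sim` data frame has the same three columns ("y", "trt", "clust") that the Value section lists for each simulated data set.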

Non-convergent models are not included in the calculation of exact confidence intervals.
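
The exclusion of non-convergent fits can be sketched as follows. The data frame `est` is a made-up stand-in for the per-simulation results table, using the "p.value" and "converge" column names from the Value section. The use of `binom.test()` for the exact (Clopper-Pearson) interval is an assumption about how such an interval could be computed, not a statement of the package's internals.

```r
# Made-up per-simulation results; column names follow the Value section
est <- data.frame(
  p.value  = c(0.001, 0.20, 0.03, 0.04, 0.60, 0.01),
  converge = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE)
)
alpha <- 0.05

# Drop non-convergent fits before estimating power
ok <- est[est$converge, ]
n_sig <- sum(ok$p.value < alpha)

# Power is the share of convergent fits that reject the null
power <- n_sig / nrow(ok)

# Exact binomial confidence interval for that proportion
ci <- binom.test(n_sig, nrow(ok), conf.level = 0.95)$conf.int

result <- data.frame(
  Power = power,
  lower.95.ci = ci[1],
  upper.95.ci = ci[2],
  Beta = 1 - power
)
```

Here the fourth simulation is ignored even though its p-value is below `alpha`, because its model did not converge.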

If nofit = F, a list with the following components:

Character string indicating total number of simulations, simulation type, and number of convergent models
Number of simulations
Data frame with columns "Power" (estimated statistical power), "lower.95.ci" (lower 95% confidence interval bound), "upper.95.ci" (upper 95% confidence interval bound), "Alpha" (probability of committing a Type I error or rejecting a true null), "Beta" (probability of committing a Type II error or failing to reject a false null). Note that non-convergent models are returned for review, but not included in this calculation.
Analytic method used for power estimation
Significance level
Vector containing user-defined cluster sizes
Vector containing user-defined number of clusters
Data frame reporting sigma_b_sq for each group
Vector containing user-supplied outcome probability and estimated odds ratio
Data frame containing three estimates of ICC
Data frame with columns: "Estimate" (Estimate of treatment effect for a given simulation), "Std.err" (Standard error for treatment effect estimate), "Test.statistic" (z-value (for GLMM) or Wald statistic (for GEE)), "p.value", "converge" (Did simulated model converge?)
If allSimData = TRUE, list of data frames, each containing: "y" (Simulated response value), "trt" (Indicator for treatment group), "clust" (Indicator for cluster)
List of warning messages produced by non-convergent models; includes model number for cross-referencing against model.estimates
Logical vector reporting whether models converged.

If nofit = T, a data frame of the simulated data sets, containing:

"arm" (Indicator for treatment arm)
"cluster" (Indicator for cluster)
"y1" ... "yn" (Simulated response value for each of the nsim data sets).

This function has been verified against reference values from the NIH's GRT Sample Size Calculator, PASS11, CRTsize::n4prop, and clusterPower::cpa.binary.

Alexander R. Bogdan, Alexandria C. Sakrejda (acbro0@umass.edu), and Ken Kleinman (ken.kleinman@gmail.com)

Eldridge, S., Ukoumunne, O. & Carlin, J. The Intra-Cluster Correlation Coefficient in Cluster Randomized Trials: A Review of Definitions. International Statistical Review (2009), 77, 3, 378-394. doi: 10.1111/j.1751-5823.2009.00092.x

Snijders, T. & Bosker, R. Multilevel Analysis: an Introduction to Basic and Advanced Multilevel Modelling. London, 1999: Sage.

Wu S, Crespi CM, Wong WK. Comparison of Methods for Estimating Intraclass Correlation Coefficient for Binary Responses in Cancer Prevention Cluster Randomized Trials. Contemp Clin Trials. 2012; 33(5): 869-880. doi:10.1016/j.cct.2012.05.004

An intracluster correlation coefficient (ICC) for binary outcome data is neither a natural parameter of the data generating model nor a function of its parameters. Several methods for calculation have been suggested (Wu, Crespi, and Wong, 2012). We provide several versions of ICCs for comparison. These can be accessed in the bincalcICC() function.
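
As one illustration of why several versions exist, a commonly cited definition places the ICC on the latent logistic scale rather than on the observed 0/1 scale, using the fixed logistic residual variance of pi^2/3. The sketch below computes that quantity in base R. The function name `icc_latent()` is hypothetical, and whether this matches any of the versions the package reports is an assumption.

```r
# Hypothetical helper (not a clusterPower function): latent-scale ICC for
# a random-intercept logistic model.
icc_latent <- function(sigma_b_sq) {
  # pi^2 / 3 is the variance of the standard logistic distribution
  sigma_b_sq / (sigma_b_sq + pi^2 / 3)
}

# Between-cluster variances used in the examples on this page
icc_arm1 <- icc_latent(1)
icc_arm2 <- icc_latent(1.2)
```

This version depends only on the between-cluster variance, so it does not change with the outcome probabilities `p1` and `p2`, whereas ICCs estimated on the observed scale generally do.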

# Estimate power for a trial with 10 clusters in each arm, 20 subjects in
# each cluster, with a probability of 0.8 in the first arm and 0.5 in the
# second arm, with sigma_b_sq = 1 in the first arm and sigma_b_sq2 = 1.2 in
# the second arm.

## Not run: 
binary.sim = cps.binary(nsim = 100, nsubjects = 20,
  nclusters = 10, p1 = 0.8,
  p2 = 0.5, sigma_b_sq = 1,
  sigma_b_sq2 = 1.2, alpha = 0.05,
  method = 'glmm', allSimData = FALSE)

## End(Not run)

# Estimate power for a trial just as above, except that in the first arm,
# the clusters have 10 subjects in 9 of the 10 clusters and 100 in the tenth
# cluster, while in the second arm all clusters have 20 subjects.

## Not run: 
binary.sim2 = cps.binary(nsim = 100,
  nsubjects = c(c(rep(10,9),100), rep(20,10)),
  nclusters = 10, p1 = 0.8,
  p2 = 0.5, sigma_b_sq = 1,
  sigma_b_sq2 = 1.2, alpha = 0.05,
  method = 'gee', allSimData = FALSE)

## End(Not run)