This function uses Monte Carlo methods (simulations) to estimate power for cluster-randomized trials for integer-valued outcomes with two or more trial conditions. Users can modify a variety of parameters to suit the simulations to their desired experimental situation.
Users must specify the desired number of simulations, number of subjects per cluster, number of clusters per treatment arm, between-cluster variance, and two of the following three parameters: mean event rate per unit time in one group, the mean event rate per unit time in the second group, and/or the mean difference in event rates between groups. Default values are provided for significance level, analytic method, progress updates, and whether the simulated data sets are retained.
Note that if all units have the same observation time, you can use the mean count instead of the "mean event per unit time" in the preceding paragraph.
Users must specify the desired number of simulations, number of subjects per cluster, number of clusters per treatment arm, group probabilities, and the between-cluster variance. Significance level, analytic method, whether progress updates are displayed, poor/singular fit override, and whether or not to return the simulated data may also be specified.
This user-friendly function calls an internal function; the internal function can be called directly by the user to return the fitted models rather than the power summaries (see ?cps.ma.count.internal for details).
Users can spread the simulated data generation and model fitting tasks across multiple cores using the cores argument. For more complicated models, parallel computing may make model fitting faster than using a single core. For simpler models, users may prefer single-thread computing (cores = 1), because allocating memory and copying data across cores also takes time. To save time, this function stops execution early if estimated power < 0.5 or more than 25% of models produce a singular fit or non-convergence warning message, unless poorFitOverride = TRUE.
cps.ma.count(
nsim = 1000,
nsubjects = NULL,
narms = NULL,
nclusters = NULL,
counts = NULL,
family = "poisson",
analysis = "poisson",
negBinomSize = 1,
sigma_b_sq = NULL,
alpha = 0.05,
quiet = FALSE,
method = "glmm",
multi_p_method = "bonferroni",
allSimData = FALSE,
seed = NA,
cores = NA,
tdist = FALSE,
poorFitOverride = FALSE,
lowPowerOverride = FALSE,
timelimitOverride = TRUE,
return.all.models = FALSE,
nofit = FALSE,
opt = "NLOPT_LN_BOBYQA"
)
nsim |
Number of datasets to simulate; accepts integer. Required. |
nsubjects |
Number of subjects per cluster; accepts an
integer (implying equal cluster sizes in all arms) if |
narms |
Number of trial arms; accepts integer. Required. |
nclusters |
Number of clusters per treatment group; accepts a single integer
(if there are the same number of clusters in each arm) or a vector of integers
representing the number of clusters in each arm (if the number of clusters differs between arms).
If a list of vectors of cluster sizes is provided in |
counts |
Mean event per unit time for each arm; accepts a scalar
(if all arms have the same event rate) or
a vector of length |
family |
Distribution from which responses are simulated. Accepts Poisson
( |
analysis |
Family used for data analysis; currently only applicable when |
negBinomSize |
Used only when simulating data from the negative binomial distribution (family = 'neg.binom'). This is the target number of successful trials, or equivalently the dispersion parameter (the shape parameter of the gamma mixing distribution). Must be positive; defaults to 1. Required when family = 'neg.binom'. |
sigma_b_sq |
Between-cluster variance for each arm; accepts a scalar
(if all arms have the same between-cluster variance) or a vector of length
|
alpha |
The level of significance of the test, the probability of a Type I error. Default = 0.05. |
quiet |
When set to |
method |
Data analysis method, either generalized linear mixed effects model
(GLMM) or generalized estimating equations (GEE). Accepts |
multi_p_method |
A string indicating the method to use for adjusting
p-values for multiple comparisons. Choose one of "holm", "hochberg",
"hommel", "bonferroni", "BH", "BY", "fdr", "none". The default is
"bonferroni". See |
allSimData |
Option to include a list of all simulated datasets in the output object.
Default = |
seed |
Option to set the seed. Default is NA. |
cores |
Number of cores to be used for parallel computing. Accepts a string ("all"), NA (no parallel computing), or scalar value indicating the number of CPUs to use. Default = NA. |
tdist |
Logical value indicating whether cluster-level random effects
should be drawn from a t distribution rather than a normal distribution.
Default = |
poorFitOverride |
Option to override |
lowPowerOverride |
Option to override |
timelimitOverride |
Logical. When FALSE, stops execution if the estimated completion time is more than 2 minutes. Defaults to TRUE. |
return.all.models |
Logical; Returns all of the fitted models, the simulated data, the overall model comparisons, and the convergence report vector. This is equivalent to the output of cps.ma.count.internal(). See ?cps.ma.count.internal() for details. |
nofit |
Option to skip model fitting and analysis and return the simulated data.
Defaults to |
opt |
Optimizer for model fitting, from the package |
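As a rough illustration of what the multi_p_method argument controls, the sketch below applies stats::p.adjust to a made-up vector of per-arm p-values. The p-values and arm labels are hypothetical, and how cps.ma.count applies the adjustment internally is not shown here.

```r
# Hypothetical unadjusted p-values for three treatment-arm comparisons.
p_raw <- c(arm2 = 0.01, arm3 = 0.02, arm4 = 0.04)

# Bonferroni multiplies each p-value by the number of comparisons (capped at 1).
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Holm is a step-down alternative that is never more conservative than Bonferroni.
p_holm <- p.adjust(p_raw, method = "holm")

print(rbind(raw = p_raw, bonferroni = p_bonf, holm = p_holm))
```

Any of the method names listed for multi_p_method can be passed to p.adjust in the same way.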
If family = 'poisson', the data generating model is:
$$y_{ijk} \sim \text{Poisson}\left(e^{c_k + b_{jk}}\right)$$
for observation $i$, in cluster $j$, in treatment arm $k$, where $b_{jk} \sim N(0, \sigma^2_{b_k})$.
If family = 'neg.binom', the data generating model, using the alternative parameterization of the negative binomial distribution detailed in stats::rnbinom, is:
$$y_{ijk} \sim \text{NB}\left(\mu = e^{c_k + b_{jk}},\ \text{size} = 1\right)$$
for observation $i$, in cluster $j$, in treatment arm $k$, where $b_{jk} \sim N(0, \sigma^2_{b_k})$.
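The two data generating models can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: the helper simulate_arm and the input values are hypothetical, and it assumes that c_k is the log of the arm-level mean event rate.

```r
set.seed(123)

# Illustrative inputs (not package defaults): 3 arms, 4 clusters per arm,
# 50 subjects per cluster.
counts     <- c(10, 55, 65)   # mean event rate per arm
sigma_b_sq <- c(1, 1, 2)      # between-cluster variance per arm
nclusters  <- 4
nsubjects  <- 50

# Assumption: c_k is the log of the arm-level mean rate.
c_k <- log(counts)

# Hypothetical helper: simulate one arm under either model.
simulate_arm <- function(k, family = "poisson", size = 1) {
  # One random intercept b_jk per cluster, drawn from N(0, sigma_b_sq[k]).
  b_jk <- rnorm(nclusters, mean = 0, sd = sqrt(sigma_b_sq[k]))
  # Expand the cluster-level mean to one value per subject.
  mu <- rep(exp(c_k[k] + b_jk), each = nsubjects)
  y <- if (family == "poisson") {
    rpois(length(mu), lambda = mu)
  } else {
    # mu/size parameterization described in ?stats::rnbinom
    rnbinom(length(mu), mu = mu, size = size)
  }
  # Cluster labels restart at 1 within each arm in this sketch.
  data.frame(y = y, trt = k, clust = rep(seq_len(nclusters), each = nsubjects))
}

sim_pois <- do.call(rbind, lapply(seq_along(counts), simulate_arm))
sim_nb   <- do.call(rbind, lapply(seq_along(counts), simulate_arm,
                                  family = "neg.binom", size = 1))
head(sim_pois)
```

The only difference between the two branches is the conditional distribution of the response given the cluster-level mean; the random intercepts are drawn the same way in both.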
Non-convergent models are not included in the calculation of exact confidence intervals.
For complicated models, we recommend using parallel processing with the cores = "all" argument. For simpler models, users may prefer to use single thread computing (cores = 1), as the processes involved in allocating memory and copying data across cores also may take some time.
By default, this function stops execution early if estimated power < 0.5 or if more than 25% of models produce a singular fit or non-convergence warning. In some cases, users may want to ignore singularity warnings (see ?isSingular) by setting poorFitOverride = TRUE.
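To make the power summary concrete, here is a minimal sketch of estimating power with an exact confidence interval after dropping non-convergent fits. The p-values and convergence flags are made up, and the sketch assumes that "exact" refers to the Clopper-Pearson interval returned by stats::binom.test; the package may compute its interval differently.

```r
set.seed(123)

alpha <- 0.05
nsim  <- 200

# Made-up per-simulation results standing in for real model fits.
p_values  <- rbeta(nsim, shape1 = 0.2, shape2 = 2)  # hypothetical p-values
converged <- runif(nsim) > 0.1                      # hypothetical convergence flags

# Drop non-convergent fits before estimating power.
p_ok <- p_values[converged]

# Power is the share of retained simulations that reject at level alpha.
n_sig <- sum(p_ok < alpha)
n_ok  <- length(p_ok)

# binom.test() returns the exact (Clopper-Pearson) interval for a proportion.
ci <- binom.test(n_sig, n_ok)$conf.int

power_summary <- data.frame(power       = n_sig / n_ok,
                            lower.95.ci = ci[1],
                            upper.95.ci = ci[2])
print(power_summary)
```

The column names mirror the "power", "lower.95.ci", and "upper.95.ci" columns described for the returned power data frame.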
A list with the following components:
Data frame with columns "power" (Estimated statistical power), "lower.95.ci" (Lower 95% confidence interval bound), "upper.95.ci" (Upper 95% confidence interval bound).
Data frame with columns corresponding to each arm with descriptive suffixes as follows: ".Estimate" (Estimate of treatment effect for a given simulation), "Std.Err" (Standard error for treatment effect estimate), ".zval" (for GLMM) | ".wald" (for GEE), and ".pval" (the p-value estimate).
Table of F-test (when method="glmm") or chi^2 (when method="gee") significance test results.
Summary overall power of treatment model compared to the null model.
Produced when allSimData = TRUE. List of nsim data frames, each containing: "y" (simulated response value), "trt" (indicator for treatment group or arm), and "clust" (indicator for cluster).
Character string containing the percent of nsim in which the glmm fit was singular or failed to converge, produced only when method = "glmm" & allSimData = FALSE.
Vector of length nsim denoting whether or not a simulation glmm fit triggered a "singular fit" or "non-convergence" error, produced only when method = "glmm" & allSimData = TRUE.
If nofit = T, a data frame of the simulated data sets, containing: "arm" (indicator for treatment arm), "cluster" (indicator for cluster), and "y1" ... "yn" (simulated response value for each of the nsim data sets).
Alexandria C. Sakrejda (acbro0@umass.edu)
Alexander R. Bogdan
Ken Kleinman (ken.kleinman@gmail.com)
# For a 3-arm trial with 4, 4, and 5 clusters in each arm, respectively,
# specify the number of subjects in each cluster with 3 vectors in a list,
# each vector representing a study arm. For each cluster, in no particular
# order, denote the number of subjects. In this example, the first arm
# contains 150, 200, 50, and 100 subjects in each of the 4 clusters. The second
# arm contains 50, 150, 210, and 100 subjects in each of 4 clusters, while
# the third arm contains 70, 200, 150, 50, and 100 subjects in each of 5
# clusters. The expected outcomes for each arm are 10, 55, and 65, and
# the sigma_b_sq values are 1, 1, and 2, respectively. Assuming
# seed = 123, the overall power for this trial should be 0.81.
## Not run:
nsubjects.example <- list(c(150, 200, 50, 100), c(50, 150, 210, 100),
                          c(70, 200, 150, 50, 100))
counts.example <- c(10, 55, 65)
sigma_b_sq.example <- c(1, 1, 2)
count.ma.rct.unbal <- cps.ma.count(nsim = 100,
                                   narms = 3,
                                   nsubjects = nsubjects.example,
                                   counts = counts.example,
                                   sigma_b_sq = sigma_b_sq.example,
                                   alpha = 0.05, seed = 123)
## End(Not run)
# For a different trial with 4 arms, each arm has 25 clusters which
# each contain 100 subjects. Expected counts for each arm are 30
# for the first arm, 35 for the second, 70 for the third, and 40
# for the fourth. Similarly, the sigma_b_sq values for each arm are 1
# for the first arm, 1.2 for the second, 1 for the third, and 0.9
# for the fourth. Assuming seed = 123, the overall power for this
# trial should be 0.84.
## Not run:
count.ma.rct.bal <- cps.ma.count(nsim = 10, nsubjects = 100, narms = 4,
                                 nclusters = 25, counts = c(30, 35, 70, 40),
                                 sigma_b_sq = c(1, 1.2, 1, 0.9), seed = 123)
## End(Not run)