| Design | R Documentation |
A GLMM Design
A GLMM Design
An R6 class representing a GLMM and study design
For the generalised linear mixed model
Y \sim F(μ,σ)
μ = h^-1(Xβ + Zγ)
γ \sim MVN(0,D)
where h is the link function. A Design in comprised of a MeanFunction object, which defines the family F, link function h, and fixed effects design matrix X, and a Covariance object, which defines Z and D. The class provides methods for analysis and simulation with these models.
This class provides methods for: data simulation (sim_data() and fitted()), model fitting using Markov Chain
Monte Carlo Maximum Likelihood (MCML) methods (MCML()), design analysis via simulation including power (analysis()),
deletion diagnostics (dfbeta()), and permutation tests including p-values and confidence intervals (permutation()).
The class by default calculates the covariance matrix of the observations as:
Σ = W^{-1} + ZDZ^T
where W is a diagonal matrix with the WLS iterated weights for each observation equal to, for individual i φ a_i v(μ_i)[h'(μ_i)]^2 (see Table 2.1 in McCullagh and Nelder (1989) ISBN:9780412317606). For very large designs, this can be disabled as the memory requirements can be prohibitive.
Calls the respective print methods of the linked covariance and mean function objects.
The matrices X and Z both have n rows, where n is the number of observations in the model/design.
Number of clusters Returns a data frame describing the number of independent clusters or groups at each level in the design. For example, if there were cluster-periods nested in clusters, then the top level would be clusters, and the second level would be cluster periods.
analysis
The analysis function conducts a detailed design analysis using the analysis
model specified by the object. Data are simulated either using the same
data generating process, or using a different Design object specified by
the user to allow for model misspecification. On each iteration the model
is estimated with the simulated data y using either the MCML function or
approximate parameters and standard errors using generalised least squares.
MCML is an exact maximum likelihood algorithm, and can be slow, so results
from previous simulations are saved in the design object and can be recalled
later. Deletion diagnostics are also calculated to calculate influential parts of
the design, although these are typically not useful for balanced
experimental designs.
The function returns an glmmr.sim object, which estimates and summarises:
Model fitting and simulation diagnostics Convergence failure percent and coverage. Maximum likelihood estimators for GLMMs can fail to converge or reach the MLE. GLMMs can also be susceptible to small sample biases where the relevant sample size is at the level of clustering and correlation.
Error rates Type 2 (power), Type S (significance), and Type M (magnitude) errors are reported.
p-value and confidence interval width distributions
Deletion diagnostics For unbalanced designs and under model misspecifications, certain parts of the design may have more influence than others over the estimate of interest, or have a larger than desired effect. A summary of the DFBETA diagnostic is provided.
Approximate power Calculates the approximate power of the design using the square root of the relevant element of the GLS variance matrix:
(X^TΣ^{-1}X)^{-1}
Note that this is equivalent to using the "design effect" for many models.
MCMCML Fits generalised linear mixed models using one of three algorithms: Markov Chain Newton Raphson (MCNR), Markov Chain Expectation Maximisation (MCEM), or Maximum simulated likelihood (MSL). All the algorithms are described by McCullagh (1997). For each iteration of the algorithm the unobserved random effect terms (γ) are simulated using Markov Chain Monte Carlo (MCMC) methods (we use Hamiltonian Monte Carlo through Stan), and then these values are conditioned on in the subsequent steps to estimate the covariance parameters and the mean function parameters (β). For all the algorithms, the covariance parameter estimates are updated using an expectation maximisation step. For the mean function parameters you can either use a Newton Raphson step (MCNR) or an expectation maximisation step (MCEM). A simulated likelihood step can be added at the end of either MCNR or MCEM, which uses an importance sampling technique to refine the parameter estimates.
The accuracy of the algorithm depends on the user specified tolerance. For higher levels of tolerance, larger numbers of MCMC samples are likely need to sufficiently reduce Monte Carlo error.
The function also offers different methods of obtaining standard errors. First, one can generate
estimates from the estimated Hessian matrix (se.method = 'lik'). Second, there are robust standard
errors using a sandwich estimator based on White (1982) (se.method = 'robust').
Third, there are use approximate generalised least squares estimates based on the maximum likelihood
estimates of the covariance
parameters (se.method = 'approx'), or use a permutation test approach (se.method = 'perm').
Note that the permutation test can be accessed separately with the function permutation_test().
There are several options that can be specified to the function using the options argument.
The options should be provided as a list, e.g. options = list(method="mcnr"). The possible options are:
b_se_only TRUE (calculate standard errors of the mean function parameters only) or FALSE (calculate
all standard errors), default it FALSE.
use_cmdstanr TRUE (uses cmdstanr for the MCMC sampling, requires cmdstanr), or FALSE (uses rstan). Default is FALSE.
skip_cov_optim TRUE (skips the covariance parameter estimation step, and uses the values covariance$parameters), or
FALSE (run the whole algorithm)], default is FALSE
sim_lik_step TRUE (conduct a simulated likelihood step at the end of the algorithm), or FALSE (does
not do this step), defaults to FALSE.
no_warnings TRUE (do not report any warnings) or FALSE (report warnings), default to FALSE
perm_type Either cov (use weighted test statistic in permutation test) or unw (use unweighted
test statistic), defaults to cov. See permutation_test().
perm_iter Number of iterations for the permutation test, default is 100.
perm_parallel TRUE (run permuation test in parallel) or FALSE (runs on a single thread), default to TRUE
warmup_iter Number of warmup iterations on each iteration for the MCMC sampler, default is 500
perm_ci_steps Number of steps for the confidence interval search procedure if using the permutation
test, default is 1000. See permutation_test().
fd_tol The tolerance of the first difference method to estimate the Hessian and Gradient, default
is 1e-4.
Permutation tests
If the user provided a re-randomisation function to the linked mean function object (see MeanFunction),
then a permuation test can be conducted. A new random assignment is generated on each iteration of the permutation test.
The test statistic can be either a quasi-score statistic, weighting the observations using the covariance matrix (type="cov"),
or an unweighted statistic that weights each observation in each cluster equally (type="unw"). The 1-alpha%
confidence interval limits are estimated using an efficient iterative stochastic search procedure. On each step of the algorithm
a single permuation and test statistic is generated, and the current estimate of the confidence interval limit either
increased or decreased depedning on its value. The procedure converges in probability to the true limits, see Watson et al (2021)
and Garthwaite (1996).
MCMC
Draws samples from the posterior distribution of the model parameters using Stan. Priors are specified using the priors argument.
Currently, only Gaussian (or half-Gaussian for covariance parameters) prior distributions are supported. The argument priors
accepts a list with three or four elements, prior_b_mean, prior_b_sd which are vectors specifying the prior mean and
standard deviation for the mean function parameters β in the model; prior_g_sd specifying the prior standard deviation
for the half-Gaussian prior for the covariance parameters, and optionally sigma_sd for the half-Gaussian prior for the scale
terms in models that have a scale parameter (Gaussian and Gamma currently). By default the function uses rstan, however the function
can optionally call cmdstanr instead if it is installed, using the option use_cmdstanr=TRUE. For further details on the use and
arguments that can be used, see sampling.
Bayesian design analysis
The Bayesian design analysis conducts multiple iterations of simulating data from a GLMM and then fitting a GLMM to analyse
the properties of the posterior distribution and model calibration to inform study design. Data are either simulated from the same
model as the analysis model, i.e. there is correct model specification (nothing is specified for the option sim_design), or data
are simulated from a different model (a Design object passed to the sim_design argument of this function) and then fit using
the Design object from which this function was called, i.e. with model misspecification. The function analyses four related statistics
summarising the design. First, the expected posterior variance; second, the variance of the posterior variance; third, the probability the
parameter of interest will be greater than some threshold; and fourth, simulation-based calibration (see Talts et al, 2021).
Model fitting is done using Stan's sampler (see sampling and the MCMC() function in this class).
covarianceA Covariance object defining the random effects covariance.
mean_functionA MeanFunction object, defining the mean function for the model, including the data and covariate design matrix X.
exp_conditionA vector indicting the unique experimental conditions for each observation, see Details.
SigmaThe overall covariance matrix for the observations. Calculated and updated automatically as W^{-1} + ZDZ^T where W is an n x n diagonal matrix with elements on the diagonal equal to the GLM iterated weights. See Details.
var_parScale parameter required for some distributions (Gaussian, Gamma, Beta).
fitted()Return predicted values based on the currently stored parameter values in mean_function
Design$fitted(type = "link")
typeOne of either "link" for values on the scale of the link function, or "response"
for values on the scale of the response
A Matrix class object containing the predicted values
new()Create a new Design object
Design$new( covariance, mean.function, var_par = NULL, verbose = TRUE, skip.sigma = FALSE )
covarianceEither a Covariance object, or an equivalent list of arguments
that can be passed to Covariance to create a new object.
mean.functionEither a MeanFunction object, or an equivalent list of arguments
that can be passed to MeanFunction to create a new object.
var_parScale parameter required for some distributions, including Gaussian. Default is NULL.
verboseLogical indicating whether to provide detailed output
skip.sigmaLogical indicating whether to skip the creating of the covariance matrix Sigma. For very large designs with thousands of observations or more, the covariance matrix will be too big to fit in memory, so this option will prevent sigma being created.
A new Design class object
#create a data frame describing a cross-sectional parallel cluster
#randomised trial
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
mf1 <- MeanFunction$new(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = gaussian()
)
cov1 <- Covariance$new(
data = df,
formula = ~ (1|gr(cl)) + (1|gr(cl*t)),
parameters = c(0.25,0.1)
)
des <- Design$new(
covariance = cov1,
mean.function = mf1,
var_par = 1
)
#alternatively we can pass the data directly to Design
#here we will specify a cohort study
df <- nelder(~ind(20) > t(6))
df$int <- 0
df[df$t > 3, 'int'] <- 1
des <- Design$new(
covariance = list(
data=df,
formula = ~ (1|ar1(t)*gr(ind)),
parameters = c(0.8,1)),
mean.function = list(
formula = ~int + factor(t),
data=df,
parameters = rep(0,7),
family = poisson()))
#an example of a spatial grid with two time points
df <- nelder(~ (x(10)*y(10))*t(2))
spt_design <- Design$new(covariance = list(data=df,
formula = ~(1|fexp(x,y)*ar1(t)),
parameters =c(0.2,0.1,0.8)),
mean.function = list(data=df,
formula = ~ 1,
parameters = c(0.5),
family=poisson()))
print()Print method for Design class
Design$print()
...ignored
n()Returns the number of observations in the model
Design$n(...)
...ignored
n_cluster()Returns the number of clusters at each level
Design$n_cluster()
...ignored
A data frame with the level, number of clusters, and variables describing each level.
df <- nelder(~(cl(10)*t(5)) > ind(10)) df$int <- 0 df[df$cl > 5, 'int'] <- 1 mf1 <- MeanFunction$new( formula = ~ factor(t) + int - 1, data=df, parameters = c(rep(0,5),0.6), family = gaussian() ) cov1 <- Covariance$new( data = df, formula = ~ (1|gr(cl)) + (1|gr(cl*t)), parameters = c(0.25,0.1) ) des <- Design$new( covariance = cov1, mean.function = mf1, var_par = 1 ) des$n_cluster() ## returns two levels of 10 and 50
analysis()Run a design analysis of the model via simulation
Design$analysis( type, iter, par, alpha = 0.05, sim_design, parallel = TRUE, verbose = TRUE, options = list(), ... )
typeOne of either sim_data (recalls saved data from a previous
call to this function), sim (full simulation using MCML), or sim_approx (
uses GLS to approximate the MLE and standard errors)
iterInteger. The number of iterations of the simulation to run
parInteger. The parameter of interest for which design analysis statistics should be calculated. Refers to the column of X.
alphaNumeric. The type I error rate.
sim_designOptional. A different Design object that will be used to
simulate data to allow for model misspecification.
parallelLogical indicating whether to run the simulations in parallel
verboseLogical indicating whether to report detailed output. Defaults to TRUE.
optionsOptional. A named list to pass to the options argument of MCML
...Additional arguments passed to MCML, see below.
A glmmr.sim object containing the estimates from all the simulations, including
standard errors, deletion diagnostic statistics, and details about the simulation.
\dontrun{
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
mf1 <- MeanFunction$new(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = gaussian()
)
cov1 <- Covariance$new(
data = df,
formula = ~ (1|gr(cl)) + (1|gr(cl*t)),
parameters = c(0.25,0.1)
)
des <- Design$new(
covariance = cov1,
mean.function = mf1,
var_par = 1
)
# analysis using MCML mcem algorithm
test1 <- des$analysis(type="sim",
iter=100,
par=6,
parallel = FALSE,
verbose = FALSE,
method = "mcem",
m = 100)
#an analysis using the permutation test option and MCNR
test2 <- des$analysis(type="sim",
iter=100,
se.method="perm",
par=6,
parallel = FALSE,
verbose = FALSE,
options = list(
perm_type="unw", perm_iter=100,
perm_parallel=FALSE,perm_ci_steps=1000)
method = "mcnr",
m = 100)
#returning previously saved sim data
test3 <- des$analysis(type="sim_data")
#to test model misspecification we can simulate from a different model
cov2 <- Covariance$new(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)
)
des2 <- Design$new(
covariance = cov2,
mean.function = mf1,
var_par = 1
)
test4 <- des$analysis(type="sim",
iter=100,
sim_design = des2,
par=6,
parallel = FALSE,
verbose = FALSE,
method = "mcem",
m = 100)
}
power()Approximate power of the design using the GLS variance
Design$power(par, value, alpha = 0.05)
parInteger indicating which parameter of the design the power should be calculated for. Refers to the order of parameters and column of X
valueNumeric specifying the value of the parameter to calculate the power at
alphaNumeric between zero and one indicating the type I error rate. Default of 0.05.
A value between zero and one indicating the approximate power of the design.
df <- nelder(~(cl(10)*t(5)) > ind(10)) df$int <- 0 df[df$cl > 5, 'int'] <- 1 mf1 <- MeanFunction$new( formula = ~ factor(t) + int - 1, data=df, parameters = c(rep(0,5),0.6), family = gaussian() ) cov1 <- Covariance$new( data = df, formula = ~ (1|gr(cl)) + (1|gr(cl*t)), parameters = c(0.25,0.1) ) des <- Design$new( covariance = cov1, mean.function = mf1, var_par = 1 ) des$power(6,0.5) #power of 0.79
subset_rows()Subsets the design keeping specified observations only
Given a vector of row indices, the corresponding rows will be kept and the other rows will be removed from the mean function and covariance
Design$subset_rows(index)
indexInteger or vector integers listing the rows to keep
The function updates the object and nothing is returned
#generate a stepped wedge design and remove the first sequence des <- stepped_wedge(8,10,icc=0.05) ids_to_keep <- which(des$mean_function$data$J!=1) des$subset_rows(ids_to_keep)
subset_cols()Subsets the columns of the design
Removes the specified columns from the linked mean function object's X matrix.
Design$subset_cols(index)
indexInteger or vector of integers specifying the indexes of the columns to keep
The function updates the object and nothing is returned
#generate a stepped wedge design and remove first and last time periods des <- stepped_wedge(8,10,icc=0.05) des$subset_cols(c(2:8,10))
plot()Generates a plot of the design
Generates a 'bubble' plot of the design with respect to two or three variables in which the size of the points at each location are scaled by the number of observations at that location. For example, for a cluster randomised trial the user might specify time period on the x-axis and cluster ID on the y-axis. For a geospatial sampling design the x and y axes might represent spatial dimensions.
Design$plot(x, y, z = NULL, treat)
xString naming a column in the data frame in the linked covariance object (self$covariance$data) for the x-axis
yString naming a column in the data frame in the linked covariance object (self$covariance$data) for the y-axis
zOptional. String naming a column in the data frame in the linked covariance object (self$covariance$data) for a 'third axis' used for faceting
treatString naming a column in the data frame in the linked mean function object (self$mean_function$data) that identifies the treatment status of the observations at each location, used to set the colour of the points in the plot
A ggplot2 plot
\dontrun{
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
des$plot(x="cl",y="t",treat="int")
}
sim_data()Generates a realisation of the design
Generates a single vector of outcome data based upon the specified GLMM design
Design$sim_data(type = "y")
typeEither 'y' to return just the outcome data, or 'data' to return a data frame with the simulated outcome data alongside the model data
Either a vector or a data frame
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
ysim <- des$sim_data()
check()Checks for any changes in linked objects and updates.
Checks for any changes in any object and updates all linked objects if any are detected. Generally called automatically.
Design$check(verbose = TRUE)
verboseLogical indicating whether to report if any updates are made, defaults to TRUE
Linked objects are updated by nothing is returned
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
des$check() #does nothing
des$covariance$parameters <- c(0.1,0.9)
des$check() #updates
des$mean_function$parameters <- c(rnorm(5),0.1)
des$check() #updates
MCML()Markov Chain Monte Carlo Maximum Likelihood model fitting
Design$MCML( y, start, se.method = "lik", method = "mcnr", permutation.par, verbose = TRUE, tol = 0.01, m = 100, max.iter = 30, options = list() )
yA numeric vector of outcome data
startOptional. A numeric vector indicating starting values for the MCML algorithm iterations. If this is not specified then the parameter values stored in the linked mean function object will be used.
se.methodOne of either 'lik', 'approx', 'perm', or 'none', see Details.
methodThe MCML algorithm to use, either mcem or mcnr, see Details. Default is mcem.
permutation.parOptional. Integer specifing the index of the parameter if permutation tests are being used.
verboseLogical indicating whether to provide detailed output, defaults to TRUE.
tolNumeric value, tolerance of the MCML algorithm, the maximum difference in parameter estimates between iterations at which to stop the algorithm.
mInteger. The number of MCMC draws of the random effects on each iteration.
max.iterInteger. The maximum number of iterations of the MCML algorithm.
optionsAn optional list providing options to the algorithm, see Details.
A mcml object
\dontrun{
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
ysim <- des$sim_data()
# fits the models using MCEM but does not estimate standard errors
fit1 <- des$MCML(y = ysim,
se.method = "none")
#fits the models using MCNR but does not estimate standard errors
fit2 <- des$MCML(y = ysim,
se.method = "none",
method="mcnr")
#fits the models and uses permutation tests for parameter of interest
fit3 <- des$MCML(y = ysim,
se.method = "perm",
permutation.par = 6,
options = list(
perm_type="unw",
perm_iter=1000,
perm_parallel=FALSE,
perm_ci_steps=1000
))
#adds a simulated likelihood step after the MCEM algorithm
fit4 <- des$MCML(y = des$sim_data(),
se.method = "none",
options = list(
sim_lik_step=TRUE
))
}
dfbeta()Calculates DFBETA deletion diagnostic values
Calculates the DFBETA deletion diagnostic statistic for each observation for the specified parameter.
Design$dfbeta(y, par)
yNumeric vector of outcomes data
parInteger indicating which parameter, as a column of X, to report the deletion diagnostics for.
A vector of length self$n() with the value of DFBETA for each observation for the specified parameter
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
ysim <- des$sim_data()
des$dfbeta(ysim,6)
permutation_test()Conducts a permuation test
Estimates p-values and confidence intervals using a permutation test
Design$permutation_test( y, permutation.par, start, iter = 1000, nsteps = 1000, type = "cov", parallel = TRUE, verbose = TRUE )
yNumeric vector of outcome data
permutation.parInteger indicator which parameter to conduct a permutation test for. Refers to a column of the X matrix.
startValue of the parameter. Used both as a starting value for the algorithms and as a best estimate for the confidence interval search procedure.
iterInteger. Number of iterations of the permuation test to conduct
nstepsInteger. Number of steps of the confidence interval search procedure
typeEither cov for a test statistic weighted by the covariance matrix, or
unw for an unweighted test statistic. See Details.
parallelLogical indicating whether to run the tests in parallel
verboseLogical indicating whether to report detailed output
A list with the estimated p-value and the estimated lower and upper 95% confidence interval
\dontrun{
df <- nelder(~(cl(6)*t(5)) > ind(5))
df$int <- 0
df[df$cl > 3, 'int'] <- 1
#generate function that produces random allocations
treatf <- function(){
tr <- sample(rep(c(0,1),each=3),6,replace = FALSE)
rep(tr,each=25)
}
mf1 <- MeanFunction$new(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family =gaussian(),
treat_var = "int",
random_function = treatf)
cov1 <- Covariance$new(
data = df,
formula = ~ (1|gr(cl)),
parameters = c(0.25))
des <- Design$new(
covariance = cov1,
mean.function = mf1,
var_par = 1)
#run MCML to get parameter estimate:
fit1 <- des$MCML(y = ysim,
se.method = "none")
perm1 <- des$permutation_test(
y=ysim,
permutation.par=6,
start = fit1$coefficients$est[6],
type="unw",
iter = 1000,
nsteps = 1000)
}
MCMC()Fit the GLMM using MCMC
Fit the GLMM using MCMC. The function calls a Stan program to draw posterior samples.
Design$MCMC( y, priors, warmup_iter = 1000, sampling_iter = 1000, chains = 4, use_cmdstanr = FALSE, parallel = TRUE, ... )
yNumeric vector providing the outcome data
priorsA named list specifying the prior mean and standard deviation for the Gaussian prior distributions, see Details.
warmup_iterA positive integer specifying the number of warmup iterations for each MCMC chain
sampling_iterA positive integer specifying the number of sampling iterations for each MCMC chain
chainsA positive integer specifying the number of MCMC chains to run
use_cmdstanrLogical indicating whether to use cmdstanr, the default is FALSE, which will use rstan
parallelLogical indicating whether to run the chains in parallel
...Additional arguments to pass to sampling.
Either an S4 stanfit object returned by sampling, or a cmdstanr environment, depending on the sampler used.
\dontrun{
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
ysim <- des$sim_data()
# fits the model with default priors
fit1 <- des$MCMC(y=ysim)
#specify priors
fit2 <- des$MCMC(y=ysim,
priors = list(
prior_b_mean = rep(0,6),
prior_b_sd = c(rep(3,5),1),
prior_g_sd = rep(1,2)))
}
analysis_bayesian()Bayesian design analysis
Runs a Bayesian design analysis using simulated data
Design$analysis_bayesian( iter, par, priors, threshold = 0, sim_design, priors_sim, warmup_iter = 200, sampling_iter = 200, chains = 1, parallel = TRUE, verbose = TRUE, use_cmdstanr = FALSE, ... )
iterA positive integer specifying the number of simulation iterations to run
parA positive interger specifying the index of the parameter in the mean function to analyse, refers specifically to the column of X
priorsA named list specifying the priors, see Details in the MCMC() function in this class
thresholdA number specifying the threshold. The probability that the parameter of interest is greater than this threshold will be calculated.
sim_designOptional. A different Design object that should be used to simulate the data for the simulation.
priors_simOptional. A named list of the same structure as priors the specifies the priors from which to simulate data, if they are different
to the priors for model fitting.
warmup_iterA positive integer specifying the number of warmup iterations for each MCMC chain when fitting the model on each simulation iteration.
sampling_iterA positive integer specifying the number of sampling iterations for each MCMC chain when fitting the model on each simulation iteration.
chainsA positive integer specifying the number of MCMC chains when fitting the model on each simulation iteration.
parallelLogical indicating whether to run the MCMC chains in parallel
verboseLogical indicating whether to provide detailed output of the progress of simulations.
use_cmdstanrLogical indicating whether to use cmdstanr instead of rstan, the default is FALSE (i.e. use rstan)
...Additional options to pass to rstan::sampling or cmdstanr::sample
A glmmr.sim object containing the samples of posterior variance and relevant probabilities to summarise the analysis
\dontrun{
#' df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
test <- des$analysis_bayesian(iter=10,
par=6,
warmup_iter = 200,
sampling_iter = 500,
parallel = FALSE,
verbose=FALSE)
#to examine misspecification we can sample from a different model
des2 <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)),
parameters = c(0.25)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
test2 <- des$analysis_bayesian(iter=10,
par=6,
sim_design=des2,
warmup_iter = 200,
sampling_iter = 500,
parallel = FALSE,
verbose=FALSE)
}
clone()The objects of this class are cloneable with this method.
Design$clone(deep = FALSE)
deepWhether to make a deep clone.
Braun and Feng McCullagh Stan McCullagh and Nelder Approx GLMMs paper Watson confidence interval
Watson et al. Arxiv Braun and Feng Gail Garthwaite
SBC paper (Talts) APV paper
nelder, MeanFunction, Covariance
print.glmmr.sim, MCML
sampling, stan
## ------------------------------------------------
## Method `Design$new`
## ------------------------------------------------
#create a data frame describing a cross-sectional parallel cluster
#randomised trial
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
mf1 <- MeanFunction$new(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = gaussian()
)
cov1 <- Covariance$new(
data = df,
formula = ~ (1|gr(cl)) + (1|gr(cl*t)),
parameters = c(0.25,0.1)
)
des <- Design$new(
covariance = cov1,
mean.function = mf1,
var_par = 1
)
#alternatively we can pass the data directly to Design
#here we will specify a cohort study
df <- nelder(~ind(20) > t(6))
df$int <- 0
df[df$t > 3, 'int'] <- 1
des <- Design$new(
covariance = list(
data=df,
formula = ~ (1|ar1(t)*gr(ind)),
parameters = c(0.8,1)),
mean.function = list(
formula = ~int + factor(t),
data=df,
parameters = rep(0,7),
family = poisson()))
#an example of a spatial grid with two time points
df <- nelder(~ (x(10)*y(10))*t(2))
spt_design <- Design$new(covariance = list(data=df,
formula = ~(1|fexp(x,y)*ar1(t)),
parameters =c(0.2,0.1,0.8)),
mean.function = list(data=df,
formula = ~ 1,
parameters = c(0.5),
family=poisson()))
## ------------------------------------------------
## Method `Design$n_cluster`
## ------------------------------------------------
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
mf1 <- MeanFunction$new(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = gaussian()
)
cov1 <- Covariance$new(
data = df,
formula = ~ (1|gr(cl)) + (1|gr(cl*t)),
parameters = c(0.25,0.1)
)
des <- Design$new(
covariance = cov1,
mean.function = mf1,
var_par = 1
)
des$n_cluster() ## returns two levels of 10 and 50
## ------------------------------------------------
## Method `Design$analysis`
## ------------------------------------------------
## Not run:
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
mf1 <- MeanFunction$new(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = gaussian()
)
cov1 <- Covariance$new(
data = df,
formula = ~ (1|gr(cl)) + (1|gr(cl*t)),
parameters = c(0.25,0.1)
)
des <- Design$new(
covariance = cov1,
mean.function = mf1,
var_par = 1
)
# analysis using MCML mcem algorithm
test1 <- des$analysis(type="sim",
iter=100,
par=6,
parallel = FALSE,
verbose = FALSE,
method = "mcem",
m = 100)
#an analysis using the permutation test option and MCNR
test2 <- des$analysis(type="sim",
iter=100,
se.method="perm",
par=6,
parallel = FALSE,
verbose = FALSE,
options = list(
perm_type="unw", perm_iter=100,
perm_parallel=FALSE,perm_ci_steps=1000)
method = "mcnr",
m = 100)
#returning previously saved sim data
test3 <- des$analysis(type="sim_data")
#to test model misspecification we can simulate from a different model
cov2 <- Covariance$new(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)
)
des2 <- Design$new(
covariance = cov2,
mean.function = mf1,
var_par = 1
)
test4 <- des$analysis(type="sim",
iter=100,
sim_design = des2,
par=6,
parallel = FALSE,
verbose = FALSE,
method = "mcem",
m = 100)
## End(Not run)
## ------------------------------------------------
## Method `Design$power`
## ------------------------------------------------
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
mf1 <- MeanFunction$new(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = gaussian()
)
cov1 <- Covariance$new(
data = df,
formula = ~ (1|gr(cl)) + (1|gr(cl*t)),
parameters = c(0.25,0.1)
)
des <- Design$new(
covariance = cov1,
mean.function = mf1,
var_par = 1
)
des$power(6,0.5) #power of 0.79
## ------------------------------------------------
## Method `Design$subset_rows`
## ------------------------------------------------
#generate a stepped wedge design and remove the first sequence
des <- stepped_wedge(8,10,icc=0.05)
ids_to_keep <- which(des$mean_function$data$J!=1)
des$subset_rows(ids_to_keep)
## ------------------------------------------------
## Method `Design$subset_cols`
## ------------------------------------------------
#generate a stepped wedge design and remove first and last time periods
des <- stepped_wedge(8,10,icc=0.05)
des$subset_cols(c(2:8,10))
## ------------------------------------------------
## Method `Design$plot`
## ------------------------------------------------
## Not run:
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
des$plot(x="cl",y="t",treat="int")
## End(Not run)
## ------------------------------------------------
## Method `Design$sim_data`
## ------------------------------------------------
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
ysim <- des$sim_data()
## ------------------------------------------------
## Method `Design$check`
## ------------------------------------------------
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
des$check() #does nothing
des$covariance$parameters <- c(0.1,0.9)
des$check() #updates
des$mean_function$parameters <- c(rnorm(5),0.1)
des$check() #updates
## ------------------------------------------------
## Method `Design$MCML`
## ------------------------------------------------
## Not run:
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
ysim <- des$sim_data()
# fits the models using MCEM but does not estimate standard errors
fit1 <- des$MCML(y = ysim,
se.method = "none")
#fits the models using MCNR but does not estimate standard errors
fit2 <- des$MCML(y = ysim,
se.method = "none",
method="mcnr")
#fits the models and uses permutation tests for parameter of interest
fit3 <- des$MCML(y = ysim,
se.method = "perm",
permutation.par = 6,
options = list(
perm_type="unw",
perm_iter=1000,
perm_parallel=FALSE,
perm_ci_steps=1000
))
#adds a simulated likelihood step after the MCEM algorithm
fit4 <- des$MCML(y = des$sim_data(),
se.method = "none",
options = list(
sim_lik_step=TRUE
))
## End(Not run)
## ------------------------------------------------
## Method `Design$dfbeta`
## ------------------------------------------------
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
ysim <- des$sim_data()
des$dfbeta(ysim,6)
## ------------------------------------------------
## Method `Design$permutation_test`
## ------------------------------------------------
## Not run:
df <- nelder(~(cl(6)*t(5)) > ind(5))
df$int <- 0
df[df$cl > 3, 'int'] <- 1
#generate function that produces random allocations
treatf <- function(){
tr <- sample(rep(c(0,1),each=3),6,replace = FALSE)
rep(tr,each=25)
}
mf1 <- MeanFunction$new(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family =gaussian(),
treat_var = "int",
random_function = treatf)
cov1 <- Covariance$new(
data = df,
formula = ~ (1|gr(cl)),
parameters = c(0.25))
des <- Design$new(
covariance = cov1,
mean.function = mf1,
var_par = 1)
#run MCML to get parameter estimate:
fit1 <- des$MCML(y = ysim,
se.method = "none")
perm1 <- des$permutation_test(
y=ysim,
permutation.par=6,
start = fit1$coefficients$est[6],
type="unw",
iter = 1000,
nsteps = 1000)
## End(Not run)
## ------------------------------------------------
## Method `Design$MCMC`
## ------------------------------------------------
## Not run:
df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
ysim <- des$sim_data()
# fits the model with default priors
fit1 <- des$MCMC(y=ysim)
#specify priors
fit2 <- des$MCMC(y=ysim,
priors = list(
prior_b_mean = rep(0,6),
prior_b_sd = c(rep(3,5),1),
prior_g_sd = rep(1,2)))
## End(Not run)
## ------------------------------------------------
## Method `Design$analysis_bayesian`
## ------------------------------------------------
## Not run:
#' df <- nelder(~(cl(10)*t(5)) > ind(10))
df$int <- 0
df[df$cl > 5, 'int'] <- 1
des <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)*ar1(t)),
parameters = c(0.25,0.8)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
test <- des$analysis_bayesian(iter=10,
par=6,
warmup_iter = 200,
sampling_iter = 500,
parallel = FALSE,
verbose=FALSE)
#to examine misspecification we can sample from a different model
des2 <- Design$new(
covariance = list(
data = df,
formula = ~ (1|gr(cl)),
parameters = c(0.25)),
mean.function = list(
formula = ~ factor(t) + int - 1,
data=df,
parameters = c(rep(0,5),0.6),
family = binomial())
)
test2 <- des$analysis_bayesian(iter=10,
par=6,
sim_design=des2,
warmup_iter = 200,
sampling_iter = 500,
parallel = FALSE,
verbose=FALSE)
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.