sim_reg_nested3: Function to simulate three level nested data


View source: R/sim_reg_func.r

Description

Takes simulation parameters as inputs and returns simulated data.

Usage

sim_reg_nested3(
  fixed,
  random,
  random3,
  fixed_param,
  random_param = list(),
  random_param3 = list(),
  cov_param,
  k,
  n,
  p,
  error_var,
  with_err_gen,
  arima = FALSE,
  data_str,
  cor_vars = NULL,
  fact_vars = list(NULL),
  unbal = list(level2 = FALSE, level3 = FALSE),
  unbal_design = list(level2 = NULL, level3 = NULL),
  lvl1_err_params = NULL,
  arima_mod = list(NULL),
  contrasts = NULL,
  homogeneity = TRUE,
  heterogeneity_var = NULL,
  cross_class_params = NULL,
  knot_args = list(NULL),
  ...
)

Arguments

fixed

One sided formula for fixed effects in the simulation. To suppress intercept add -1 to formula.

random

One sided formula for random effects in the simulation. Must be a subset of fixed.

random3

One sided formula for random effects at third level in the simulation. Must be a subset of fixed (and likely of random).

fixed_param

Fixed effect parameter values (i.e. beta weights). Must be same length as fixed.
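As a rough check on the length requirement, the number of values in fixed_param can be compared against the terms in the fixed formula. The following is a minimal sketch using only base R; the formulas come from the Examples section, and the names n_fixed and random_ok are illustrative, not part of simglm. It assumes every term is numeric (a factor term can expand to several coefficients, which this count ignores).

```r
# Formulas and parameter values as used in the Examples section.
fixed  <- ~ 1 + time + diff + act + actClust + time:act
random <- ~ 1 + time + diff
fixed_param <- c(4, 2, 6, 2.3, 7, 0)

fixed_terms  <- terms(fixed)
random_terms <- terms(random)

# One coefficient per term label, plus one if the formula keeps its intercept.
n_fixed <- length(attr(fixed_terms, "term.labels")) +
  attr(fixed_terms, "intercept")

# fixed_param should supply exactly one value per coefficient.
length(fixed_param) == n_fixed

# Every random effect term should also appear among the fixed effect terms.
random_ok <- all(attr(random_terms, "term.labels") %in%
                   attr(fixed_terms, "term.labels"))
random_ok
```

The same comparison can be repeated with random3 in place of random to check the third level formula.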

random_param

A list of named elements that must contain:

  • random_var: variance of random parameters,

  • rand_gen: Name of simulation function for random effects.

Optional elements are:

  • ther: Theoretical mean and variance from rand_gen,

  • ther_sim: Simulate mean/variance for standardization purposes,

  • cor_vars: Correlation between random effects,

  • ...: Additional parameters needed for rand_gen function.

random_param3

A list of named elements that must contain:

  • random_var: variance of random parameters,

  • rand_gen: Name of simulation function for random effects.

Optional elements are:

  • ther: Theoretical mean and variance from rand_gen,

  • ther_sim: Simulate mean/variance for standardization purposes,

  • cor_vars: Correlation between random effects,

  • ...: Additional parameters needed for rand_gen function.

cov_param

List of arguments to pass to the function that generates continuous covariates. Entries must be in the same order as the variables specified in fixed. This list does not include the intercept, time, factors, or interactions. Required arguments include:

  • dist_fun: This is a quoted R distribution function.

  • var_type: This is the level of variable to generate. Must be 'level1', 'level2', or 'level3'. Must be same order as fixed formula above.

Optional arguments to the distribution functions are in a nested list, see the examples or vignettes for example code.

k

Number of third level clusters.

n

Level two cluster sample size within each level three cluster.

p

Level one sample size within each level two cluster.

error_var

Scalar of error variance.

with_err_gen

Distribution used to simulate the within cluster error. Must be the quoted name of an R random generation function (for example 'rnorm').

arima

TRUE/FALSE flag indicating whether residuals should be correlated. If TRUE, must specify a valid model to pass to arima.sim via the arima_mod argument. See arima.sim for examples.

data_str

Type of data. Must be "cross" or "long".

cor_vars

A vector of correlations between variables.

fact_vars

A nested list of factor, categorical, or ordinal variable specifications. Each list must include:

  • numlevels: Number of levels for ordinal or factor variables.

  • var_type: Must be 'level1', 'level2', or 'level3'.

Optional arguments include:

  • replace

  • prob

  • value.labels

See also sample for use of these optional arguments.
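Because the optional arguments are those of base R's sample, their effect can be previewed outside the package. The sketch below is an illustration under stated assumptions: it uses only base R, the variable names and the labels are hypothetical, and it does not claim to reproduce how simglm applies value.labels internally.

```r
set.seed(7)

# Hypothetical three-level factor with unequal category probabilities.
numlevels <- 3
draws <- sample(seq_len(numlevels), size = 12,
                replace = TRUE, prob = c(0.5, 0.3, 0.2))

# Attach labels to the integer codes (labels here are illustrative only).
grp <- factor(draws, levels = seq_len(numlevels),
              labels = c("low", "mid", "high"))

table(grp)
```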

unbal

A named TRUE/FALSE list specifying whether an unbalanced simulation design is desired. The named elements must be "level2" and "level3", representing unbalanced simulation at level two and level three respectively. The default is FALSE for both, indicating balanced sample sizes at both levels.

unbal_design

When an element of unbal is TRUE, this specifies the design for the unbalanced simulation in one of two ways. First, a named list can give the minimum and maximum sample size within a cluster; sizes are then drawn from a uniform distribution with the specified min and max. Second, the actual sample size within each cluster can be specified as a vector, which must have the same length as the level two or level three sample size. Either form is supplied in a named list in which the level two sample size is controlled via "level2" and the level three sample size via "level3".
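The two ways of specifying an unbalanced design can be sketched in base R as follows. This is an illustration, not the package's internal code: the rounding step and the names n_level2, design_range, sizes_drawn, and sizes_fixed are assumptions made for the example.

```r
set.seed(42)
n_level2 <- 15  # hypothetical number of level two clusters

# Way 1: give min/max bounds; sizes come from a uniform distribution.
# Rounding to whole numbers is an assumption made for this sketch.
design_range <- list(min = 3, max = 10)
sizes_drawn <- round(runif(n_level2,
                           min = design_range$min,
                           max = design_range$max))

# Way 2: give the sample size of every cluster explicitly.
sizes_fixed <- rep(c(4, 6, 8), length.out = n_level2)

# Either form goes into the named list described above.
unbal <- list(level2 = TRUE, level3 = FALSE)
unbal_design <- list(level2 = design_range, level3 = NULL)
```

With the explicit form, the vector length has to match the number of clusters at that level, which is why sizes_fixed is built with length.out = n_level2.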

lvl1_err_params

Additional parameters passed as a list to the level one error generating function.
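One way to picture how a quoted function name (with_err_gen) and a list of extra parameters (lvl1_err_params) fit together is base R's do.call. The sketch below is an assumption about the general pattern, not a copy of simglm's implementation; the choice of 'rt' with df = 5 and the name n_obs are illustrative, and any rescaling to error_var is left out.

```r
set.seed(1)

with_err_gen <- 'rt'               # quoted R random generation function
lvl1_err_params <- list(df = 5)    # extra arguments that function needs
n_obs <- 20                        # hypothetical number of level one rows

# Call the named function with n plus the additional parameters.
errors <- do.call(with_err_gen, c(list(n = n_obs), lvl1_err_params))

length(errors)
```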

arima_mod

A list indicating the ARIMA model to pass to arima.sim. See arima.sim for examples.
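For orientation, the kind of list that arima.sim accepts as its model can be tried directly in base R. This is a minimal sketch with a hypothetical AR(1) coefficient of 0.5; how simglm forwards arima_mod internally is not shown here and should not be inferred from it.

```r
set.seed(123)

# Hypothetical AR(1) specification of the form arima.sim expects.
arima_mod <- list(ar = 0.5)

# Draw one serially correlated error series of length p.
p <- 10
ar1_errors <- arima.sim(model = arima_mod, n = p)

length(ar1_errors)
```

The result is a time series object; with arima = TRUE, a list like arima_mod above is what the arima_mod argument is documented to carry.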

contrasts

An optional list that specifies the contrasts to be used for factor variables (i.e. those variables with .f or .c). See contrasts for more detail.

homogeneity

Either TRUE (default), indicating that homogeneity of variance is assumed, or FALSE, indicating that heterogeneity of variance should be generated.

heterogeneity_var

Variable name as a character string to use for heterogeneity of variance simulation.

cross_class_params

A list of named parameters when cross classified data structures are desired. Must include the following arguments:

  • num_ids: The number of cross classified clusters. These are in addition to the typical cluster ids.

  • random_param: This argument is a list of arguments passed to sim_rand_eff. These must include:

    • random_var: The variance of the cross classified random effect.

    • rand_gen: The random generating function used to generate the cross classified random effect.

    Optional elements are:

    • ther: Theoretical mean and variance from rand_gen,

    • ther_sim: Simulate mean/variance for standardization purposes,

    • cor_vars: Correlation between random effects,

    • ...: Additional parameters needed for rand_gen function.

knot_args

A nested list of named knot arguments. See sim_knot for more details. Arguments must include:

  • var

  • knot_locations

...

Not currently used.

Details

Simulates data for the linear mixed model, for both cross sectional and longitudinal designs. Returns a data frame with ID variables, fixed effects, and other variables that are useful when running simulation studies.

See Also

sim_reg for a convenient wrapper for all data conditions.

Examples

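library(simglm)

# Three level example
# Fixed effects: intercept, time, three covariates, and a time:act interaction
fixed <- ~ 1 + time + diff + act + actClust + time:act
random <- ~ 1 + time + diff
random3 <- ~ 1 + time
# One value per fixed effect term, in formula order
fixed_param <- c(4, 2, 6, 2.3, 7, 0)
random_param <- list(random_var = c(7, 4, 2), rand_gen = 'rnorm')
random_param3 <- list(random_var = c(4, 2), rand_gen = 'rnorm')
# Covariates diff, act, and actClust are generated at levels 1, 2, and 3
cov_param <- list(dist_fun = c('rnorm', 'rnorm', 'rnorm'),
                  var_type = c("level1", "level2", "level3"),
                  opts = list(list(mean = 0, sd = 1.5),
                              list(mean = 0, sd = 4),
                              list(mean = 0, sd = 2)))
k <- 10  # level three clusters
n <- 15  # level two clusters within each level three cluster
p <- 10  # observations within each level two cluster
error_var <- 4
with_err_gen <- 'rnorm'
data_str <- "long"
# sim_reg is the wrapper listed under See Also
temp_three <- sim_reg(fixed, random, random3, fixed_param, random_param,
                      random_param3, cov_param, k, n, p, error_var, with_err_gen,
                      data_str = data_str)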

simglm documentation built on July 8, 2020, 5:46 p.m.