simulate_r_database: Simulate correlation databases of primary studies

View source: R/simulate_r.R

simulate_r_database    R Documentation

Description

The simulate_r_database function generates databases of psychometric correlation data from sample-size parameters, correlation parameters, reliability parameters, and selection-ratio parameters. The output database can be provided in either a long format or a wide format. If composite variables are to be formed, parameters can also be defined for the weights used to form the composites as well as the selection ratios applied to the composites. This function returns a database of statistics as well as a database of parameters. The parameter database contains the actual study parameters for each simulated sample (without sampling error), which allows meta-analytic results computed from the statistics to be compared against the actual means and variances of the parameters. The merge_simdat_r function can be used to merge multiple simulated databases, and the sparsify_simdat_r function can be used to randomly delete artifact information (a procedure commonly done in simulations of artifact-distribution methods).
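The distinction between the statistics database and the parameter database can be illustrated with a toy base-R sketch. This is not psychmeta's generation algorithm; the column names (sample_id, n, rxy), the sample sizes, and the helper observed_r are assumptions chosen only for illustration.

```r
# Toy illustration only: this is NOT how psychmeta generates its data.
set.seed(1)
k   <- 50                                        # number of simulated studies
n   <- sample(c(50, 100, 200), k, replace = TRUE) # per-study sample sizes
rho <- rnorm(k, mean = .3, sd = .05)             # true study correlations (parameters)

# Hypothetical helper: draw bivariate-normal data for one study and
# return the observed correlation, which carries sampling error.
observed_r <- function(n_i, rho_i) {
  x <- rnorm(n_i)
  y <- rho_i * x + sqrt(1 - rho_i^2) * rnorm(n_i)
  cor(x, y)
}

# Statistics are rounded, as published results would be; parameters are not.
statistics <- data.frame(sample_id = seq_len(k), n = n,
                         rxy = round(mapply(observed_r, n, rho), 2))
parameters <- data.frame(sample_id = seq_len(k), n = n, rxy = rho)

# Sample-size-weighted mean of observed correlations versus true parameters
c(mean_r   = weighted.mean(statistics$rxy, statistics$n),
  mean_rho = weighted.mean(parameters$rxy, parameters$n))
```

Because the parameter table holds the error-free values, any gap between the two means in a sketch like this reflects sampling error and rounding only.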

Usage

simulate_r_database(
  k,
  n_params,
  rho_params,
  mu_params = 0,
  sigma_params = 1,
  rel_params = 1,
  sr_params = 1,
  k_items_params = 1,
  wt_params = NULL,
  allow_neg_wt = FALSE,
  sr_composite_params = NULL,
  var_names = NULL,
  composite_names = NULL,
  n_as_ni = FALSE,
  show_applicant = FALSE,
  keep_vars = NULL,
  decimals = 2,
  format = "long",
  max_iter = 100,
  ...
)

Arguments

k

Number of studies to simulate.

n_params

Parameter distribution (or data-generation function; see details) for sample size.

rho_params

List of parameter distributions (or data-generation functions; see details) for correlations. If simulating data from a single fixed population matrix, that matrix can be supplied for this argument (if the diagonal contains non-unity values and 'sigma_params' is not specified, those values will be used as variances).

mu_params

List of parameter distributions (or data-generation functions; see details) for means.

sigma_params

List of parameter distributions (or data-generation functions; see details) for standard deviations.

rel_params

List of parameter distributions (or data-generation functions; see details) for reliabilities.

sr_params

List of parameter distributions (or data-generation functions; see details) for selection ratios.

k_items_params

List of parameter distributions (or data-generation functions; see details) for the number of test items comprising each of the variables to be simulated (all are single-item variables by default).

wt_params

List of parameter distributions (or data-generation functions; see details) to create weights for use in forming composites. If multiple composites are formed, the list should be a list of lists, with the general format: list(comp1_params = list(...params...), comp2_params = list(...params...), etc.).

allow_neg_wt

Logical scalar that determines whether negative weights should be allowed (TRUE) or not (FALSE).

sr_composite_params

Parameter distributions (or data-generation functions; see details) for composite selection ratios.

var_names

Optional vector of variable names for all non-composite variables.

composite_names

Optional vector of names for composite variables.

n_as_ni

Logical argument determining whether n specifies the incumbent sample size (TRUE) or the applicant sample size (FALSE; default). This can only be TRUE when only one variable is involved in selection.

show_applicant

Should applicant data be shown for sample statistics (TRUE) or suppressed (FALSE)?

keep_vars

Optional vector of variable names to be extracted from the simulation and returned in the output object. All variables are returned by default. Use this argument when only some variables are of interest and others are generated solely to serve as selection variables.

decimals

Number of decimals to which statistical results (not parameters) should be rounded. Rounding to 2 decimal places best captures the precision of data available from published primary research.

format

Database format: "long" or "wide".

max_iter

Maximum number of iterations to allow in the parameter selection process before terminating with convergence failure. Must be finite.

...

Additional arguments.
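As a sketch of the fixed-population-matrix option described under rho_params, the base-R code below builds a hypothetical three-variable correlation matrix, checks that it is usable as a population matrix, and derives a covariance version whose non-unity diagonal holds variances. All numeric values and the names rho_mat, sds, and cov_mat are illustrative assumptions.

```r
# Hypothetical three-variable population correlation matrix (illustrative values).
var_names <- c("X", "Y", "Z")
rho_mat <- matrix(c(1.0, 0.3, 0.5,
                    0.3, 1.0, 0.4,
                    0.5, 0.4, 1.0),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(var_names, var_names))

# Sanity checks: a population matrix should be symmetric and positive definite.
is_symmetric <- isSymmetric(rho_mat)
is_pos_def   <- all(eigen(rho_mat, symmetric = TRUE, only.values = TRUE)$values > 0)

# Covariance version: the diagonal now contains variances (sds^2) rather than 1s.
sds     <- c(2, 1, 3)                    # assumed standard deviations
cov_mat <- rho_mat * outer(sds, sds)

# The implied correlations are unchanged by the rescaling.
all.equal(cov2cor(cov_mat), rho_mat)

# Either matrix would then be supplied as, e.g., rho_params = rho_mat
# in a call to simulate_r_database().
```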

Details

Values supplied as any argument with the suffix "params" can take any of the following forms (see Examples for a demonstration of usage):

  • A vector of values from which study parameters should be sampled.

  • A vector containing a mean with a variance or standard deviation. The values must be named "mean" and either "var" or "sd" so that the program can recognize which value is which.

  • A matrix containing a row of values (this row must be named "values") from which study parameters should be sampled and a row of weights (this row must be named "weights") associated with the values to be sampled.

  • A matrix containing a column of values (this column must be named "values") from which study parameters should be sampled and a column of weights (this column must be named "weights") associated with the values to be sampled.

  • A function that is configured to generate data using only one argument that defines the number of cases to generate, e.g., fun(n = 10).
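The forms listed above can be written out in base R as follows. The matrix dimension names (value and weight) follow the code in the Examples section, and draw_params is a hypothetical helper, not part of psychmeta, that shows one way such specifications could be turned into k study-level parameters.

```r
# 1. Vector of candidate values
values_vec <- c(.1, .3, .5)
# 2. Named mean with a standard deviation (or a variance named "var")
mean_sd <- c(mean = .3, sd = .05)
# 3. Matrix with a row of values and a row of weights
by_row <- rbind(value = c(.1, .3, .5), weight = c(1, 2, 1))
# 4. Matrix with a column of values and a column of weights
by_col <- cbind(value = c(.1, .3, .5), weight = c(1, 2, 1))
# 5. Function whose single argument is the number of cases to generate
gen_fun <- function(n) runif(n, min = .1, max = .5)

# Hypothetical helper (an assumption for illustration, not psychmeta's code).
draw_params <- function(spec, k) {
  if (is.function(spec)) return(spec(k))
  if (is.matrix(spec)) {
    # Convert the column layout to the row layout so both are handled alike.
    if ("value" %in% colnames(spec)) spec <- t(spec)
    vals <- spec["value", ]
    idx  <- sample.int(length(vals), k, replace = TRUE, prob = spec["weight", ])
    return(vals[idx])
  }
  if ("mean" %in% names(spec)) {
    sd_i <- if ("sd" %in% names(spec)) spec[["sd"]] else sqrt(spec[["var"]])
    return(rnorm(k, mean = spec[["mean"]], sd = sd_i))
  }
  # Plain vector: sample the supplied values with equal probability.
  spec[sample.int(length(spec), k, replace = TRUE)]
}

set.seed(42)
draws <- lapply(list(values_vec, mean_sd, by_row, by_col, gen_fun),
                draw_params, k = 10)
lengths(draws)
```

Indexing with sample.int avoids the behaviour of sample(), which treats a single number of at least 1 as the upper bound of a sequence instead of as the only candidate value.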

Value

A database of simulated primary studies' statistics and analytically determined parameter values.

Examples

## Not run: 
## Note the varying methods for defining parameters:
n_params <- function(n) rgamma(n, shape = 100)
rho_params <- list(c(.1, .3, .5),
                   c(mean = .3, sd = .05),
                   rbind(value = c(.1, .3, .5), weight = c(1, 2, 1)))
rel_params <- list(c(.7, .8, .9),
                   c(mean = .8, sd = .05),
                   rbind(value = c(.7, .8, .9), weight = c(1, 2, 1)))
sr_params <- list(1, 1, c(.5, .7))
sr_composite_params <- list(1, c(.5, .6, .7))
wt_params <- list(list(c(1, 2, 3),
                       c(mean = 2, sd = .25),
                       rbind(value = c(1, 2, 3), weight = c(1, 2, 1))),
                  list(c(1, 2, 3),
                       c(mean = 2, sd = .25),
                       cbind(value = c(1, 2, 3), weight = c(1, 2, 1))))

## Simulate with long format
simulate_r_database(k = 10, n_params = n_params, rho_params = rho_params,
                    rel_params = rel_params, sr_params = sr_params,
                    sr_composite_params = sr_composite_params, wt_params = wt_params,
                    var_names = c("X", "Y", "Z"), format = "long")

## Simulate with wide format
simulate_r_database(k = 10, n_params = n_params, rho_params = rho_params,
                    rel_params = rel_params, sr_params = sr_params,
                    sr_composite_params = sr_composite_params, wt_params = wt_params,
                    var_names = c("X", "Y", "Z"), format = "wide")

## End(Not run)

psychmeta/psychmeta documentation built on Feb. 12, 2024, 1:21 a.m.