simulate_d_database: Simulate d value databases of primary studies

View source: R/simulate_d.R

Description

The simulate_d_database function generates databases of psychometric d value data from sample-size parameters, correlation parameters, mean parameters, standard deviation parameters, reliability parameters, and selection-ratio parameters. The output database can be provided in a long format. If composite variables are to be formed, parameters can also be defined for the weights used to form the composites as well as the selection ratios applied to the composites. This function returns a database of statistics as well as a database of parameters. The parameter database contains the actual study parameters for each simulated sample (without sampling error), which allows comparisons between meta-analytic results computed from the statistics and the actual means and variances of the parameters. The merge_simdat_d function can be used to merge multiple simulated databases, and the sparsify_simdat_d function can be used to randomly delete artifact information (a procedure commonly done in simulations of artifact-distribution methods).

Usage

simulate_d_database(
  k,
  n_params,
  rho_params,
  mu_params = NULL,
  sigma_params = 1,
  rel_params = 1,
  sr_params = 1,
  k_items_params = 1,
  wt_params = NULL,
  allow_neg_wt = FALSE,
  sr_composite_params = NULL,
  group_names = NULL,
  var_names = NULL,
  composite_names = NULL,
  diffs_as_obs = FALSE,
  show_applicant = FALSE,
  keep_vars = NULL,
  decimals = 2,
  max_iter = 100,
  ...
)

Arguments

k

Number of studies to simulate.

n_params

List of parameter distributions (or data-generation functions; see details) for subgroup sample sizes.

rho_params

List containing a list of parameter distributions (or data-generation functions; see details) for correlations for each simulated group. If simulating data from a single fixed population matrix in each group, supply a list of those matrices for this argument (if the diagonals contain non-unity values and the sigma_params argument is not specified, those values will be used as variances).

mu_params

List containing a list of parameter distributions (or data-generation functions; see details) for means for each simulated group. If NULL, all means will be set to zero.

sigma_params

List containing a list of parameter distributions (or data-generation functions; see details) for standard deviations for each simulated group. If NULL, all standard deviations will be set to unity.

rel_params

List containing a list of parameter distributions (or data-generation functions; see details) for reliabilities for each simulated group. If NULL, all reliabilities will be set to unity.

sr_params

List of parameter distributions (or data-generation functions; see details) for selection ratios. If NULL, all selection ratios will be set to unity.

k_items_params

List of parameter distributions (or data-generation functions; see details) for the number of test items comprising each of the variables to be simulated (all are single-item variables by default).

wt_params

List of parameter distributions (or data-generation functions; see details) to create weights for use in forming composites. If multiple composites are formed, the list should be a list of lists, with the general format: list(comp1_params = list(...params...), comp2_params = list(...params...), etc.).

allow_neg_wt

Logical scalar that determines whether negative weights should be allowed (TRUE) or not (FALSE).

sr_composite_params

Parameter distributions (or data-generation functions; see details) for composite selection ratios.

group_names

Optional vector of group names.

var_names

Optional vector of variable names for all non-composite variables.

composite_names

Optional vector of names for composite variables.

diffs_as_obs

Logical scalar that determines whether standard deviation parameters represent standard deviations of observed scores (TRUE) or of true scores (FALSE; default).

show_applicant

Should applicant data be shown for sample statistics (TRUE) or suppressed (FALSE)?

keep_vars

Optional vector of variable names to be extracted from the simulation and returned in the output object. All variables are returned by default. Use this argument when only some variables are of interest and others are generated solely to serve as selection variables.

decimals

Number of decimals to which statistical results (not parameters) should be rounded. Rounding to 2 decimal places best captures the precision of data available from published primary research.

max_iter

Maximum number of iterations to allow in the parameter selection process before terminating with convergence failure. Must be finite.

...

Additional arguments.
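
The rho_params entry above mentions supplying one fixed population matrix per group, with non-unity diagonal values used as variances. The base-R sketch below illustrates that shape. The matrix values and object names are hypothetical, and the conversion shown is ordinary covariance algebra, not a description of the function's internals.

```r
## Hypothetical population covariance matrices for two groups (two variables each)
sigma_group1 <- matrix(c(4.0, 1.2,
                         1.2, 1.0), nrow = 2, byrow = TRUE)
sigma_group2 <- matrix(c(1.0, 0.4,
                         0.4, 1.0), nrow = 2, byrow = TRUE)

## A list with one matrix per group, the shape described for rho_params
rho_params_fixed <- list(sigma_group1, sigma_group2)

## Diagonal elements read as variances imply these standard deviations
lapply(rho_params_fixed, function(m) sqrt(diag(m)))

## The same matrices rescaled to correlation matrices
lapply(rho_params_fixed, cov2cor)
```

In the first group, variances of 4 and 1 correspond to standard deviations of 2 and 1, so the covariance of 1.2 corresponds to a correlation of .6.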

Details

Values supplied as any argument with the suffix "params" can take any of the following forms (see Examples for a demonstration of usage):

  • A vector of values from which study parameters should be sampled.

  • A vector containing a mean with a variance or standard deviation. These values must be named "mean", "var", and "sd", respectively, for the program to recognize which value is which.

  • A matrix containing a row of values (this row must be named "values") from which study parameters should be sampled and a row of weights (this row must be named "weights") associated with the values to be sampled.

  • A matrix containing a column of values (this column must be named "values") from which study parameters should be sampled and a column of weights (this column must be named "weights") associated with the values to be sampled.

  • A function that is configured to generate data using only one argument that defines the number of cases to generate, e.g., fun(n = 10).
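
The forms listed above can be sketched in base R as follows. The object names and numeric values are hypothetical and chosen only for illustration; only the naming conventions ("mean", "var", "sd", "values", "weights") come from the descriptions above.

```r
## 1. A vector of values from which study parameters are sampled
rel_values <- c(.7, .8, .9)

## 2. A named vector with a mean and either a standard deviation or a variance
n_dist_sd  <- c(mean = 200, sd = 20)
n_dist_var <- c(mean = 200, var = 400)

## 3. A matrix with a row named "values" and a row named "weights"
rho_by_row <- rbind(values  = c(.3, .4, .5),
                    weights = c(1, 2, 1))

## 4. A matrix with a column named "values" and a column named "weights"
rho_by_col <- cbind(values  = c(.3, .4, .5),
                    weights = c(1, 2, 1))

## 5. A function whose only required argument is the number of cases to generate
mu_fun <- function(n) rnorm(n, mean = 0, sd = .5)
mu_fun(n = 10)
```

Wrapping a generator in a one-argument function, as in the last form, is a convenient way to fix any other settings (here the mean and standard deviation passed to rnorm) before handing it to the simulation.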

Value

A database of simulated primary studies' statistics and analytically determined parameter values.

Examples

if (requireNamespace("nor1mix", quietly = TRUE)) {
  ## Define sample sizes, means, and other parameters for each of two groups:
  n_params <- list(c(mean = 200, sd = 20),
                   c(mean = 100, sd = 20))
  rho_params <- list(list(c(.3, .4, .5)),
                     list(c(.3, .4, .5)))
  mu_params <- list(list(c(mean = .5, sd = .5), c(-.5, 0, .5)),
                    list(c(mean = 0, sd = .5), c(-.2, 0, .2)))
  sigma_params <- list(list(1, 1),
                       list(1, 1))
  rel_params <- list(list(.8, .8),
                     list(.8, .8))
  sr_params <- list(1, .5)

  simulate_d_database(k = 5, n_params = n_params, rho_params = rho_params,
                      mu_params = mu_params, sigma_params = sigma_params,
                      rel_params = rel_params, sr_params = sr_params,
                      k_items_params = c(4, 4),
                      group_names = NULL, var_names = c("y1", "y2"),
                      show_applicant = TRUE, keep_vars = c("y1", "y2"), decimals = 2)
}
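
The example above does not form composites. As a rough sketch of the list-of-lists shape described for wt_params, the objects below define weights for two composites of y1 and y2. This assumes that each inner list holds one weight distribution per component variable and that sr_composite_params takes one entry per composite; the values and composite names are hypothetical, and the objects are only constructed here, not passed to simulate_d_database.

```r
## Hypothetical weight distributions for two composites of y1 and y2.
## Assumption: each inner list holds one distribution per component variable.
wt_params <- list(
  comp1_params = list(c(1, 2, 3),             # y1: weight sampled from these values
                      c(mean = 2, sd = .5)),  # y2: weight described by a mean and SD
  comp2_params = list(1,                      # y1: fixed unit weight
                      1)                      # y2: fixed unit weight
)

## Hypothetical selection ratios, one per composite (assumed shape)
sr_composite_params <- list(1, .5)

## Hypothetical names for the two composites
composite_names <- c("comp1", "comp2")

## Inspect the nested structure
str(wt_params)
```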

jadahlke/psychmeta documentation built on Feb. 11, 2024, 9:15 p.m.