simCases: simCases


View source: R/simCases.R

Description

simCases generates simulated data for mvQCA analysis.

Usage

simCases(varTypes = c("B", "B", "B", "M"), complexity = 10,
  ratio = 2, numSolutions = 6, numCases = 50, noiseCases = 0,
  distribute = FALSE, distProp = 0.5)

Arguments

varTypes

A vector (created with the combine function, c()) of variable types. B is a binary variable (0 or 1). M is a multi-value variable (currently 0, 1, or 2). C is a cluster variable that creates two continuous variables based on four underlying clusters. Only one C variable is currently allowed. Defaults to four variables: three binary and one multi-value.

complexity

controls how complex the solutions are by adjusting the frequency of 'not' solutions. Must be greater than 2. Default is 10.

ratio

controls how complex the solutions are by adjusting the ratio of contributing variables to ignored variables in solutions. As ratio approaches 1, more variables are ignored. As ratio increases, fewer variables are ignored. Must be greater than 1. Default is 2.

numSolutions

is the *maximum* number of paths/solutions leading to the outcome. simCases will remove redundancies and null solutions before outputting the final set of unique solutions, so it is quite likely that a given set of solutions will not reach the maximum. Must be at least one. Default is 6.

numCases

is the number of cases to include in the simulated data set. Must be greater than 1. Default is 50.

noiseCases

is the number of *additional* randomly generated cases used to examine how unknown variables affect the solution. Default is 0.

distribute

determines whether some proportion of the cases are ensured to be the target configurations. Default is FALSE.

distProp

is the proportion of cases that are ensured to be target configurations if distribute is TRUE. Default is 0.5.
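
The constraints listed above can be checked before a call is made. The following is a minimal sketch, not part of the package: check_simCases_args is a hypothetical helper, the bounds on noiseCases and distProp are assumptions (the documentation states none), and the guarded call assumes the package is installed under the name corners and exports simCases.

```r
# Hypothetical helper (not part of the package): checks the documented
# argument constraints before the values are handed to simCases().
check_simCases_args <- function(varTypes = c("B", "B", "B", "M"),
                                complexity = 10, ratio = 2,
                                numSolutions = 6, numCases = 50,
                                noiseCases = 0, distribute = FALSE,
                                distProp = 0.5) {
  stopifnot(
    all(varTypes %in% c("B", "M", "C")),  # only the documented types
    sum(varTypes == "C") <= 1,            # at most one cluster variable
    complexity > 2,
    ratio > 1,
    numSolutions >= 1,
    numCases > 1,
    noiseCases >= 0,                      # assumption: non-negative count
    is.logical(distribute),
    distProp >= 0, distProp <= 1          # assumption: proportion in [0, 1]
  )
  invisible(TRUE)
}

# Example settings: two binary, one multi-value, one cluster variable.
sim_args <- list(varTypes = c("B", "B", "M", "C"), ratio = 3,
                 numCases = 100, distribute = TRUE, distProp = 0.25)
do.call(check_simCases_args, sim_args)

# Run the simulation only if the package is available.
if (requireNamespace("corners", quietly = TRUE)) {
  sim <- do.call(corners::simCases, sim_args)
  names(sim)
}
```

Keeping the arguments in a list makes it possible to validate and run the same settings without repeating them.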

Value

A list with components

paths

A data frame with one column (PATHS) that lists the paths retained after removing redundancies and null (nonexistent in the simulated data) paths. Paths are described using curly-bracket notation: factor{levels}.

opaths

A data frame with one column (PATHS) that lists the originally generated paths, including those that could have existed but may be absent from the final path list because no cases exhibit them. Paths are described using curly-bracket notation: factor{levels}.

caseData

A data frame that includes all of the cases in the simulated data set. Levels are indicated by number. A C variable will generate two columns (e.g., D1 and D2) with continuous numbers between 0 and 1. The outcome variable is binary and is indicated by the OUT column.

clusters

A vector with the cluster membership for each case if a C variable is requested.

allSolutions

A data frame with the full set of possible paths generated by simCases. Levels are indicated by number; -1 refers to any level.

uniqueSolutions

A data frame with the simplest paths that cover all generated solutions with existing case data. Levels are indicated by number; -1 refers to any level.

redundantSolutions

A data frame with redundant (more complex) paths that are covered by a simpler unique path. Levels are indicated by number; -1 refers to any level.

comparator

A data frame with information currently used for checking the redundancy reduction algorithm. See code for elements.
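
The numeric encoding used by the solution tables, where -1 stands for any level, and the factor{levels} notation can be illustrated with a toy table. This is a sketch under assumptions rather than actual simCases output: the factor names A, B, and C, the "*" separator between terms, and the helpers row_to_path and case_matches are all hypothetical.

```r
# Toy solution table in the documented encoding: one row per path,
# one column per factor, -1 meaning "any level".
# The factor names A, B, C are illustrative assumptions.
solutions <- data.frame(A = c(1, -1), B = c(-1, 0), C = c(2, 2))

# Hypothetical helper: render one row as factor{level} terms.
# Joining terms with "*" is an assumption, not confirmed simCases output.
row_to_path <- function(row) {
  keep <- row != -1
  paste0(names(row)[keep], "{", row[keep], "}", collapse = "*")
}

# Hypothetical helper: TRUE if a case satisfies a path, i.e. every
# non-wildcard entry of the path equals the case's level.
case_matches <- function(path, case) {
  all(path == -1 | path == case)
}

# One path string per row of the toy table.
paths <- apply(solutions, 1, row_to_path)
paths

# Check a single case against every path.
case <- c(A = 1, B = 0, C = 2)
apply(solutions, 1, case_matches, case = case)
```

A case would take the outcome if it matches at least one path, which is why a wildcard entry is treated as matching every level.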


cognopod/corners documentation built on May 12, 2021, 9:25 a.m.