randomDGS: Build Data-Generating Structure Randomly

Description

This evaluation function randomly builds data-generating structures. It was initially programmed for Baumgartner and Thiem (2015) to test the correctness of QCA's three search strategies (conservative/complex, intermediate, parsimonious).

Usage

randomDGS(n.DGS = 1, exo.facs = c(""), seed.1 = NULL, seed.2 = NULL, prob = 0.5,
          diversity = 1, delete.trivial = FALSE)

Arguments

n.DGS

The number of random data-generating structures to be built.

exo.facs

A character vector with the names of the exogenous factors.

seed.1

The seed for the random generation of output function values.

seed.2

The seed for the random selection of a data-generating structure in cases of ambiguities.

prob

The probability of a positive output function value.

diversity

The diversity index value of a truth table.

delete.trivial

Logical, delete trivial structures.

Details

This function was initially programmed for Baumgartner and Thiem (2015) to test the correctness of QCA's three solution types (conservative/complex, intermediate, parsimonious). It randomly builds data-generating structures (DGSs) by means of a two-stage procedure. The first stage determines the solution based on a randomly generated truth table; the second stage randomly selects exactly one model from among all models that fit the data summarized by this truth table equally well. If the data are not affected by model ambiguity, the single model that makes up the solution is chosen.

The argument n.DGS is an integer scalar specifying the number of data-generating structures to be built.

The argument exo.facs is a character vector with the names (and thus also the number) of the exogenous factors that can potentially make up the data-generating structure(s). Note that not all of these factors will necessarily make up the final structure; one or more factors may not appear at all. It is only guaranteed that the data-generating structure cannot be more complex than the most complex structure possible based on the number of exogenous factors specified in exo.facs.
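
To make the role of exo.facs concrete, the following base R sketch enumerates the minterms (all combinations of presence and absence) that a given set of factor names spans. It is only an illustration under the assumption of binary factors; it does not call randomDGS, and the object names levels.list and minterms are hypothetical.

```r
# Hypothetical factor names; any character vector would do.
exo.facs <- LETTERS[1:3]

# One column per factor, one row per minterm (0 = absent, 1 = present).
levels.list <- setNames(rep(list(0:1), length(exo.facs)), exo.facs)
minterms <- expand.grid(levels.list)

nrow(minterms)  # 2^3 = 8 minterms
```

Each additional factor doubles the number of minterms, which is why the number of names in exo.facs bounds the complexity of any structure that can be built.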

The argument seed.1 sets the seed for the random generation of output function values in the first stage of the two-stage building procedure.

The argument seed.2 sets the seed for the random selection of a data-generating structure in cases of ambiguities.
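
The division of labour between the two seeds can be sketched in base R. This is not the package's internal code: it assumes that each seed acts like a call to set.seed before the respective random step, and the candidate model labels are hypothetical placeholders.

```r
# Stage 1: draw one output value per minterm (the role played by seed.1).
set.seed(1375)
out <- rbinom(n = 8, size = 1, prob = 0.5)

# Stage 2: pick exactly one of several equally well-fitting models
# (the role played by seed.2). The candidates are hypothetical placeholders.
candidates <- c("model.1", "model.2", "model.3")
set.seed(3917)
chosen <- sample(candidates, size = 1)

# Re-running both steps with the same seeds reproduces both results.
set.seed(1375)
out.again <- rbinom(n = 8, size = 1, prob = 0.5)
set.seed(3917)
chosen.again <- sample(candidates, size = 1)
identical(out, out.again) && identical(chosen, chosen.again)  # TRUE
```

Keeping the two seeds separate means the truth table can be held fixed while a different model is drawn from the same set of candidates, or vice versa.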

The argument prob can be used to specify the probability of observing a positive output function value for each minterm in the truth table. It must be a value greater than 0 but smaller than 1. Generally, the higher this probability, the less complex the data-generating structure is.

The argument diversity can be used to insert a third stage in building a data-generating structure between the first and the second stage described above. It creates a truth table with random limited empirical diversity before the solution is derived. It must be a value larger than 0 but not larger than 1. Generally, the lower the diversity index value, the less complex the data-generating structure is and the larger the set of candidate structures in the second stage of the building process.
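
The effect of limited empirical diversity can be illustrated with a small base R sketch. It assumes that the diversity index is the share of all minterms that remain observed, which this page does not state explicitly; the rounding rule, the seed, and the object names are likewise assumptions made for the illustration.

```r
# Full configuration space for three binary factors.
minterms <- expand.grid(A = 0:1, B = 0:1, C = 0:1)
diversity <- 0.75  # assumed to mean: keep 75% of all minterms

set.seed(1)  # arbitrary seed for the illustration
n.keep <- max(1, round(diversity * nrow(minterms)))
keep <- sort(sample(nrow(minterms), size = n.keep))
tt.limited <- minterms[keep, ]

nrow(tt.limited)  # 6 of 8 minterms remain
```

With fewer observed minterms, more models tend to fit the remaining data equally well, which is consistent with the larger set of candidate structures described above.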

The logical argument delete.trivial can be used to eliminate trivial structures, that is, TRUE and FALSE, which occur when all output function values are positive or negative, respectively. The probability of such trivial structures occurring increases with increasing values of the argument prob, decreasing values of the argument diversity, and decreasing numbers of exogenous factors. When such a structure is eliminated, the user will be informed by a warning, but the random structure building process will continue unless all structures have been eliminated. When set to FALSE, the process will stop as soon as a single trivial structure would be returned. In this case, users should re-specify the value of the argument seed.1, adjust the values of the arguments prob and/or diversity, or increase the number of exogenous factors.
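
How quickly trivial structures become likely can be estimated with a short calculation. The sketch below assumes full diversity and independent draws of the output value for each of the 2^k minterms, so that a trivial structure arises with probability prob^(2^k) + (1 - prob)^(2^k). The helper p.trivial is hypothetical and not part of the package.

```r
# Probability that all 2^k output values are equal, assuming independent
# draws and full diversity. p.trivial is a hypothetical helper.
p.trivial <- function(prob, k) {
  n <- 2^k
  prob^n + (1 - prob)^n
}

p.trivial(prob = 0.95, k = 3)  # roughly 0.66
p.trivial(prob = 0.95, k = 5)  # roughly 0.19
p.trivial(prob = 0.50, k = 3)  # 2 * 0.5^8 = 0.0078125
```

Under these assumptions, three factors with prob = 0.95 yield a trivial structure in about two out of three draws, which is why adding factors or lowering prob is an effective remedy.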

Value

A list with the following two components:

DGS

A vector of the data-generating structure(s).

tt

The corresponding truth table(s).

Contributors

Thiem, Alrik: development, documentation, programming, testing

Author(s)

Alrik Thiem

References

Baumgartner, Michael, and Alrik Thiem. 2015. Often Trusted but Never (Properly) Tested: Evaluating Qualitative Comparative Analysis. Paper presented at the 12th Conference of the European Sociological Association, 25-28 August, Czech Technical University, Prague (Czech Republic).

See Also

submodels

Examples

# randomly build a data-generating structure on the basis of five exogenous
# factors, with 0.75 probability of a positive output function value
str.1 <- randomDGS(exo.facs = LETTERS[1:5], seed.1 = 1375, seed.2 = 3917,
                   prob = 0.75)
str.1$DGS

# setting the probability too high and/or the number of exogenous factors too 
# low will increase the likelihood of trivial structures occurring (TRUE and FALSE)

## Not run: 
randomDGS(n.DGS = 5, exo.facs = LETTERS[1:3], seed.1 = 1375, seed.2 = 3917,
          prob = 0.95, delete.trivial = TRUE)

## End(Not run)

# randomly build three data-generating structures on the basis of four
# exogenous factors
str.2 <- randomDGS(n.DGS = 3, exo.facs = LETTERS[1:4], seed.1 = 1375, seed.2 = 3917)
str.2$DGS

# all correctness-preserving submodels of DGS 2, B + AD + cD, can be found with the 
# 'submodels' function
submodels(str.2$DGS[2])$submodels

AlrikThiem/QCApro documentation built on May 5, 2019, 4:55 a.m.