eQMC: Minimization with Enhanced Quine-McCluskey Algorithm


View source: R/eQMC.R

Description

This function minimizes Boolean and multivalent output functions. Although it is called 'eQMC', the implemented algorithm differs from the classical Quine-McCluskey (QMC) algorithm. Whereas QMC uses positive minterms and remainders to perform minimization, eQMC uses positive and negative minterms, but no remainders. See Dusa and Thiem (2015) and Thiem (2015) for more details.

Usage

eQMC(data, outcome = c(""), neg.out = FALSE, exo.facs = c(""), 
     relation = "suf", n.cut = 1, incl.cut1 = 1, incl.cut0 = 1, 
     minimize = c("1"), sol.type = "ps", row.dom = FALSE, 
     min.dis = FALSE, omit = c(), dir.exp = c(), details = FALSE,
     show.cases = FALSE, inf.test = c(""), use.tilde = FALSE, 
     use.letters = FALSE, ...)

is.qca(x)

Arguments

data

A truth table object or a set of configurational data (of class 'matrix' or 'data.frame').

outcome

A character vector of outcomes.

neg.out

Logical, use negation of outcome (ignored if data is a truth table object).

exo.facs

A character vector with the names of the exogenous factors.

relation

The required relation of a model antecedent to the outcome; either "suf" (only sufficiency required) or "sufnec" (both sufficiency and necessity required).

n.cut

The minimum number of cases with set membership score above 0.5 for an output function value of "0", "1" or "C"; an integer between 1 and the maximum number of cases for all non-remainder minterms.

incl.cut1

The minimum sufficiency inclusion score for an output function value of "1".

incl.cut0

The maximum sufficiency inclusion score for an output function value of "0".

minimize

A vector of output function values for which a solution is sought.

sol.type

A character scalar specifying the QCA solution type that should be applied; either "ps" (parsimonious solution), "ps+" (parsimonious solution including both positive and contradiction minterms), "cs" (conservative solution) or "cs+" (conservative solution including both positive and contradiction minterms). Note that only "ps" and "ps+" generate correct solutions.

row.dom

Logical, impose row dominance as a constraint on the solution to eliminate dominated inessential prime implicants. For causal data analysis, this argument must be set to FALSE.

min.dis

Logical, impose minimal disjunctivity as a constraint on the solution to eliminate models with more prime implicants than the model(s) with the fewest prime implicants. For causal data analysis, this argument must be set to FALSE.

omit

A vector of minterm index values or a matrix of minterms to be omitted from minimization.

dir.exp

A vector of directional expectations for deriving intermediate solutions; can only be used in conjunction with sol.type = "ps" or sol.type = "ps+". Note that neither conservative nor intermediate solutions produce correct solutions. This argument is only retained for purposes of method evaluation.

details

Logical, present solution details (inclusion, raw coverage and unique coverage scores).

show.cases

Logical, also print case names as part of a solution's details; details must be set to TRUE (do not use this option with many cases and/or long case names).

inf.test

A vector of length two specifying the inference-statistical test to be performed (currently only "binom") and the critical significance level.

use.tilde

Logical, use tilde operator ("~") for negation with bivalent (crisp-set and fuzzy-set) factors.

use.letters

Logical, use single letters (in alphabetical order) instead of original variable names.

...

Other arguments.

x

An object of class 'qca'.

Details

The argument data can be a truth table object (an object of class 'tt' returned by the truthTable function) or a suitable data set. Suitable data sets have the following structure: values of 0 and 1 for bivalent crisp-set factors, values between 0 and 1 for bivalent fuzzy-set factors, and values beginning with 0 at increments of 1 for multivalent crisp-set factors. The placeholders "-" and "dc" indicate "don't cares" in auxiliary factors that specify temporal order between other substantive factors in tQCA. These values lead to the exclusion of the auxiliary factor from the computation of parameters of fit.

The argument outcome specifies the outcome to be analyzed, either in curly-bracket notation (e.g., O{value}) if the outcome is from a multivalent (or a bivalent) factor, or in upper-case notation if the outcome is from a bivalent factor (e.g., O as a short-cut for O{1}). Outcomes from multivalent crisp-set factors always require curly-bracket notation. Outcomes can be single levels of factors not simultaneously passed to exo.facs, or levels from any subset of the factors specified in exo.facs if data is not a truth table object. At least one outcome has to be specified.
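The two notations can be illustrated with a small base-R sketch. The helper parse_outcome is hypothetical and not part of QCApro; it only shows how a string such as "PB{1}" separates into a factor name and a level, and how an upper-case name such as "WNP" is read as a short-cut for level 1.

```r
# Hypothetical helper (not part of QCApro): split outcome strings into
# factor name and level.
parse_outcome <- function(outcome) {
  pattern <- "\\{[0-9]+\\}$"
  has_braces <- grepl(pattern, outcome)
  factor_name <- sub(pattern, "", outcome)
  # upper-case notation without braces is read as a short-cut for level 1
  level <- rep(1L, length(outcome))
  level[has_braces] <- as.integer(
    sub("^.*\\{([0-9]+)\\}$", "\\1", outcome[has_braces])
  )
  data.frame(factor = factor_name, level = level, stringsAsFactors = FALSE)
}

parsed <- parse_outcome(c("PB{1}", "E{3}", "WNP"))
parsed
```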

If multiple outcomes are specified, their factors must also be specified in exo.facs. In this case, solution details will not be printed by default (see the example on mimicking Coincidence Analysis below).

The logical argument neg.out controls whether outcome is to be analyzed or its negation. If outcome is a level from a multivalent factor, neg.out = TRUE makes the disjunction of all remaining levels the outcome.
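As a minimal base-R illustration of this rule, the sketch below uses the values of factor E from the d.mmv data in the Examples section. It does not call eQMC; it only shows that the negation of E{3} coincides with the disjunction of the remaining levels E{0}, E{1} and E{2}.

```r
# values of factor E from the d.mmv example data
E <- c(3, 2, 3, 3, 0, 1, 3, 2)

pos <- as.integer(E == 3)   # membership in the outcome E{3}
neg <- as.integer(E != 3)   # membership in the negated outcome

# disjunction of all remaining levels of E
disj <- as.integer(E == 0 | E == 1 | E == 2)

identical(neg, disj)
```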

The argument exo.facs specifies the exogenous factors. If omitted, all factors in data are used except that of the outcome. With multiple outcomes, all factors in data are used. Please note that computation times may increase significantly beyond 17 exogenous factors, and that the computation of a solution may not be possible at all depending on end-user machine constraints.

The argument relation specifies the relation between the antecedent of a model and the outcome. It accepts either the value "suf" or "sufnec". If relation = "suf" (default), only sufficiency is used as a criterion in identifying a model. If relation = "sufnec", models must be sufficient and necessary for the outcome to be identified. The argument incl.cut1 then acts as the cut-off for the sufficiency inclusion of a minterm as well as the necessity inclusion of the final model(s).
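The difference between the two relations can be sketched with the standard set-theoretic inclusion formulas (cf. Ragin 2008). The functions incl_suf and incl_nec are hypothetical names, and the assumption that QCApro computes inclusion in exactly this way is not taken from the package source. The data are three columns of the Baumgartner (2009) example used in the Examples section.

```r
# factors U, D and L from the d.Bau example data
U <- c(1, 1, 1, 1, 0, 0, 0, 0)
D <- c(1, 1, 0, 0, 1, 1, 0, 0)
L <- c(1, 1, 1, 1, 1, 1, 0, 0)

# standard inclusion scores (assumed, hypothetical helper names)
incl_suf <- function(x, y) sum(pmin(x, y)) / sum(x)  # x sufficient for y
incl_nec <- function(x, y) sum(pmin(x, y)) / sum(y)  # x necessary for y

# D alone is sufficient for L, but not necessary
suf_D <- incl_suf(D, L)
nec_D <- incl_nec(D, L)

# the disjunction D + U is both sufficient and necessary for L
DU <- pmax(D, U)
suf_DU <- incl_suf(DU, L)
nec_DU <- incl_nec(DU, L)
```

With relation = "suf", an antecedent such as D would qualify at incl.cut1 = 1, whereas relation = "sufnec" additionally requires the necessity inclusion of the model to reach incl.cut1, which in this sketch only D + U achieves.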

Minterms that contain fewer than n.cut cases with membership scores above 0.5 are coded as remainders (OUT = "?"). If the number of such cases is at least n.cut, minterms with an inclusion score of at least incl.cut1 are coded positive (OUT = "1"), minterms with an inclusion score below incl.cut1 but of at least incl.cut0 are coded as contradictions (OUT = "C"), and minterms with an inclusion score below incl.cut0 are coded negative (OUT = "0"). If incl.cut0 is not explicitly changed, it is set equal to incl.cut1.
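The coding rule can be written down as a short base-R sketch. The function code_out is hypothetical and only mirrors the rule as described in this paragraph; it is not the package's internal implementation. Its arguments n (number of cases with membership above 0.5) and incl (sufficiency inclusion score) are assumed inputs.

```r
# Hypothetical helper (not part of QCApro): code output function values
# from case counts and inclusion scores.
code_out <- function(n, incl, n.cut = 1, incl.cut1 = 1,
                     incl.cut0 = incl.cut1) {
  out <- rep("?", length(n))          # default: remainder
  enough <- n >= n.cut                # minterm has enough cases
  out[enough & incl >= incl.cut1] <- "1"
  out[enough & incl < incl.cut1 & incl >= incl.cut0] <- "C"
  out[enough & incl < incl.cut0] <- "0"
  out
}

# four minterms: no cases, high, intermediate and low inclusion
OUT <- code_out(n = c(0, 3, 4, 5), incl = c(NA, 0.95, 0.6, 0.2),
                incl.cut1 = 0.9, incl.cut0 = 0.4)
OUT
```

Because incl.cut0 defaults to incl.cut1 in this sketch, no minterm is coded "C" unless incl.cut0 is lowered explicitly.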

The argument minimize specifies a vector of suitable values of the output function for which a solution is sought. Vectors of such values are "1" (default; positive minterms), "C" (contradictions), "0" (negative minterms), c("1", "C") and c("0", "C"), but not c("1", "0") and c("1", "0", "C"). Note that for "0", "C" and c("0", "C"), the respective minterms will be processed but no solution details will be printed. Also note that minimize = "0" is not the same as using neg.out = TRUE.

The argument sol.type specifies the QCA solution type that should be generated. It accepts either "ps" (default, parsimonious solution), "ps+" (parsimonious solution including both positive minterms and contradictions), "cs" (conservative solution) or "cs+" (conservative solution including both positive minterms and contradictions). As only the parsimonious search strategy generates methodologically correct solutions (Baumgartner and Thiem 2017a), sol.type should not normally be changed to generate conservative or intermediate solutions.

The logical argument row.dom controls whether the principle of row dominance is imposed as a constraint on the solution. An inessential prime implicant P dominates another Q if all configurations covered by Q are also covered by P, but they are not interchangeable (cf. McCluskey 1956, 1425; McCluskey 1965, 164-152). If row dominance is operative, models that contain dominated prime implicants will not be returned. For purposes of causal data analysis, row.dom must be set to FALSE.
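The dominance relation can be sketched on a toy prime implicant chart in base R. The chart layout (prime implicants in rows, positive minterms in columns), the names P, Q, R and m1 to m4, and the helper dominates are assumptions made for illustration. The sketch also ignores the distinction between essential and inessential prime implicants.

```r
# toy PI chart: TRUE means the prime implicant covers the minterm
chart <- rbind(
  P = c(TRUE,  TRUE,  TRUE,  FALSE),
  Q = c(TRUE,  TRUE,  FALSE, FALSE),
  R = c(FALSE, FALSE, TRUE,  TRUE)
)
colnames(chart) <- paste0("m", 1:4)

# p dominates q if p covers everything q covers and the two rows differ
dominates <- function(chart, p, q) {
  all(chart[q, ] <= chart[p, ]) && any(chart[p, ] != chart[q, ])
}

# flag every prime implicant that is dominated by some other one
dominated <- sapply(rownames(chart), function(q) {
  others <- setdiff(rownames(chart), q)
  any(sapply(others, function(p) dominates(chart, p, q)))
})
dominated
```

Here only Q is flagged, since P covers m1 and m2 as well as m3. Under row dominance, models containing Q would be dropped.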

The logical argument min.dis controls whether the principle of minimal disjunctivity is imposed as a constraint on the solution (McCluskey 1965, 123-126). If minimal disjunctivity is operative, models that contain more than the number of prime implicants of the model(s) with the fewest prime implicants will not be returned. For purposes of causal data analysis, both row.dom and min.dis must be set to FALSE (Baumgartner and Thiem 2017b; Thiem 2014b).
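The effect of minimal disjunctivity can be sketched as a simple filter over candidate models. The three models below are invented for illustration and do not come from any data set in this documentation.

```r
# hypothetical candidate models, each a vector of prime implicants
models <- list(
  c("A*b", "C"),
  c("A*b", "D", "e*F"),
  c("B", "C")
)

# keep only the models with the fewest prime implicants
n_pis <- lengths(models)
kept <- models[n_pis == min(n_pis)]
kept
```

The three-implicant model is discarded under this constraint, which is why the constraint hides model ambiguities that matter for causal data analysis.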

The argument omit can be used to omit minterms from the minimization process ex ante. It accepts a vector of row numbers from the truth table or a matrix of minterms of the same order as passed to the truthTable function (if the argument data is a truth table object) or as specified in the argument exo.facs.

Neither the conservative nor the intermediate search strategy of QCA produces correct solutions (Baumgartner and Thiem 2017a). The dir.exp argument is retained only for purposes of method evaluation in relation to intermediate solutions. It specifies directional expectations for separating easy from difficult counterfactuals in simplifying assumptions. Expectations should be specified as a vector of the same length and in the same order as the factors provided in exo.facs. For bivalent crisp-set and fuzzy-set factors, a value of either "0" or "1" indicates that the corresponding factor is expected to contribute to a positive output function value, while a dash, "-", indicates that one or the other level of the corresponding factor does so. For multivalent factors, multiple levels have to be enclosed by double quotes and separated by a semicolon (see mvQCA example using Hartmann and Kemmerzell (2010) below). In some situations, directional expectations in mvQCA generate easy counterfactuals that do not contribute to parsimony (Thiem 2014a).

If details = TRUE, parameters of fit (inclusion, raw coverage, and unique coverage) will be printed for each solution and its respective prime implicants. Essential prime implicants are listed first in the solution output and in the top part of the parameters-of-fit table. Inessential prime implicants are listed in brackets in the solution output and in the middle part of the parameters-of-fit table, together with their unique coverage scores under each individual model. Inclusion and coverage scores for each model are provided in the bottom part of the parameters-of-fit table.

The logical argument show.cases controls whether case names are displayed next to their corresponding prime implicants (do not use with many cases and/or long case names!). In the parameters-of-fit table, semicolons separate cases from different minterms, whereas commas separate cases from the same minterm.
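The separator convention can be unpacked with base R. The case string below is invented, and the exact spacing of the printed output is an assumption; the sketch only shows how semicolons and commas map to minterms and cases.

```r
# hypothetical case string as it might appear next to a prime implicant
cases <- "AT,BE; DK; FI,SE,NO"

# semicolons separate minterms, commas separate cases within a minterm
by_minterm <- strsplit(cases, ";", fixed = TRUE)[[1]]
by_case <- lapply(strsplit(by_minterm, ",", fixed = TRUE), trimws)
by_case
```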

The argument inf.test provides functionality for basing output function value codings on inference-statistical tests. Currently, only an exact binomial test ("binom") is available, which requires the data to contain only bivalent or multivalent crisp-set factors. The argument requires a vector of length two, comprising the test and a critical significance level. If the empirical inclusion score of a minterm is not significantly lower than incl.cut1, it will be coded positive (OUT = "1"). If it is significantly lower than incl.cut1 yet still significantly higher than incl.cut0, it will be coded as a contradiction (OUT = "C"). If it is not significantly higher than incl.cut0, it will be coded negative (OUT = "0").
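The decision rule described above can be sketched with one-sided exact binomial tests from base R. The function code_out_binom is hypothetical, and the way the two tests are set up (alternative = "less" against incl.cut1, alternative = "greater" against incl.cut0) is an assumption about how the rule could be implemented, not a description of the package internals. Here x is the number of cases in a minterm that show the outcome and n is the total number of cases in that minterm.

```r
# Hypothetical helper (not part of QCApro): code one crisp-set minterm
# on the basis of one-sided exact binomial tests.
code_out_binom <- function(x, n, incl.cut1, incl.cut0, alpha = 0.05) {
  # is the inclusion score x / n significantly lower than incl.cut1?
  p_lower <- binom.test(x, n, p = incl.cut1,
                        alternative = "less")$p.value
  # is the inclusion score x / n significantly higher than incl.cut0?
  p_higher <- binom.test(x, n, p = incl.cut0,
                         alternative = "greater")$p.value
  if (p_lower >= alpha) {
    "1"   # not significantly lower than incl.cut1
  } else if (p_higher < alpha) {
    "C"   # significantly lower than incl.cut1, higher than incl.cut0
  } else {
    "0"   # not significantly higher than incl.cut0
  }
}

out_high <- code_out_binom(45, 50, incl.cut1 = 0.8, incl.cut0 = 0.4)
out_mid  <- code_out_binom(30, 50, incl.cut1 = 0.8, incl.cut0 = 0.4)
out_low  <- code_out_binom(5, 50, incl.cut1 = 0.8, incl.cut0 = 0.4)
c(out_high, out_mid, out_low)
```

With few cases per minterm, the tests have little power, so a minterm is rarely found to be significantly lower than incl.cut1 and tends to be coded positive under this rule.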

The argument use.tilde should only be used for bivalent factors. If the exogenous factors are already named with single letters, the argument use.letters will have no effect when set to TRUE. Otherwise, upper-case letters will replace original factor names in alphabetical order.

Value

An object of class 'qca' for single outcomes and 'mqca' for multiple outcomes. Objects of class 'qca' are lists with the following ten main components:

tt

The truth table object.

excluded

The line numbers of the negative minterms.

initials

The positive (non-remainder) minterms.

PIs

The prime implicants.

PIchart

The list of prime implicant charts.

solution

The list of solutions.

essential

The list of essential prime implicants.

pims

The list of model prime implicant set membership scores.

SA

The list of simplifying assumptions that would have been used by Quine-McCluskey minimization.

i.sol

A list of components specific to intermediate solution(s), including the prime implicant chart, model prime implicant membership scores, (non-simplifying) easy counterfactuals and difficult counterfactuals.

Contributors

Dusa, Adrian: development, programming
Thiem, Alrik: development, documentation, testing

Author(s)

Alrik Thiem

References

Baumgartner, Michael. 2009. “Inferring Causal Complexity.” Sociological Methods & Research 38 (1):71-101. DOI: 10.1177/0049124109339369.

Baumgartner, Michael, and Alrik Thiem. 2017a. “Often Trusted but Never (Properly) Tested: Evaluating Qualitative Comparative Analysis.” Sociological Methods & Research. Advance online publication. DOI: 10.1177/0049124117701487.

Baumgartner, Michael, and Alrik Thiem. 2017b. “Model Ambiguities in Configurational Comparative Research.” Sociological Methods & Research 46 (4):954-87. DOI: 10.1177/0049124115610351.

Dusa, Adrian, and Alrik Thiem. 2015. “Enhancing the Minimization of Boolean and Multivalue Output Functions with eQMC.” Journal of Mathematical Sociology 39 (2):92-108. DOI: 10.1080/0022250X.2014.897949.

Emmenegger, Patrick. 2011. “Job Security Regulations in Western Democracies: A Fuzzy Set Analysis.” European Journal of Political Research 50 (3):336-64. DOI: 10.1111/j.1475-6765.2010.01933.x.

Hartmann, Christof, and Joerg Kemmerzell. 2010. “Understanding Variations in Party Bans in Africa.” Democratization 17 (4):642-65. DOI: 10.1080/13510347.2010.491189.

McCluskey, Edward J. 1956. “Minimization of Boolean Functions.” Bell System Technical Journal 35 (6):1417-44. DOI: 10.1002/j.1538-7305.1956.tb03835.x.

McCluskey, Edward J. 1965. Introduction to the Theory of Switching Circuits. Princeton: Princeton University Press.

Krook, Mona Lena. 2010. “Women's Representation in Parliament: A Qualitative Comparative Analysis.” Political Studies 58 (5):886-908. DOI: 10.1111/j.1467-9248.2010.00833.x.

Ragin, Charles C. 2008. Redesigning Social Inquiry: Fuzzy Sets and Beyond. Chicago: University of Chicago Press.

Schneider, Carsten Q., and Claudius Wagemann. 2012. Set-Theoretic Methods for the Social Sciences: A Guide to Qualitative Comparative Analysis (QCA). Cambridge: Cambridge University Press.

Thiem, Alrik. 2014a. “Parameters of Fit and Intermediate Solutions in Multi-Value Qualitative Comparative Analysis.” Quality & Quantity 49 (2):657-74. DOI: 10.1007/s11135-014-0015-x.

Thiem, Alrik. 2014b. “Navigating the Complexities of Qualitative Comparative Analysis: Case Numbers, Necessity Relations, and Model Ambiguities.” Evaluation Review 38 (6):487-513. DOI: 10.1177/0193841x14550863.

Thiem, Alrik. 2015. “Using Qualitative Comparative Analysis for Identifying Causal Chains in Configurational Data: A Methodological Commentary on Baumgartner and Epple (2014).” Sociological Methods & Research 44 (4):723-36. DOI: 10.1177/0049124115589032.

See Also

pof, truthTable

Examples

# csQCA using Krook (2010)
#-------------------------
data(d.represent)
head(d.represent)

# solution with details and case names
KRO <- eQMC(d.represent, outcome = "WNP", details = TRUE, show.cases = TRUE)
KRO

# check PI chart
KRO$PIchart

# solution with truth table object
KRO.tt <- truthTable(d.represent, outcome = "WNP")
KRO <- eQMC(KRO.tt)
KRO

# simplifying assumptions (SAs) that would have been used with Quine-McCluskey 
# optimization
KRO$SA

# fsQCA using Emmenegger (2011)
#------------------------------
data(d.jobsecurity)
head(d.jobsecurity)

# solution with details
EMM <- eQMC(d.jobsecurity, outcome = "JSR", incl.cut1 = 0.9, details = TRUE)
EMM

# are the model prime implicants also sufficient for the negation of the outcome?
pof(EMM$pims, outcome = "JSR", d.jobsecurity, neg.out = TRUE, relation = "suf")

# are the negations of the model prime implicants also sufficient for the outcome?
pof(1 - EMM$pims, outcome = "JSR", d.jobsecurity, relation = "suf")

# plot all three prime implicants of the solution
PIsc <- EMM$pims
par(mfrow = c(1, 3))
for(i in 1:3){
 plot(PIsc[, i], d.jobsecurity$JSR, pch = 19, ylab = "JSR",
  xlab = names(PIsc)[i], xlim = c(0, 1), ylim = c(0, 1),
  main = paste("Prime Implicant", i))
 mtext(paste(
  "Inclusion = ", round(EMM$IC$overall$incl.cov$incl[i], 3),
  "; Coverage = ", round(EMM$IC$overall$incl.cov$cov.r[i], 3)), 
  cex = 0.7, line = 0.4)
 abline(h = 0.5, lty = 2, col = gray(0.5))
 abline(v = 0.5, lty = 2, col = gray(0.5))
 abline(0, 1)
}

# mvQCA using Hartmann and Kemmerzell (2010)
#-------------------------------------------
data(d.partybans)
head(d.partybans)

# specify exogenous factors beforehand
exo.facs <- c("C", "F", "T", "V")

# parsimonious solution with contradictions included
HK.sol <- eQMC(d.partybans, outcome = "PB{1}", exo.facs = exo.facs,
  incl.cut0 = 0.4, sol.type = "ps+", details = TRUE)
HK.sol

# which are the two countries in T{2} but not PB{1}?
rownames(d.partybans[d.partybans$T == 2 & d.partybans$PB != 1, ])

# QCA with multiple outcomes from multivalent variables
#------------------------------------------------------
d.mmv <- data.frame(A = c(2,0,0,1,1,1,2,2), B = c(2,2,2,2,1,1,0,0), 
                    C = c(0,1,0,0,0,2,1,0), D = c(2,1,2,2,3,1,3,0), 
                    E = c(3,2,3,3,0,1,3,2), 
  row.names = letters[1:8])
head(d.mmv)

mmv.s <- eQMC(d.mmv, outcome = c("D{2}", "E{3}"))
mmv.s

# use quotes with curly-bracket notation to access solution component
print(mmv.s$"E{3}", details = TRUE, show.cases = TRUE)

# negation of outcome from multivalent factor is disjunction of all other
# levels; high under-determination (18 models)
mmv.s <- eQMC(d.mmv, outcome = "E{3}", neg.out = TRUE)
mmv.s

# causal chains with QCA (Thiem 2015); data from Baumgartner (2009)
#-----------------------------------------------------------------------------
d.Bau <- data.frame(
  U = c(1,1,1,1,0,0,0,0), D = c(1,1,0,0,1,1,0,0),
  L = c(1,1,1,1,1,1,0,0), G = c(1,0,1,0,1,0,1,0),
  E = c(1,1,1,1,1,1,1,0),
  row.names = letters[1:8])
head(d.Bau)

# with multiple outcomes, no solution details are printed;
# "causal-chain structure": (D + U <=> L) * (G + L <=> E)
# "common-cause structure": (D + U <=> L) * (G + D + U <=> E)
Bau.cna <- eQMC(d.Bau, outcome = names(d.Bau), relation = "sufnec")
Bau.cna

# get the truth table, solution details and case names for outcome "E"
print(Bau.cna$E, details = TRUE, show.cases = TRUE)

# examples relating to QCA method evaluation
#-------------------------------------------
#
# is the conservative solution (QCA-CS) really "conservative"?
#-------------------------------------------------------------
# Ragin (2008, 173): "The complex [conservative] solution [...] does not 
# permit any counterfactual cases and thus no simplifying assumptions 
# regarding combinations of conditions that do not exist in the data.";
# the conservative solution is "[c]onservative because [...] the
# researcher [...] is exclusively guided by the empirical information 
# at hand" (Schneider and Wagemann 2012, 162)
#
# in fact, QCA-CS makes extremely strong assumptions on ALL remainders;
# QCA-CS assumes every remainder exists at least 'n.cut' times, 
# and occurs with the negation of the outcome more than 
# 'n.cut' * (1 - 'incl.cut1') times

# create a test data-set 'CS' with 32 cases and randomly assign values 
# on the endogenous factor 'Z'
CS <- data.frame(mintermMatrix(rep(2,5)))
CS$Z <- sample(0:1, 2^5, replace = TRUE)

# randomly draw 20 cases to create a limitedly diverse data-set 'CS.LD'
# and turn all 12 remainder minterms into observations that occur with 
# 'Z = 0' in original data-set 'CS'
CS.LD <- CS[sample(1:2^5, 20), ]
change <- as.numeric(setdiff(rownames(CS), rownames(CS.LD)))
CS$Z[change] <- 0

# create the (conservative) solutions for 'CS' and 'CS.LD'
CS.sol <- eQMC(CS, outcome = "Z")
CS.LD.sol <- eQMC(CS.LD, outcome = "Z", sol.type = "cs")

# test whether the two solutions are identical
identical(unlist(CS.LD.sol$solution), unlist(CS.sol$solution))

# both solutions are identical, for two datasets that do not allow the same
# causal inferences to be made; this indicates that QCA-CS draws causal inferences 
# beyond what the data warrants; the lower the diversity index (ratio of non-remainder
# minterms to all minterms), the stronger the assumptions QCA-CS makes

QCApro documentation built on May 1, 2019, 10:09 p.m.
