# eQMC: Minimization with Enhanced Quine-McCluskey Algorithm

In QCApro: Advanced Functionality for Performing and Evaluating Qualitative Comparative Analysis

## Description

This function performs the minimization. Although it is called 'eQMC', the implemented algorithm is different from the classical Quine-McCluskey (QMC) algorithm. Instead of QMC's approach of using positive minterms and remainders to perform minimization, eQMC uses positive and negative minterms, but no remainders. See Dusa and Thiem (2015) and Thiem (2015) for more details.

## Usage

```r
eQMC(data, outcome = c(""), neg.out = FALSE, exo.facs = c(""),
     relation = "suf", n.cut = 1, incl.cut1 = 1, incl.cut0 = 1,
     minimize = c("1"), sol.type = "ps", row.dom = FALSE,
     min.dis = FALSE, omit = c(), dir.exp = c(), details = FALSE,
     show.cases = FALSE, inf.test = c(""), use.tilde = FALSE,
     use.letters = FALSE, ...)

is.qca(x)
```

## Arguments

| Argument | Description |
|---|---|
| `data` | A truth table object or a set of configurational data (of class 'matrix' or 'data.frame'). |
| `outcome` | A character vector of outcomes. |
| `neg.out` | Logical, use negation of `outcome` (ignored if `data` is a truth table object). |
| `exo.facs` | A character vector with the names of the exogenous factors. |
| `relation` | The required relation of a model antecedent to the `outcome`; either `"suf"` (only sufficiency required) or `"sufnec"` (both sufficiency and necessity required). |
| `n.cut` | The minimum number of cases with set membership score above 0.5 for an output function value of "0", "1" or "C"; an integer between 1 and the maximum number of cases for all non-remainder minterms. |
| `incl.cut1` | The minimum sufficiency inclusion score for an output function value of "1". |
| `incl.cut0` | The maximum sufficiency inclusion score for an output function value of "0". |
| `minimize` | A vector of output function values for which a solution is sought. |
| `sol.type` | A character scalar specifying the QCA solution type that should be applied; either "ps" (parsimonious solution), "ps+" (parsimonious solution including both positive and contradiction minterms), "cs" (conservative solution) or "cs+" (conservative solution including both positive and contradiction minterms). Note that only "ps" and "ps+" generate correct solutions. |
| `row.dom` | Logical, impose row dominance as a constraint on the solution to eliminate dominated inessential prime implicants. For causal data analysis, this argument must be set to `FALSE`. |
| `min.dis` | Logical, impose minimal disjunctivity as a constraint on the solution to eliminate models with more prime implicants than the model(s) with the fewest prime implicants. For causal data analysis, this argument must be set to `FALSE`. |
| `omit` | A vector of minterm index values or a matrix of minterms to be omitted from minimization. |
| `dir.exp` | A vector of directional expectations for deriving intermediate solutions; can only be used in conjunction with `sol.type = "ps"` or `sol.type = "ps+"`. Note that neither conservative nor intermediate solutions are correct. This argument is only retained for purposes of method evaluation. |
| `details` | Logical, present solution details (inclusion, raw coverage and unique coverage scores). |
| `show.cases` | Logical, also print case names as part of a solution's details; `details` must be set to `TRUE` (do not use this option with many cases and/or long case names). |
| `inf.test` | A vector of length two specifying the inference-statistical test to be performed (currently only `"binom"`) and the critical significance level. |
| `use.tilde` | Logical, use tilde operator ("~") for negation with bivalent (crisp-set and fuzzy-set) factors. |
| `use.letters` | Logical, use single letters (in alphabetical order) instead of original variable names. |
| `...` | Other arguments. |
| `x` | An object of class 'qca'. |

## Details

The argument `data` can be a truth table object (an object of class 'tt' returned by the `truthTable` function) or a suitable data set. Suitable data sets have the following structure: values of 0 and 1 for bivalent crisp-set factors, values between 0 and 1 for bivalent fuzzy-set factors, and values beginning with 0 at increments of 1 for multivalent crisp-set factors. The placeholders "-" and "dc" indicate "don't cares" in auxiliary factors that specify temporal order between other substantive factors in tQCA. These values lead to the exclusion of the auxiliary factor from the computation of parameters of fit.

The argument `outcome` specifies the outcome to be analyzed, either in curly-bracket notation (e.g., `O{value}`) if the outcome is from a multivalent (or a bivalent) factor, or in upper-case notation if the outcome is from a bivalent factor (e.g., `O` as a short-cut for `O{1}`). Outcomes from multivalent crisp-set factors always require curly-bracket notation. Outcomes can be single levels of factors not simultaneously passed to `exo.facs`, or levels from any subset of the factors specified in `exo.facs` if `data` is not a truth table object. At least one outcome has to be specified.
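The two notations can be illustrated with a small base-R sketch. The helper `parse_outcome` is hypothetical and not part of QCApro; it only shows how an outcome string maps onto a factor name and a level under the rules described above.

```r
# Hypothetical helper (not a QCApro function): split an outcome string
# into the factor name and the level being analyzed.
parse_outcome <- function(outcome) {
  if (grepl("\\{", outcome)) {
    # curly-bracket notation, e.g. "PB{1}"
    factor_name <- sub("\\{.*$", "", outcome)
    level <- as.integer(sub("^.*\\{(.*)\\}$", "\\1", outcome))
  } else {
    # upper-case short-cut for bivalent factors: "O" stands for "O{1}"
    factor_name <- outcome
    level <- 1L
  }
  list(factor = factor_name, level = level)
}

parse_outcome("PB{1}")  # multivalent (or bivalent) factor, explicit level
parse_outcome("WNP")    # bivalent factor, short-cut for "WNP{1}"
```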

If multiple outcomes are specified, their factors must also be specified in `exo.facs`. In this case, solution details will not be printed by default (see the example on mimicking Coincidence Analysis below).

The logical argument `neg.out` controls whether `outcome` is to be analyzed or its negation. If `outcome` is a level from a multivalent factor, `neg.out = TRUE` makes the disjunction of all remaining levels the outcome.
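As a minimal base-R sketch of this rule (it does not call QCApro, and the object names are illustrative), negating the level `E{3}` of a multivalent factor with levels 0 to 3 is the same as taking the disjunction `E{0} + E{1} + E{2}`:

```r
# Values of the multivalent factor E, as in the d.mmv example data
E <- c(3, 2, 3, 3, 0, 1, 3, 2)

# outcome "E{3}": membership is 1 where E equals 3
out_pos <- as.integer(E == 3)

# neg.out = TRUE: membership in the disjunction of all remaining levels
other_levels <- setdiff(sort(unique(E)), 3)
out_neg <- as.integer(E %in% other_levels)

# the disjunction of the remaining levels is the complement of E{3}
all(out_neg == 1 - out_pos)
```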

The argument `exo.facs` specifies the exogenous factors. If omitted, all factors in `data` are used except that of the `outcome`. With multiple outcomes, all factors in `data` are used. Please note that computation times may increase significantly beyond 17 exogenous factors, and that the computation of a solution may not be possible at all depending on end-user machine constraints.
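One way to get a feel for this limit is to count the logically possible minterms, which grow multiplicatively with every added factor. The sketch below is base R only; `n_minterms` is a hypothetical helper, and treating the size of the configuration space as the driver of computation time is an assumption, not a statement about eQMC's internals.

```r
# Hypothetical helper: number of logically possible minterms for a vector
# giving the number of levels of each exogenous factor.
n_minterms <- function(levels) prod(levels)

n_minterms(rep(2, 5))    # 5 bivalent factors
n_minterms(rep(2, 17))   # 17 bivalent factors
n_minterms(c(3, 2, 2))   # one trivalent and two bivalent factors

# all minterms of three bivalent factors, built with base R
expand.grid(A = 0:1, B = 0:1, C = 0:1)
```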

The argument `relation` specifies the relation between the antecedent of a model and the outcome. It accepts either the value `"suf"` or `"sufnec"`. If `relation = "suf"` (default), only sufficiency is used as a criterion in identifying a model. If `relation = "sufnec"`, models must be sufficient and necessary for the outcome to be identified. The argument `incl.cut1` then acts as the cut-off for the sufficiency inclusion of a minterm as well as the necessity inclusion of the final model(s).
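The difference between the two settings can be sketched with the standard set-theoretic inclusion formulas. The functions `suf_incl` and `nec_incl` are illustrative helpers and not QCApro functions; that they mirror the package's internal scores is an assumption. The data are three columns of the `d.Bau` example.

```r
# Three factors from the d.Bau example data
U <- c(1, 1, 1, 1, 0, 0, 0, 0)
D <- c(1, 1, 0, 0, 1, 1, 0, 0)
L <- c(1, 1, 1, 1, 1, 1, 0, 0)

# Standard inclusion scores (illustrative helpers, not QCApro functions)
suf_incl <- function(x, y) sum(pmin(x, y)) / sum(x)  # x sufficient for y
nec_incl <- function(x, y) sum(pmin(x, y)) / sum(y)  # x necessary for y

# U alone is sufficient for L but not necessary
suf_incl(U, L)
nec_incl(U, L)

# the disjunction D + U is both sufficient and necessary for L
DU <- pmax(D, U)
suf_incl(DU, L)
nec_incl(DU, L)
```

With a cut-off of 1, `U` would pass under `relation = "suf"` but only `D + U` would pass under `relation = "sufnec"`.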

Minterms that contain fewer than `n.cut` cases with membership scores above 0.5 are coded as remainders (`OUT = "?"`). If the number of such cases is at least `n.cut`, minterms with an inclusion score of at least `incl.cut1` are coded positive (`OUT = "1"`), minterms with an inclusion score below `incl.cut1` but with at least `incl.cut0` are coded as a contradiction (`OUT = "C"`), and minterms with an inclusion score below `incl.cut0` are coded negative (`OUT = "0"`). If `incl.cut0` is not explicitly changed, it is set equal to `incl.cut1`.
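The coding rule above can be written as a short decision function. This is a sketch in base R; `code_out` is a hypothetical name, and the function takes the case count and inclusion score of a single minterm as given rather than computing them from data.

```r
# Hypothetical helper: code the output function value of one minterm.
#   n    : number of cases with membership above 0.5 in the minterm
#   incl : sufficiency inclusion score of the minterm
code_out <- function(n, incl, n.cut = 1, incl.cut1 = 1,
                     incl.cut0 = incl.cut1) {
  if (n < n.cut) return("?")           # too few cases: remainder
  if (incl >= incl.cut1) return("1")   # positive minterm
  if (incl >= incl.cut0) return("C")   # contradiction
  "0"                                  # negative minterm
}

code_out(n = 0, incl = NA)                                     # remainder
code_out(n = 3, incl = 0.95, incl.cut1 = 0.9, incl.cut0 = 0.4) # positive
code_out(n = 3, incl = 0.60, incl.cut1 = 0.9, incl.cut0 = 0.4) # contradiction
code_out(n = 3, incl = 0.20, incl.cut1 = 0.9, incl.cut0 = 0.4) # negative
```

Because `incl.cut0` defaults to `incl.cut1` in this sketch, no contradictions arise unless the lower cut-off is set explicitly.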

The argument `minimize` specifies a vector of suitable values of the output function for which a solution is sought. Vectors of such values are `"1"` (default; positive minterms), `"C"` (contradictions), `"0"` (negative minterms), `c("1", "C")` and `c("0", "C")`, but not `c("1", "0")` and `c("1", "0", "C")`. Note that for `"0"`, `"C"` and `c("0", "C")`, the respective minterms will be processed but no solution details will be printed. Also note that `minimize = "0"` is not the same as using `neg.out = TRUE`.

The argument `sol.type` specifies the QCA solution type that should be generated. It accepts either `"ps"` (default, parsimonious solution), `"ps+"` (parsimonious solution including both positive minterms and contradictions), `"cs"` (conservative solution) or `"cs+"` (conservative solution including both positive minterms and contradictions). As only the parsimonious search strategy generates methodologically correct solutions (Baumgartner and Thiem 2017a), `sol.type` should not normally be changed to generate conservative or intermediate solutions.

The logical argument `row.dom` controls whether the principle of row dominance is imposed as a constraint on the solution. An inessential prime implicant P dominates another Q if all configurations covered by Q are also covered by P, but they are not interchangeable (cf. McCluskey 1956, 1425; McCluskey 1965, 164-152). If row dominance is operative, models that contain dominated prime implicants will not be returned. For purposes of causal data analysis, `row.dom` must be set to `FALSE`.
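The dominance relation can be sketched on a toy prime implicant chart. The chart and the helper `dominates` are made up for illustration; the sketch compares all rows of the chart and, as a simplification, does not first separate essential from inessential prime implicants.

```r
# Toy PI chart: rows are prime implicants, columns are positive minterms;
# TRUE means the prime implicant covers the minterm.
chart <- rbind(
  P = c(TRUE,  TRUE,  TRUE,  FALSE),
  Q = c(TRUE,  TRUE,  FALSE, FALSE),
  R = c(FALSE, FALSE, TRUE,  TRUE)
)
colnames(chart) <- paste0("m", 1:4)

# p dominates q if p covers everything q covers and the two rows differ
dominates <- function(p, q) all(p[q]) && !all(p == q)

# flag every prime implicant that is dominated by some other one
dominated <- sapply(rownames(chart), function(q) {
  others <- setdiff(rownames(chart), q)
  any(sapply(others, function(p) dominates(chart[p, ], chart[q, ])))
})
dominated
```

In this chart only `Q` is flagged, because `P` covers both minterms that `Q` covers plus one more.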

The logical argument `min.dis` controls whether the principle of minimal disjunctivity is imposed as a constraint on the solution (McCluskey 1965, 123-126). If minimal disjunctivity is operative, models that contain more than the number of prime implicants of the model(s) with the fewest prime implicants will not be returned. For purposes of causal data analysis, both `row.dom` and `min.dis` must be set to `FALSE` (Baumgartner and Thiem 2017b; Thiem 2014b).
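The effect of minimal disjunctivity can be sketched as a filter on a list of models. The models below are invented for illustration and the sketch uses base R only; it is not how eQMC stores its solutions.

```r
# Hypothetical models, each given as a vector of prime implicants
models <- list(
  c("A*b", "C"),
  c("A*b", "B*c", "a*C"),
  c("a*D", "C")
)

# number of prime implicants per model
n_pis <- lengths(models)

# keep only the models with the fewest prime implicants
kept <- models[n_pis == min(n_pis)]
kept
```

The three-implicant model is dropped here, which is exactly the kind of information loss the requirement `min.dis = FALSE` is meant to avoid in causal data analysis.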

The argument `omit` can be used to omit minterms from the minimization process ex ante. It accepts a vector of row numbers from the truth table or a matrix of minterms of the same order as passed to the `truthTable` function (if the argument `data` is a truth table object) or as specified in the argument `exo.facs`.

Neither the conservative nor the intermediate search strategy of QCA produces correct solutions (Baumgartner and Thiem 2017a). The `dir.exp` argument is retained only for purposes of method evaluation in relation to intermediate solutions. It specifies directional expectations for separating easy from difficult counterfactuals in simplifying assumptions. For bivalent crisp-set and fuzzy-set factors, expectations should be specified as a vector of the same length and in the same order as the factors provided in `exo.facs`. For bivalent factors, a value of either "0" or "1" indicates that the corresponding level of the factor is expected to contribute to a positive output function value, while a dash, "-", indicates that either level may do so. For multivalent factors, multiple levels have to be enclosed by double quotes and separated by a semicolon (see mvQCA example using Hartmann and Kemmerzell (2010) below). In some situations, directional expectations in mvQCA generate easy counterfactuals that do not contribute to parsimony (Thiem 2014a).

If `details = TRUE`, parameters of fit (inclusion, raw coverage, and unique coverage) will be printed for each solution and its respective prime implicants. Essential prime implicants are listed first in the solution output and in the top part of the parameters-of-fit table. Inessential prime implicants are listed in brackets in the solution output and in the middle part of the parameters-of-fit table, together with their unique coverage scores under each individual model. Inclusion and coverage scores for each model are provided in the bottom part of the parameters-of-fit table.
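How raw and unique coverage relate can be sketched with the standard fuzzy-set formulas. The membership scores below are invented, and the assumption that these formulas match what the package prints is not confirmed by this page.

```r
# Invented membership scores of two prime implicants and an outcome
pims <- data.frame(P1 = c(0.9, 0.2, 0.1, 0.7),
                   P2 = c(0.1, 0.8, 0.3, 0.6))
y <- c(1.0, 0.9, 0.2, 0.8)

# raw coverage: share of the outcome covered by each prime implicant
cov_raw <- sapply(pims, function(x) sum(pmin(x, y)) / sum(y))

# membership in the model is the maximum across its prime implicants
union_of <- function(df) do.call(pmax, unname(as.list(df)))
cov_model <- sum(pmin(union_of(pims), y)) / sum(y)

# unique coverage: model coverage lost when one prime implicant is dropped
cov_unique <- sapply(names(pims), function(p) {
  rest <- pims[, setdiff(names(pims), p), drop = FALSE]
  cov_model - sum(pmin(union_of(rest), y)) / sum(y)
})

round(rbind(cov_raw, cov_unique), 3)
round(cov_model, 3)
```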

The logical argument `show.cases` controls whether case names are displayed next to their corresponding prime implicants (do not use with many cases and/or long case names!). In the parameters-of-fit table, semicolons separate cases from different minterms, whereas commas separate cases from the same minterm.

The argument `inf.test` provides functionality for basing output function value codings on inference-statistical tests. Currently, only an exact binomial test (`"binom"`) is available, which requires the data to contain only bivalent or multivalent crisp-set factors. The argument requires a vector of length two, comprising the test and a critical significance level. If the empirical inclusion score of a minterm is not significantly lower than `incl.cut1`, it will be coded positive (`OUT = "1"`). If it is significantly lower than `incl.cut1` yet still significantly higher than `incl.cut0`, it will be coded as a contradiction (`OUT = "C"`). If it is not significantly higher than `incl.cut0`, it will be coded negative (`OUT = "0"`).
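The three-way decision can be sketched with base R's `binom.test`. The helper `code_out_binom` is hypothetical; the use of one-sided tests and the order in which the two tests are applied are assumptions made for illustration and may differ from the package's implementation.

```r
# Hypothetical helper: code one crisp-set minterm from an exact binomial test.
#   k : number of cases in the minterm that show the outcome
#   n : number of cases in the minterm
code_out_binom <- function(k, n, incl.cut1, incl.cut0 = incl.cut1,
                           alpha = 0.05) {
  # is the inclusion score significantly lower than incl.cut1?
  p_lower <- binom.test(k, n, p = incl.cut1, alternative = "less")$p.value
  if (p_lower >= alpha) return("1")
  # is it still significantly higher than incl.cut0?
  p_higher <- binom.test(k, n, p = incl.cut0,
                         alternative = "greater")$p.value
  if (p_higher < alpha) return("C")
  "0"
}

code_out_binom(19, 20,  incl.cut1 = 0.8, incl.cut0 = 0.5)  # positive
code_out_binom(65, 100, incl.cut1 = 0.8, incl.cut0 = 0.5)  # contradiction
code_out_binom(2,  20,  incl.cut1 = 0.8, incl.cut0 = 0.5)  # negative
```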

The argument `use.tilde` should only be used for bivalent factors. If the exogenous factors are already named with single letters, the argument `use.letters` will have no effect when set to `TRUE`. Otherwise, upper-case letters will replace original factor names in alphabetical order.

## Value

An object of class 'qca' for single outcomes and 'mqca' for multiple outcomes. Objects of class 'qca' are lists with the following ten main components:

| Component | Description |
|---|---|
| `tt` | The truth table object. |
| `excluded` | The line numbers of the negative minterms. |
| `initials` | The positive (non-remainder) minterms. |
| `PIs` | The prime implicants. |
| `PIchart` | The list of prime implicant charts. |
| `solution` | The list of solutions. |
| `essential` | The list of essential prime implicants. |
| `pims` | The list of model prime implicant set membership scores. |
| `SA` | The list of simplifying assumptions that would have been used by Quine-McCluskey minimization. |
| `i.sol` | A list of components specific to intermediate solution(s), including the prime implicant chart, model prime implicant membership scores, (non-simplifying) easy counterfactuals and difficult counterfactuals. |

## Contributors

| Contributor | Role |
|---|---|
| Dusa, Adrian | development, programming |
| Thiem, Alrik | development, documentation, testing |

## Author(s)

Alrik Thiem

## References

Baumgartner, Michael. 2009. “Inferring Causal Complexity.” Sociological Methods & Research 38 (1):71-101. DOI: 10.1177/0049124109339369.

Baumgartner, Michael, and Alrik Thiem. 2017a. “Often Trusted but Never (Properly) Tested: Evaluating Qualitative Comparative Analysis.” Sociological Methods & Research. Advance online publication. DOI: 10.1177/0049124117701487.

Baumgartner, Michael, and Alrik Thiem. 2017b. “Model Ambiguities in Configurational Comparative Research.” Sociological Methods & Research 46 (4):954-87. DOI: 10.1177/0049124115610351.

Dusa, Adrian, and Alrik Thiem. 2015. “Enhancing the Minimization of Boolean and Multivalue Output Functions with eQMC.” Journal of Mathematical Sociology 39 (2):92-108. DOI: 10.1080/0022250X.2014.897949.

Emmenegger, Patrick. 2011. “Job Security Regulations in Western Democracies: A Fuzzy Set Analysis.” European Journal of Political Research 50 (3):336-64. DOI: 10.1111/j.1475-6765.2010.01933.x.

Hartmann, Christof, and Joerg Kemmerzell. 2010. “Understanding Variations in Party Bans in Africa.” Democratization 17 (4):642-65. DOI: 10.1080/13510347.2010.491189.

McCluskey, Edward J. 1956. “Minimization of Boolean Functions.” Bell Systems Technical Journal 35 (6):1417-44. DOI: 10.1002/j.1538-7305.1956.tb03835.x.

McCluskey, Edward J. 1965. Introduction to the Theory of Switching Circuits. Princeton: Princeton University Press.

Krook, Mona Lena. 2010. “Women's Representation in Parliament: A Qualitative Comparative Analysis.” Political Studies 58 (5):886-908. DOI: 10.1111/j.1467-9248.2010.00833.x.

Ragin, Charles C. 2008. Redesigning Social Inquiry: Fuzzy Sets and Beyond. Chicago: University of Chicago Press.

Schneider, Carsten Q., and Claudius Wagemann. 2012. Set-Theoretic Methods for the Social Sciences: A Guide to Qualitative Comparative Analysis (QCA). Cambridge: Cambridge University Press.

Thiem, Alrik. 2014a. “Parameters of Fit and Intermediate Solutions in Multi-Value Qualitative Comparative Analysis.” Quality & Quantity 49 (2):657-74. DOI: 10.1007/s11135-014-0015-x.

Thiem, Alrik. 2014b. “Navigating the Complexities of Qualitative Comparative Analysis: Case Numbers, Necessity Relations, and Model Ambiguities.” Evaluation Review 38 (6):487-513. DOI: 10.1177/0193841x14550863.

Thiem, Alrik. 2015. “Using Qualitative Comparative Analysis for Identifying Causal Chains in Configurational Data: A Methodological Commentary on Baumgartner and Epple (2014).” Sociological Methods & Research 44 (4):723-36. DOI: 10.1177/0049124115589032.

## See Also

`pof`, `truthTable`

## Examples

```r
# csQCA using Krook (2010)
#-------------------------
data(d.represent)
head(d.represent)

# solution with details and case names
KRO <- eQMC(d.represent, outcome = "WNP", details = TRUE, show.cases = TRUE)
KRO

# check PI chart
KRO$PIchart

# solution with truth table object
KRO.tt <- truthTable(d.represent, outcome = "WNP")
KRO <- eQMC(KRO.tt)
KRO

# simplifying assumptions (SAs) that would have been used with Quine-McCluskey
# optimization
KRO$SA


# fsQCA using Emmenegger (2011)
#------------------------------
data(d.jobsecurity)
head(d.jobsecurity)

# solution with details
EMM <- eQMC(d.jobsecurity, outcome = "JSR", incl.cut1 = 0.9, details = TRUE)
EMM

# are the model prime implicants also sufficient for the negation of the outcome?
pof(EMM$pims, outcome = "JSR", d.jobsecurity, neg.out = TRUE, relation = "suf")

# are the negations of the model prime implicants also sufficient for the outcome?
pof(1 - EMM$pims, outcome = "JSR", d.jobsecurity, relation = "suf")

# plot all three prime implicants of the solution
PIsc <- EMM$pims

par(mfrow = c(1, 3))
for(i in 1:3){
  plot(PIsc[, i], d.jobsecurity$JSR, pch = 19, ylab = "JSR",
       xlab = names(PIsc)[i], xlim = c(0, 1), ylim = c(0, 1),
       main = paste("Prime Implicant", print(i)))
  mtext(paste(
    "Inclusion = ", round(EMM$IC$overall$incl.cov$incl[i], 3),
    "; Coverage = ", round(EMM$IC$overall$incl.cov$cov.r[i], 3)),
    cex = 0.7, line = 0.4)
  abline(h = 0.5, lty = 2, col = gray(0.5))
  abline(v = 0.5, lty = 2, col = gray(0.5))
  abline(0, 1)
}


# mvQCA using Hartmann and Kemmerzell (2010)
#-------------------------------------------
data(d.partybans)
head(d.partybans)

# specify exogenous factors beforehand
exo.facs <- c("C", "F", "T", "V")

# parsimonious solution with contradictions included
HK.sol <- eQMC(d.partybans, outcome = "PB{1}", exo.facs = exo.facs,
  incl.cut0 = 0.4, sol.type = "ps+", details = TRUE)
HK.sol

# which are the two countries in T{2} but not PB{1}?
rownames(d.partybans[d.partybans$T == 2 & d.partybans$PB != 1, ])


# QCA with multiple outcomes from multivalent variables
#------------------------------------------------------
d.mmv <- data.frame(A = c(2,0,0,1,1,1,2,2), B = c(2,2,2,2,1,1,0,0),
                    C = c(0,1,0,0,0,2,1,0), D = c(2,1,2,2,3,1,3,0),
                    E = c(3,2,3,3,0,1,3,2), row.names = letters[1:8])
head(d.mmv)

mmv.s <- eQMC(d.mmv, outcome = c("D{2}", "E{3}"))
mmv.s

# use quotes with curly-bracket notation to access solution component
print(mmv.s$"E{3}", details = TRUE, show.cases = TRUE)

# negation of outcome from multivalent factor is disjunction of all other
# levels; high under-determination (18 models)
mmv.s <- eQMC(d.mmv, outcome = "E{3}", neg.out = TRUE)
mmv.s


# causal chains with QCA (Thiem 2015); data from Baumgartner (2009)
#-----------------------------------------------------------------------------
d.Bau <- data.frame(
  U = c(1,1,1,1,0,0,0,0), D = c(1,1,0,0,1,1,0,0),
  L = c(1,1,1,1,1,1,0,0), G = c(1,0,1,0,1,0,1,0),
  E = c(1,1,1,1,1,1,1,0), row.names = letters[1:8])
head(d.Bau)

# with multiple outcomes, no solution details are printed;
# "causal-chain structure": (D + U <=> L) * (G + L <=> E)
# "common-cause structure": (D + U <=> L) * (G + D + U <=> E)
Bau.cna <- eQMC(d.Bau, outcome = names(d.Bau), relation = "sufnec")
Bau.cna

# get the truth table, solution details and case names for outcome "E"
print(Bau.cna$E, details = TRUE, show.cases = TRUE)


# examples relating to QCA method evaluation
#-------------------------------------------
#
# is the conservative solution (QCA-CS) really "conservative"?
#-------------------------------------------------------------
# Ragin (2008, 173): "The complex [conservative] solution [...] does not
# permit any counterfactual cases and thus no simplifying assumptions
# regarding combinations of conditions that do not exist in the data.";
# the conservative solution is "[c]onservative because [...] the
# researcher [...] is exclusively guided by the empirical information
# at hand" (Schneider and Wagemann 2012, 162)
#
# in fact, QCA-CS makes extremely strong assumptions on ALL remainders;
# QCA-CS assumes every remainder exists at least 'freq.cut' times,
# and occurs with the negation of the outcome more than
# 'freq.cut' * (1 - 'incl.cut1') times

# create a test data-set 'CS' with 32 cases and randomly assign values
# on the endogenous factor 'Z'
CS <- data.frame(mintermMatrix(rep(2,5)))
CS$Z <- sample(0:1, 2^5, replace = TRUE)

# randomly draw 20 cases to create a limitedly diverse data-set 'CS.LD'
# and turn all 12 remainder minterms into observations that occur with
# 'Z = 0' in original data-set 'CS'
CS.LD <- CS[sample(1:2^5, 20), ]
change <- as.numeric(setdiff(rownames(CS), rownames(CS.LD)))
CS$Z[change] <- 0

# create the (conservative) solutions for 'CS' and 'CS.LD'
CS.sol <- eQMC(CS, outcome = "Z")
CS.LD.sol <- eQMC(CS.LD, outcome = "Z", sol.type = "cs")

# test whether the two solutions are identical
identical(unlist(CS.LD.sol$solution), unlist(CS.sol$solution))

# both solutions are identical, for two datasets that do not allow the same
# causal inferences to be made; this indicates that QCA-CS draws causal inferences
# beyond what the data warrants; the lower the diversity index (ratio of non-remainder
# minterms to all minterms), the stronger the assumptions QCA-CS makes
```