# boolean3-package: Modeling Causal Complexity


### Description

Boolean binary response models are a family of partial-observability binary response models designed to permit researchers to model causal complexity, or multiple causal “paths” to a given outcome.

### Details

The package estimates Boolean binary response models (see Braumoeller 2003 for the derivation), a family of partial-observability n-variate models designed to let researchers model causal complexity, that is, multiple causal “paths” to a given outcome. Each path is modeled as a latent dependent variable, and the latent variables are multiplied together in a manner determined by the logic of their (Boolean) interaction.

Suppose, for example, that diet OR smoking causes heart failure. We would use one set of independent variables (caloric intake, fat intake, etc.) to predict the latent probability of diet-related coronary failure ($y_1^*$), another set (cigarettes smoked per day, exposure to second-hand smoke, etc.) to predict the latent probability of smoking-related coronary failure ($y_2^*$), and model the observed outcome ($y$, coronary failure) as a function of the Boolean interaction of the two: $\Pr(y=1) = 1 - ([1-y_1^*] \times [1-y_2^*])$.

Independent variables that affect both latent dependent variables can be included in both paths. Any combination of any number of ANDs and ORs can be modeled, though the procedure becomes exponentially more data-intensive as the number of interactions increases.
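As a rough illustration of the Boolean interaction described above, the sketch below evaluates the OR formula in base R for a few made-up latent path probabilities. The values and the names `p_diet`, `p_smoking`, `bool_or`, and `bool_and` are assumptions chosen for illustration and are not part of the package. The AND counterpart is taken to be the simple product of the path probabilities, as in the package's own example code.

```r
## Hypothetical latent path probabilities for three individuals
## (illustrative values only, not estimates from any model).
p_diet    <- c(0.10, 0.50, 0.90)
p_smoking <- c(0.20, 0.50, 0.05)

## Boolean OR: the outcome occurs unless both paths fail.
bool_or <- function(p1, p2) 1 - (1 - p1) * (1 - p2)

## Boolean AND: the outcome occurs only if both paths succeed.
bool_and <- function(p1, p2) p1 * p2

pr_or  <- bool_or(p_diet, p_smoking)
pr_and <- bool_and(p_diet, p_smoking)

print(rbind(OR = pr_or, AND = pr_and))
```

For any pair of probabilities the OR value is at least as large as either path alone, and the AND value is no larger than either, which is why the two specifications imply quite different observed outcome rates.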

| Field | Value |
| --- | --- |
| Package | boolean3 |
| Type | Package |
| Version | 3.1.6 |
| Date | 2014-11-15 |
| License | GPL-3 |
| LazyLoad | yes |

### Note

boolean3 was developed by Jason W. Morgan under the direction of Bear Braumoeller with support from The Ohio State University's College of Social and Behavioral Sciences. The package represents a significant re-write of the original boolean implementation developed by Bear Braumoeller, Ben Goodrich, and Jacob Kline. Please see the release notes and accompanying documentation for details regarding the many changes made in this version.

### Author(s)

Jason W. Morgan (morgan.746@osu.edu)

### References

Braumoeller, Bear F. (2003) “Causal Complexity and the Study of Politics.” Political Analysis 11(3): 209–233.

### See Also

See boolprep for model setup and boolean for estimation.

### Examples

    ## Load boolean3 and mvtnorm (the latter is used to simulate data).
    library(boolean3)
    require(mvtnorm)

    ## Generate some fake data
    set.seed(12345)
    N  <- 2000
    Df <- cbind(1, rmvnorm(N, mean=rep(0, 5)))

    ## Set coefficients
    beta.a <- c(-2.00, 0.33, 0.66, 1.00)
    beta.b <- c(0.00, 1.50, -0.25)

    ## Generate path probabilities following a normal model.
    y.a <- as.vector(pnorm(tcrossprod(beta.a, Df[, 1:4])))
    y.b <- as.vector(pnorm(tcrossprod(beta.b, Df[, c(1, 5, 6)])))

    ## AND and OR-model
    or  <- function(x, y) { x + y - x * y }
    and <- function(x, y) { x * y }
    y.star.OR  <- or(y.a, y.b)
    y.star.AND <- and(y.a, y.b)

    ## Observed responses
    y.OR  <- rbinom(N, 1, y.star.OR)
    y.AND <- rbinom(N, 1, y.star.AND)

    ## Set up data.frame for estimation
    Df <- cbind(1, Df)
    Df <- as.data.frame(Df)
    Df[,1] <- y.OR
    Df[,2] <- y.AND
    names(Df) <- c("y.OR", "y.AND", "x1", "x2", "x3", "x4", "x5")

    ## Before estimating, boolean models need to be specified using the
    ## boolprep function.

    ## OR model, specified to use a probit link for each causal path. This
    ## matches the data generating process above.
    mod.OR <- boolprep(y.OR ~ (a | b), a ~ x1 + x2 + x3, b ~ x4 + x5,
                       data = Df, family=list(binomial("probit")))

    ## If you really want to, it's also possible to specify a different
    ## link function for each causal path. These should be in the same
    ## order as specified in the model formula.
    mod.OR2 <- boolprep(y.OR ~ (a | b), a ~ x1 + x2 + x3, b ~ x4 + x5,
                        data = Df, family=list(binomial("probit"),
                                               binomial("logit")))

    ## Fit the prepared model using the nlminb optimizer (the default).
    (fit.OR <- boolean(mod.OR, method="nlminb", control=list(trace=1)))

    ## Multiple optimizers can be specified in a single call to boolean.
    ## Here we fit with the nlm and nlminb optimizers.
    (fit1.OR <- boolean(mod.OR, method=c("nlm", "nlminb")))

    ## Re-fit, with BFGS and a higher maximum number of iterations. All
    ## of the options that go along with nlm(), optim(), and genoud() should
    ## be transparently passable through boolean.
    (fit2.OR <- boolean(mod.OR, method="BFGS", control = list(maxit = 500)))

    ## Induce a convergence warning message.
    (fit3.OR <- boolean(mod.OR, method="BFGS", control = list(maxit = 5)))

    ## Not run:
    ## Now estimate model with genoud, a genetic optimizer. This has the
    ## capability of using multiple processors via parallel.
    (fit6.OR <- boolean(mod.OR, method="genoud",
                        cluster=c("localhost", "localhost"),
                        print.level=2))

    ## The default SANN optimizer is also available.
    (fit7.OR <- boolean(mod.OR, method="SANN"))
    ## End(Not run)

    ## The fit is stored as "model.fit", within the boolean object.
    str(fit.OR$model.fit)

    ## Create a summary object, saving and printing it. Then take a look at
    ## the objects stored in the summary object.
    (smry <- summary(fit.OR))
    str(smry)

    ## Extract log-likelihood and coefficient vector.
    logLik(fit.OR)
    coef(fit.OR)

    ## Not run:
    ## Display the contours of the likelihood given a change in the value
    ## of the coefficients. Despite the function name, these are not true
    ## profile likelihoods as they hold all other coefficients fixed at
    ## their MLE.
    (prof <- boolprof(fit.OR))

    ## Extract the plots for x1_a and x4_b.
    plot(prof, y = c("x1_a", "x4_b"))
    plot(prof, y = c(1, 3), scales = list(y = list(relation = "free")))

    ## You can also use variable or index matching with boolprof to select
    ## particular covariates of interest.
    boolprof(fit.OR, vars = c(1, 3))
    boolprof(fit.OR, vars = c("x1_a", "x4_b"))

    ## Plots of predicted probabilities are available through boolprob.
    ## With boolprob, either vars or newdata *must* be specified.
    boolprob(fit.OR, vars = c("x1_a", "x4_b"))
    boolprob(fit.OR, vars = c(2, 3, 4, 6))

    ## Specifying conf.int = TRUE produces simulated confidence intervals.
    ## The number of samples to pull from the distribution of the estimated
    ## coefficients is controlled by n; n=100 is default. This can take a
    ## while.
    (prob <- boolprob(fit.OR, vars = c(2, 3, 4, 6), n = 1000,
                      conf.int = TRUE))

    ## Base the predictions on the estimates from a different optimization
    ## method.
    (prob <- boolprob(fit1.OR, method="nlm", vars=c(2, 3, 4, 6), n=1000,
                      conf.int=TRUE))

    ## As with the other components of the model, you can extract the
    ## predicted probabilities.
    str(prob)
    prob$est

    ## Bootstrapping is also possible, and is the recommended method of
    ## making inferences. The boolboot function uses a simple sampling
    ## scheme: it resamples repeatedly from the provided data.frame. More
    ## complex data structures with, for example, clustering, need to be
    ## dealt with manually. (The original example passed an undefined
    ## object `fit`; the fitted model fit.OR is used here.)
    (bs <- boolboot(fit.OR, n=10))

    ## boolboot supports bootstrapping across multiple processors.
    (bs <- boolboot(fit.OR, n=100, cluster=c("localhost", "localhost")))
    ## End(Not run)
