knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", dev = "png", dpi = 500, fig.align = "center", knitr::opts_chunk$set(comment = NA) ) library(ggplot2) library(BGGM)
The R
package BGGM provides tools for making Bayesian inference in
Gaussian graphical models [GGM, @williams2020bggm]. The methods are organized around
two general approaches for Bayesian inference: (1) estimation and (2) hypothesis
testing. The key distinction is that the former focuses on either the posterior or posterior
predictive distribution [@Gelman1996a; see section 5 in @rubin1984bayesianly], whereas the
latter focuses on model comparison with the Bayes factor [@Jeffreys1961; @Kass1995].
To install the latest release version (2.1.1
) from CRAN use
install.packages("BGGM")
The current developmental version can be installed with
if (!requireNamespace("remotes")) { install.packages("remotes") } remotes::install_github("donaldRwilliams/BGGM")
There are automatic checks for BGGM. However, that only checks for Linux and BGGM is built on Windows. The most common installation errors occur on OSX. An evolving guide to address these issues is provided in the Troubleshoot Section.
The methods in BGGM build upon existing algorithms that are well-known in the literature. The central contribution of BGGM is to extend those approaches:
Bayesian estimation with the novel matrix-F prior distribution [@Mulder2018]
Bayesian hypothesis testing with the matrix-F prior distribution [@Williams2019_bf]
Comparing Gaussian graphical models [@Williams2019; @williams2020comparing]
Extending inference beyond the conditional (in)dependence structure [@Williams2019]
Posterior uncertainty intervals for the partial correlations
The computationally intensive tasks are written in c++
via the R
package Rcpp [@eddelbuettel2011rcpp] and the c++
library Armadillo [@sanderson2016armadillo]. The Bayes factors are computed with the R
package BFpack [@mulder2019bfpack]. Furthermore, there are plotting functions
for each method, control variables can be included in the model (e.g., ~ gender
),
and there is support for missing values (see bggm_missing
).
Continuous: The continuous method was described in @Williams2019. Note that this is based on the customary Wishart distribution.
Binary: The binary method builds directly upon @talhouk2012efficient that, in turn, built upon the approaches of @lawrence2008bayesian and @webb2008bayesian (to name a few).
Ordinal: The ordinal methods require sampling thresholds. There are two approach included in BGGM. The customary approach described in @albert1993bayesian (the default) and the 'Cowles' algorithm described in @cowles1996accelerating.
Mixed: The mixed data (a combination of discrete and continuous) method was introduced in @hoff2007extending. This is a semi-parametric copula model (i.e., a copula GGM) based on the ranked likelihood. Note that this can be used for only ordinal data (not restricted to "mixed" data).
There are several examples in the Vignettes section.
It is common to have some combination of continuous and discrete (e.g., ordinal, binary, etc.) variables. BGGM (as of version 2.0.0
) can readily be used for these kinds of data. In this example, a model is fitted for the gss
data in BGGM.
The data are first visualized with the psych package, which readily shows the data are "mixed".
# dev version library(BGGM) library(psych) # data Y <- gss # histogram for each node psych::multi.hist(Y, density = FALSE)
A Gaussian copula graphical model is estimated as follows
fit <- estimate(Y, type = "mixed")
type
can be continuous
, binary
, ordinal
, or mixed
. Note that type
is a misnomer, as the data can consist of only ordinal variables (for example).
The estimated relations are summarized with
summary(fit) #> BGGM: Bayesian Gaussian Graphical Models #> --- #> Type: mixed #> Analytic: FALSE #> Formula: #> Posterior Samples: 5000 #> Observations (n): 464 #> Nodes (p): 7 #> Relations: 21 #> --- #> Call: #> estimate(Y = Y, type = "mixed") #> --- #> Estimates: #> Relation Post.mean Post.sd Cred.lb Cred.ub #> INC--DEG 0.463 0.042 0.377 0.544 #> INC--CHI 0.148 0.053 0.047 0.251 #> DEG--CHI -0.133 0.058 -0.244 -0.018 #> INC--PIN 0.087 0.054 -0.019 0.196 #> DEG--PIN -0.050 0.058 -0.165 0.062 #> CHI--PIN -0.045 0.057 -0.155 0.067 #> INC--PDE 0.061 0.057 -0.050 0.175 #> DEG--PDE 0.326 0.056 0.221 0.438 #> CHI--PDE -0.043 0.062 -0.162 0.078 #> PIN--PDE 0.345 0.059 0.239 0.468 #> INC--PCH 0.052 0.052 -0.052 0.150 #> DEG--PCH -0.121 0.056 -0.228 -0.012 #> CHI--PCH 0.113 0.056 0.007 0.224 #> PIN--PCH -0.080 0.059 -0.185 0.052 #> PDE--PCH -0.200 0.058 -0.305 -0.082 #> INC--AGE 0.211 0.050 0.107 0.306 #> DEG--AGE 0.046 0.055 -0.061 0.156 #> CHI--AGE 0.522 0.039 0.442 0.594 #> PIN--AGE -0.020 0.054 -0.122 0.085 #> PDE--AGE -0.141 0.057 -0.251 -0.030 #> PCH--AGE -0.033 0.051 -0.132 0.063 #> ---
The summary can also be plotted
plot(summary(fit))
The graph is selected and plotted with
E <- select(fit) plot(E, node_size = 12, edge_magnify = 5)
The Bayes factor testing approach is readily implemented by changing estimate
to explore
.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.