BGGM-package: BGGM: Bayesian Gaussian Graphical Models

BGGM-package

R Documentation

Description

The R package BGGM provides tools for Bayesian inference in Gaussian graphical models (GGMs). The methods are organized around two general approaches to Bayesian inference: (1) estimation \insertCite{Williams2019}{BGGM} and (2) hypothesis testing \insertCite{Williams2019_bf}{BGGM}. The key distinction is that the former focuses on either the posterior or the posterior predictive distribution, whereas the latter focuses on model comparison with the Bayes factor.

The methods in BGGM build upon existing algorithms that are well-known in the literature. The central contribution of BGGM is to extend those approaches:

Bayesian estimation with the novel matrix-F prior distribution \insertCite{Mulder2018}{BGGM}.
- Estimation: estimate.
Bayesian hypothesis testing with the novel matrix-F prior distribution \insertCite{Mulder2018}{BGGM}.
- Exploratory hypothesis testing: explore.
- Confirmatory hypothesis testing: confirm.
Comparing GGMs \insertCite{williams2020comparing}{BGGM}.
- Partial correlation differences: ggm_compare_estimate.
- Posterior predictive check: ggm_compare_ppc.
- Exploratory hypothesis testing: ggm_compare_explore.
- Confirmatory hypothesis testing: ggm_compare_confirm.
Extending inference beyond the conditional (in)dependence structure.
- Predictability with Bayesian variance explained \insertCite{gelman_r2_2019}{BGGM}: predictability.
- Posterior uncertainty in the partial correlations: estimate.
- Custom network statistics: roll_your_own.
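
As a rough orientation, the estimation and exploratory testing functions listed above might be used along the following lines. This is a minimal sketch, not the package's canonical workflow: the data are simulated, the variable names and the values chosen for `iter`, `cred`, and `BF_cut` are illustrative assumptions, and the BGGM calls run only if the package is installed.

```r
# Simulated continuous data (illustrative only): 200 observations, 5 variables.
set.seed(1)
n <- 200
p <- 5
Y <- matrix(rnorm(n * p), nrow = n, ncol = p)
Y[, 2] <- Y[, 2] + 0.6 * Y[, 1]          # induce one dependence (V1 -- V2)
colnames(Y) <- paste0("V", seq_len(p))

if (requireNamespace("BGGM", quietly = TRUE)) {
  # (1) Estimation: sample from the posterior of the partial correlations.
  fit_est <- BGGM::estimate(Y, type = "continuous", iter = 1000,
                            progress = FALSE)
  # Keep edges whose 95% credible interval excludes zero.
  net_est <- BGGM::select(fit_est, cred = 0.95)
  print(summary(fit_est))

  # (2) Exploratory hypothesis testing: Bayes factors for each edge.
  fit_bf <- BGGM::explore(Y, type = "continuous", iter = 1000,
                          progress = FALSE)
  # Keep edges whose Bayes factor exceeds the (assumed) cutoff of 3.
  net_bf <- BGGM::select(fit_bf, BF_cut = 3)
}
```

The small `iter` value keeps the sketch fast; a real analysis would likely use more posterior samples.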

Furthermore, the computationally intensive tasks are written in C++ via the R package Rcpp \insertCite{eddelbuettel2011rcpp}{BGGM} and the C++ library Armadillo \insertCite{sanderson2016armadillo}{BGGM}. There are plotting functions for each method, control variables can be included in the model, and missing values are supported (bggm_missing).

Supported Data Types:

Continuous: The continuous method was described \insertCite{@in @Williams2019_bf;textual}{BGGM}.
Binary: The binary method builds directly upon \insertCite{@in @talhouk2012efficient;textual}{BGGM}, which, in turn, built upon the approaches of \insertCite{lawrence2008bayesian;textual}{BGGM} and \insertCite{webb2008bayesian;textual}{BGGM} (to name a few).
Ordinal: Ordinal data require sampling thresholds. Two approaches are included in BGGM: (1) the customary approach described in \insertCite{@in @albert1993bayesian;textual}{BGGM} (the default) and (2) the 'Cowles' algorithm described in \insertCite{@in @cowles1996accelerating;textual}{BGGM}.
Mixed: The mixed data (a combination of discrete and continuous) method was introduced \insertCite{@in @hoff2007extending;textual}{BGGM}. This is a semi-parametric copula model (i.e., a copula GGM) based on the ranked likelihood. Note that it can also be used for data consisting entirely of ordinal variables.
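
The data types above are selected through the `type` argument of the fitting functions. The sketch below simulates one small data set per type and, if BGGM is installed, fits each with `estimate`. The simulated data, the cut points, and the choice to code ordinal categories starting at 1 are assumptions made for illustration only.

```r
set.seed(2)
n <- 200
Z <- matrix(rnorm(n * 4), nrow = n)       # latent Gaussian scores
Z[, 2] <- Z[, 2] + 0.7 * Z[, 1]           # one latent dependence

# Binary: dichotomize the latent scores at zero (values 0/1).
Y_bin <- (Z > 0) * 1

# Ordinal: four ordered categories, coded 1..4 (coding is an assumption).
Y_ord <- apply(Z, 2, function(z) {
  as.integer(cut(z, breaks = c(-Inf, -1, 0, 1, Inf)))
})

# Mixed: two continuous columns alongside two ordinal columns.
Y_mix <- cbind(Z[, 1:2], Y_ord[, 3:4])

if (requireNamespace("BGGM", quietly = TRUE)) {
  fit_bin <- BGGM::estimate(Y_bin, type = "binary",  iter = 250,
                            progress = FALSE)
  fit_ord <- BGGM::estimate(Y_ord, type = "ordinal", iter = 250,
                            progress = FALSE)
  fit_mix <- BGGM::estimate(Y_mix, type = "mixed",   iter = 250,
                            progress = FALSE)
}
```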

Additional Features:

The primary focus of BGGM is Gaussian graphical modeling (the inverse covariance matrix). The remainder is a suite of useful methods that are not explicitly for GGMs:

Bivariate correlations for binary (tetrachoric), ordinal (polychoric), mixed (rank based), and continuous (Pearson's) data (zero_order_cors).
Multivariate regression for binary (probit), ordinal (probit), mixed (rank likelihood), and continuous data (estimate).
Multiple regression for binary (probit), ordinal (probit), mixed (rank likelihood), and continuous data (e.g., coef.estimate).
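
The link between the inverse covariance matrix and the regression and partial correlation quantities mentioned above can be illustrated in base R. This sketch uses a sample (point) estimate on simulated data rather than a posterior distribution, and it is not a description of how BGGM computes these quantities internally. It relies on two standard identities: the partial correlation between variables i and j is -theta_ij / sqrt(theta_ii * theta_jj), and the coefficients from regressing variable i on the remaining variables are -theta_ij / theta_ii, where theta denotes elements of the inverse covariance matrix.

```r
set.seed(3)
n  <- 500
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
x3 <- 0.4 * x2 + rnorm(n)
Y  <- cbind(x1 = x1, x2 = x2, x3 = x3)

# Sample precision (inverse covariance) matrix.
Theta <- solve(cov(Y))

# Partial correlations: negate the standardized off-diagonal elements.
pcor <- -cov2cor(Theta)
diag(pcor) <- 1
print(round(pcor, 3))

# Multiple regression of x1 on x2 and x3, recovered from the precision matrix.
beta_theta <- -Theta[1, -1] / Theta[1, 1]

# The same slopes from ordinary least squares.
beta_ols <- coef(lm(x1 ~ x2 + x3))[-1]

# The two sets of coefficients agree up to numerical tolerance.
print(all.equal(unname(beta_theta), unname(beta_ols)))
```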

Note on Conditional (In)dependence Models for Latent Data:

All of the data types (besides continuous) model latent data. That is, the unobserved (latent) data are assumed to be Gaussian. For example, a tetrachoric correlation (binary data) is a special case of a polychoric correlation (ordinal data). Both capture relations between "theorized normally distributed continuous latent variables" (Wikipedia). In both instances, the corresponding partial correlation between observed variables is conditioned on the remaining variables in the latent space. This implies that interpretation is similar to that for continuous data, but with respect to latent variables. We refer interested users to \insertCite{@page 2364, section 2.2, in @webb2008bayesian;textual}{BGGM}.
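
To see why relations are defined in the latent space, consider the following base R simulation. It is a sketch under simple assumptions (two latent standard normal variables with correlation 0.6, each dichotomized at zero) and does not use BGGM's samplers. The Pearson correlation of the observed binary variables is noticeably smaller than the latent correlation that a tetrachoric approach targets. For thresholds at zero, the expected value of the observed correlation is (2 / pi) * asin(rho).

```r
set.seed(4)
n   <- 1e5
rho <- 0.6                                   # latent correlation (assumed)

# Latent bivariate normal scores with correlation rho.
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# Observed binary data: only the sign of the latent score is recorded.
y1 <- as.integer(z1 > 0)
y2 <- as.integer(z2 > 0)

r_latent   <- cor(z1, z2)                    # close to rho
r_observed <- cor(y1, y2)                    # attenuated
r_expected <- (2 / pi) * asin(rho)           # theoretical value for zero thresholds

print(round(c(latent = r_latent,
              observed = r_observed,
              expected = r_expected), 3))
```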

High Dimensional Data?

BGGM was built specifically for social-behavioral scientists. Of course, the methods can be used by all researchers. However, there is currently no support for high-dimensional data (i.e., more variables than observations), which are commonplace in the genetics literature. Such data are rare in the social-behavioral sciences. In the future, support for high-dimensional data may be added to BGGM.