structmcmc

structmcmc is a set of tools for performing structural inference for Bayesian networks using MCMC in R, a free software environment for statistical computing and graphics.

The widely-used MC3 algorithm (Madigan & Raftery, 1995) is implemented, as well as a number of variants of the algorithm. The MC3 algorithm is a Metropolis-Hastings sampler for which the target distribution is the posterior distribution of Bayesian networks. Tools for exact solutions are also available, but for networks with more than, say, 6 nodes, these will be prohibitively slow.
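
For intuition, here is a minimal sketch of the Metropolis-Hastings step that underlies MC3 (an illustration, not structmcmc's implementation): a single-edge change to the current DAG is proposed, and accepted with probability min(1, exp(difference in log scores)), assuming a symmetric proposal. The log_score function below is only a placeholder for the log marginal likelihood plus log structural prior that structmcmc actually computes.

# Toy Metropolis-Hastings step over DAG adjacency matrices (illustrative only).
# adj[i, j] == 1 means an edge i -> j.
log_score <- function(adj) -sum(adj)   # placeholder score: penalises edges

mh_step <- function(adj, log_score) {
  p <- nrow(adj)
  # Propose toggling one off-diagonal entry (add or delete a single edge);
  # a full MC3 move set also includes edge reversals.
  repeat {
    i <- sample(p, 1)
    j <- sample(p, 1)
    if (i != j) break
  }
  proposal <- adj
  proposal[i, j] <- 1 - proposal[i, j]
  # A real sampler must also reject proposals that create a directed cycle
  # (see the acyclicity sketch below).
  log_accept <- log_score(proposal) - log_score(adj)  # symmetric proposal assumed
  if (log(runif(1)) < log_accept) proposal else adj
}

adj <- matrix(0, 3, 3)            # start from the empty graph on 3 nodes
adj <- mh_step(adj, log_score)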

The package includes an implementation of the fast Gibbs sampler introduced in Goudie & Mukherjee, 2016.

The implementation allows the local conditional distributions to be multinomial or Gaussian, using standard priors. Arbitrary structural priors for the Bayesian network can be specified. The main difficulty in sampling Bayesian networks efficiently is ensuring that the acyclicity constraint is not violated. The package implements the cycle-checking method introduced by King & Sagert (2002), an alternative to the method introduced by Giudici & Castelo (2003). To enable convergence to be assessed, a number of tools for creating diagnostic plots are included.
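
King & Sagert's approach maintains an incremental ancestor (reachability) matrix so that each cycle query is cheap. The sketch below is not that method; it simply illustrates the check itself, using the fact that a directed graph is acyclic if and only if source nodes can be peeled off repeatedly, and that adding an edge i -> j to a DAG creates a cycle exactly when j already reaches i.

# Simple, non-incremental acyclicity check on an adjacency matrix (illustration only).
# adj[i, j] == 1 means an edge i -> j.
is_acyclic <- function(adj) {
  remaining <- seq_len(nrow(adj))
  while (length(remaining) > 0) {
    # In-degrees within the remaining subgraph; nodes with in-degree 0 are sources
    indeg <- colSums(adj[remaining, remaining, drop = FALSE])
    sources <- remaining[indeg == 0]
    if (length(sources) == 0) return(FALSE)  # every remaining node lies on a cycle
    remaining <- setdiff(remaining, sources)
  }
  TRUE
}

# Would adding the edge i -> j to a DAG create a cycle?
adds_cycle <- function(adj, i, j) {
  adj[i, j] <- 1
  !is_acyclic(adj)
}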

Interfaces to a number of other R packages for Bayesian networks are available, including deal (hill-climbing and heuristic search), bnlearn (a number of constraint-based and score-based algorithms) and pcalg (PC-algorithm). An interface to gRain is also included to allow its probability propagation routines to be used easily.

Basic operation, discrete data

Each random variable has a multinomial distribution, with a conjugate Dirichlet prior.

Data must be supplied as a data.frame with p columns (corresponding to the p random variables) and n rows (corresponding to the n samples). Each column must be a factor variable.

x1 <- factor(c("a", "a", "g", "c", "c", "a", "g", "a", "a"))
x2 <- factor(c(2, 2, 4, 3, 1, 4, 4, 4, 1))
x3 <- factor(c(FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE))
x <- data.frame(x1 = x1, x2 = x2, x3 = x3)

Draw samples from the posterior using MC3.

set.seed(1234)
initial <- bn(c(), c(), c())
priorgraph <- bn(c(), c(1), c(2))
prior <- priorGraph(priorgraph, 0.5)
mcmc <- posterior(data = x, method = "mc3", prior = prior,
                  nSamples = 10000, nBurnin = 1000, initial = initial)

Compute and plot estimated edge probabilities given by the MCMC run

epmcmc <- ep(mcmc)
levelplot(epmcmc)

Since this is a problem with p = 3, we can compute the posterior edge probabilities by exhaustive enumeration. This is only feasible for p <= 6 or so.

exact <- posterior(x, "exact", prior = prior)
epexact <- ep(exact)
levelplot(epexact)
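
The p <= 6 guideline reflects how quickly the number of possible DAGs grows. Robinson's recurrence counts the labelled DAGs on n nodes; the short calculation below (independent of structmcmc) shows why exhaustive enumeration soon becomes hopeless.

# Number of labelled DAGs on n nodes, via Robinson's recurrence (for illustration)
n_dags <- function(p) {
  a <- numeric(p + 1)
  a[1] <- 1                       # a[n + 1] holds the count for n nodes
  for (n in seq_len(p)) {
    k <- seq_len(n)
    a[n + 1] <- sum((-1)^(k + 1) * choose(n, k) * 2^(k * (n - k)) * a[n - k + 1])
  }
  a[p + 1]
}
sapply(1:7, n_dags)
# 1  3  25  543  29281  3781503  1138779265

With seven nodes there are already over a billion DAGs to score, which is why sampling becomes the practical option.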

Comparing multiple MCMC runs

mcmc2 <- posterior(data = x, method = "mc3", prior = prior,
                   nSamples = 10000, nBurnin = 1000, initial = initial)
epmcmc2 <- ep(mcmc2)
levelplot(epmcmc2)

Compare the final edge probabilities between runs

splom(bnpostmcmc.list(mcmc, mcmc2))
levelplot(ep.list(exact = epexact, mcmc = epmcmc))

Plot how the cumulative edge probabilities change as samples are drawn.

splom(cumep(bnpostmcmc.list(mcmc, mcmc2)))

Plot how the moving-average edge probabilities change as samples are drawn.

splom(mwep(bnpostmcmc.list(mcmc, mcmc2)))

Plot how the cumulative total variation distance changes as samples are drawn.

exactgp <- gp(exact)
splom(cumtvd(exactgp, bnpostmcmc.list(mcmc, mcmc2)))
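
For reference, the quantity tracked by cumtvd is the total variation distance between the exact posterior over graphs (exactgp above) and the distribution of the MCMC samples: for discrete distributions p and q on the same support it equals half the L1 distance. A tiny stand-alone illustration (not the package's internals):

# Total variation distance between two discrete distributions on the same support
tvd <- function(p, q) 0.5 * sum(abs(p - q))

tvd(c(0.5, 0.3, 0.2), c(0.4, 0.4, 0.2))
# 0.1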

Basic operation, continuous data

Each random variable is assumed to be normally distributed, with a g-prior.

Data must be supplied as a matrix with p columns (corresponding to the p random variables) and n rows (corresponding to the n samples). Each column must be a numeric variable.

x1 <- rnorm(20)
x2 <- rnorm(20)
x3 <- rnorm(20)
x <- matrix(c(x1, x2, x3), ncol = 3)

Draw samples from the posterior using MC3.

set.seed(1234)
initial <- bn(c(), c(), c())
mcmc <- posterior(data = x, method = "mc3",
                  logScoreFUN = logScoreNormalFUN(),
                  nSamples = 10000, nBurnin = 1000, initial = initial)

Compute and plot estimated edge probabilities given by the MCMC run

epmcmc <- ep(mcmc)
levelplot(epmcmc)

Since this is a problem with p = 3, we can compute the posterior edge probabilities by exhaustive enumeration. This is only feasible for p <= 6 or so.

exact <- posterior(x, "exact", logScoreFUN = logScoreNormalFUN())
epexact <- ep(exact)
levelplot(epexact)

Comparing multiple MCMC runs

mcmc2 <- posterior(data = x, method = "mc3",
                   logScoreFUN = logScoreNormalFUN(),
                   nSamples = 10000, nBurnin = 1000, initial = initial)
epmcmc2 <- ep(mcmc2)
levelplot(epmcmc2)

Compare the final edge probabilities between runs

splom(bnpostmcmc.list(mcmc, mcmc2))
levelplot(ep.list(exact = epexact, mcmc = epmcmc))

Plot how the cumulative edge probabilities change as samples are drawn.

splom(cumep(bnpostmcmc.list(mcmc, mcmc2)))

Plot how the moving-average edge probabilities change as samples are drawn.

splom(mwep(bnpostmcmc.list(mcmc, mcmc2)))

Plot how the cumulative total variation distance changes as samples are drawn.

exactgp <- gp(exact)
splom(cumtvd(exactgp, bnpostmcmc.list(mcmc, mcmc2)))

Installation

Download the current version and unzip the file. Then install from within R using the following, where rjbgoudie-structmcmc-XXXXX is the name of the unzipped directory/folder, and path/to/rjbgoudie-structmcmc-XXXXX is the path to that folder.

install.packages("path/to/rjbgoudie-structmcmc-XXXXX", repos = NULL, type = "source")

The package depends on parental, which provides a very lightweight directed graph object and basic manipulation tools for R.
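
If you prefer installing directly from GitHub, the remotes package offers a one-step alternative to the manual source install above. The structmcmc repository is rjbgoudie/structmcmc; the repository name assumed for parental below is a guess along the same lines.

# install.packages("remotes")                   # if remotes is not already installed
remotes::install_github("rjbgoudie/parental")    # dependency; repository name assumed
remotes::install_github("rjbgoudie/structmcmc")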

Also required are lattice, and grid, both of which are included with R.

The gRain, nnet, reshape and zoo packages are also recommended; they can be installed from CRAN, e.g. using

install.packages("reshape")

How to cite

Goudie, R. J. B., & Mukherjee, S. (2016). A Gibbs Sampler for Learning DAGs. Journal of Machine Learning Research, 17(30), 1-39. http://jmlr.org/papers/v17/14-486.html


