This package implements greedy search for a class of graphical models called bow-free acyclic path diagrams (BAPs). For more information and theoretical results, see the paper https://arxiv.org/abs/1508.01717

Generally, the functionality provided by this package can be split into the following categories:

Data Structures

Graphs

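A graph is represented by its adjacency matrix $M$, using a convention similar to that of ggm::makeMG, where an entry $M_{ij} = 1$ encodes a directed edge $i \rightarrow j$ and an entry of 100 encodes a bidirected edge $i \leftrightarrow j$. For example, the following matrix describes a graph on three nodes with the edges $1 \rightarrow 2$ and $1 \leftrightarrow 3$: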

library(greedyBAPs)
M <- t(cbind(
  c(0, 1, 100),
  c(0, 0, 0),
  c(0, 0, 0)
))
plotBAP(M)
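
The two defining properties of a BAP can be checked directly on such a matrix: the directed part must be acyclic, and no pair of nodes may be joined by both a directed and a bidirected edge (a "bow"). The following is a minimal base-R sketch of these checks, not part of the package. It assumes the ggm::makeMG-style encoding (1 for directed, 100 for bidirected) and treats a 100 in either triangle as a bidirected edge; the helper names isBowFree and isAcyclic are made up for illustration.

```r
# Assumed encoding (ggm::makeMG style): 1 = directed edge i -> j,
# 100 = bidirected edge i <-> j. Helper names below are hypothetical.
M <- rbind(
  c(0, 1, 100),
  c(0, 0, 0),
  c(0, 0, 0)
)

# Split the matrix into a directed part and a symmetric bidirected part
directed <- (M == 1) * 1
bidirected <- ((M == 100) | (t(M) == 100)) * 1

# Bow-free: no pair of nodes has both a directed and a bidirected edge
isBowFree <- function(directed, bidirected) {
  !any((directed + t(directed)) > 0 & bidirected > 0)
}

# Acyclic: repeatedly remove nodes without incoming directed edges;
# the directed part is acyclic iff every node can be removed this way
isAcyclic <- function(directed) {
  remaining <- seq_len(nrow(directed))
  while (length(remaining) > 0) {
    sub <- directed[remaining, remaining, drop = FALSE]
    sources <- which(colSums(sub) == 0)
    if (length(sources) == 0) return(FALSE)
    remaining <- remaining[-sources]
  }
  TRUE
}

which(directed == 1, arr.ind = TRUE)  # directed edges as (from, to) index pairs
isBowFree(directed, bidirected)       # TRUE for the example graph
isAcyclic(directed)                   # TRUE for the example graph
```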

Parameters

A BAP model is a linear recursive structural equation model with (possibly correlated) Gaussian noise. If $\mathbf{X}$ is the vector of variables, the BAP can be written as:

$\mathbf{X} = B^T \mathbf{X} + \epsilon$

where $B_{ij}$ is the weight of the edge $i \rightarrow j$, and $\epsilon \sim \mathcal{N}(0, \Omega)$. Nonzero off-diagonal entries of $\Omega$ correspond to bidirected edges.
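
Solving the structural equation for $\mathbf{X}$ makes the covariance implied by the parameters explicit. This is a standard consequence of the equation above, assuming $I - B^T$ is invertible, which holds when the directed part of the graph is acyclic (the nodes can then be ordered so that $B$ is strictly triangular):

$\mathbf{X} = (I - B^T)^{-1} \epsilon$

$\Sigma = \mathrm{Cov}(\mathbf{X}) = (I - B^T)^{-1} \, \Omega \, (I - B)^{-1}$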

For the example above, a valid parametrisation would be:

B <- t(cbind(
  c(0, 0.3, 0),
  c(0, 0, 0),
  c(0, 0, 0)
))
# Omega is a covariance matrix, so it must be symmetric
Omega <- t(cbind(
  c(3, 0, 1.5),
  c(0, 2, 0),
  c(1.5, 0, 1)
))
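
To illustrate what such a parametrisation implies, the sketch below computes the covariance of $\mathbf{X}$ and draws samples from the model using base R only. It is a hand-rolled illustration and makes no claim about how the package generates data. Omega is written out symmetrically here (an assumption, since a covariance matrix has to be symmetric), and the names A, Sigma, eps and X are introduced for this example.

```r
# Parameters for the 3-node example (1 -> 2 with weight 0.3, 1 <-> 3).
# Omega is written out symmetrically, as a covariance matrix must be.
B <- rbind(
  c(0, 0.3, 0),
  c(0, 0,   0),
  c(0, 0,   0)
)
Omega <- rbind(
  c(3,   0, 1.5),
  c(0,   2, 0),
  c(1.5, 0, 1)
)
p <- nrow(B)

# Implied covariance: Sigma = (I - B^T)^{-1} Omega (I - B)^{-1}
A <- solve(diag(p) - t(B))
Sigma <- A %*% Omega %*% t(A)

# Draw n noise vectors with covariance Omega, then map them through A
set.seed(1)
n <- 1000
eps <- matrix(rnorm(n * p), n, p) %*% chol(Omega)  # rows ~ N(0, Omega)
X <- eps %*% t(A)                                  # each row solves X = B^T X + eps

Sigma
cov(X)  # should be close to Sigma for large n
```

Each row of X is one sample, so X already has the $n$-by-$p$ layout described in the next section.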

Data

An $n$-sample dataset for a $p$-dimensional BAP model must be supplied as an $n$-by-$p$ matrix, with one row per sample and one column per variable. For example, we could fit the model above to the following dataset (the fit would be poor, because the data was not generated from this model):

n <- 100
p <- 3
data <- matrix(rnorm(n * p), n, p)
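
If the observations start out in a data frame, they can be converted with base R before being passed on. The sketch below uses a made-up data frame with made-up column names. Its last two lines compute the sample covariance and the sample size, which are the first two arguments passed to greedySearch in the examples further down.

```r
# Hypothetical data frame with three numeric columns (names are made up)
set.seed(1)
df <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))

# Convert to an n-by-p numeric matrix and verify its shape
data <- as.matrix(df)
stopifnot(is.matrix(data), is.numeric(data), !anyNA(data))
dim(data)  # n rows (samples), p columns (variables)

# Summary statistics of the kind used in the greedy search examples
S <- cov(data)   # p-by-p sample covariance
n <- nrow(data)  # sample size
```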

Examples

Generating random BAPs (uniformly), parameters, and data

set.seed(1)
bap <- generateUniformBAP(p = 4, N = 1)[[1]]
params <- generateBAPParameters(bap)
data <- generateData(n = 100, params = params)
plotBAP(bap)
pairs(data)

Running greedy search and visualising scores

res <- greedySearch(cov(data), nrow(data), n.restarts = 10, mc.cores = 4)
plotScoreCurves(res$all.scores, res$times)
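
The calls in this section hard-code mc.cores = 4. As a small sketch, the value can instead be derived from the machine with parallel::detectCores(), which ships with base R. The variable names available and n.cores are made up here, and how greedyBAPs uses mc.cores internally is not described in this document.

```r
# detectCores() may return NA on some platforms, so fall back to 1
available <- parallel::detectCores()
n.cores <- if (is.na(available)) 1L else max(1L, available - 1L)

# The value can then replace the hard-coded 4, e.g.:
# res <- greedySearch(cov(data), nrow(data), n.restarts = 10, mc.cores = n.cores)
n.cores
```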

Causal effects simulation experiment

res <- causalEffectsSimulation(
  N = 10, p = 5, n = 1000, mc.cores = 4, n.restarts = 10,
  max.in.degree = 2, max.steps = 100, max.iter.ricf = 10,
  equivalent.eps = 1e-10
)
roc <- plotROCCurve(res$CE.gt, res$CE.greedy)

Genomic data

This package ships with a dataset from Sachs et al. (2005)^[http://science.sciencemag.org/content/308/5721/523], to which DAGs can be fit as follows:

# Keep only the rows from the first data source and the first 11 columns
first.source <- unique(flowCytometry$source)[1]
data.sachs <- as.matrix(flowCytometry[flowCytometry$source == first.source, 1:11])
res <- greedySearch(cov(data.sachs), nrow(data.sachs), n.restarts = 10, mc.cores = 4, dags.only = TRUE)
plotScoreCurves(res$all.scores, res$times)


cnowzohour/greedyBAPs documentation built on May 13, 2019, 1:36 a.m.