query: Function to query MCMC samples generated by mcmcabn

query R Documentation

Description

The function allows users to perform structural queries over MCMC samples produced by mcmcabn.

Usage

query(mcmcabn = NULL, formula = NULL)
                 

Arguments

mcmcabn

object of class mcmcabn.

formula

formula statement or adjacency matrix used to query the MCMC samples, see details. If this argument is NULL, the average arc-wise frequencies are reported.

Details

The query can be formulated using an adjacency matrix or a formula-wise expression.

The adjacency matrix should be square, with dimension equal to the number of nodes in the network. Its entries should be either 1, 0, or -1: 1 marks a requested arc, -1 marks an excluded arc, and 0 marks an entry that is not subject to the query. Each row indicates the set of parents of the index node. The order of rows and columns should be the same as the one used in the data.dist argument of the mcmcabn() function.
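
As an illustration of the layout described above, here is a minimal sketch of how such a query matrix could be built in base R. The node names A, B, C and D are hypothetical placeholders; in practice they would be the names, in the same order, used in data.dist.

```r
# Hypothetical node names, listed in the same order as in data.dist.
nodes <- c("A", "B", "C", "D")

# Start from all zeros: no entry is subject to the query yet.
query_matrix <- matrix(0, nrow = length(nodes), ncol = length(nodes),
                       dimnames = list(nodes, nodes))

# Rows are children, columns are parents.
query_matrix["B", "A"] <- 1    # request the arc A -> B
query_matrix["C", "A"] <- -1   # exclude the arc A -> C

query_matrix
```

Since the formula argument accepts either a formula or an adjacency matrix, a matrix like this one would be supplied through formula.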

The formula statement has been designed to ease querying over the MCMC sample. It allows users to make complex queries without explicitly writing an adjacency matrix (which can be painful when the number of variables is large). The formula argument can be provided using a formula such as: ~ node1|parent1:parent2 + node2:node3|parent3. The formula statement has to start with '~'. In this example, node1 has two parents (parent1 and parent2), while node2 and node3 share the same parent (parent3). The parents' names have to match those given in name exactly. ':' is the separator between either children or parents, '|' separates children (left side) and parents (right side), '+' separates terms, and '.' replaces all the variables in name. Additionally, to exclude an arc, simply put '-' in front of that statement. For example, the formula ~ -node1|parent1 excludes all DAGs that have an arc from parent1 to node1.
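
To make the correspondence between the two query styles concrete, the sketch below translates a formula into a 1/0/-1 matrix. The helper formula_to_query_matrix() is hypothetical and written only for illustration: it is not part of mcmcabn and not the parser the package uses. It does not handle the '.' shorthand, and it assumes node names contain no '+' or '-' characters.

```r
# Hypothetical helper (not part of mcmcabn): translate a query formula
# into a query matrix with rows as children and columns as parents.
formula_to_query_matrix <- function(formula, nodes) {
  # Flatten the formula to text, then drop '~' and whitespace.
  txt <- paste(deparse(formula), collapse = "")
  txt <- gsub("[~[:space:]]", "", txt)

  # Split into signed terms, e.g. "A|B", "+C:D|E", "-F|G".
  terms <- regmatches(txt, gregexpr("[+-]?[^+-]+", txt))[[1]]

  m <- matrix(0, length(nodes), length(nodes), dimnames = list(nodes, nodes))
  for (term in terms) {
    # A leading '-' marks an excluded arc; everything else is requested.
    value <- if (startsWith(term, "-")) -1 else 1
    sides <- strsplit(sub("^[+-]", "", term), "|", fixed = TRUE)[[1]]
    stopifnot(length(sides) == 2)
    children <- strsplit(sides[1], ":", fixed = TRUE)[[1]]
    parents  <- strsplit(sides[2], ":", fixed = TRUE)[[1]]
    m[children, parents] <- value
  }
  m
}

# The example formula from the text, with its six placeholder node names.
nodes <- c("node1", "node2", "node3", "parent1", "parent2", "parent3")
m_example <- formula_to_query_matrix(
  ~ node1|parent1:parent2 + node2:node3|parent3, nodes)
m_example
```

In the resulting matrix, row node1 has a 1 in columns parent1 and parent2, and rows node2 and node3 each have a 1 in column parent3.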

If the formula argument is not provided, the function returns the average support of all individual arcs using a named matrix.

Value

The frequency of the requested query among the MCMC samples. If no formula is provided, a named matrix of arc-wise frequencies.
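
The following base R sketch illustrates what these two kinds of output represent. It assumes the frequency is the share of sampled DAGs that satisfy the query. The toy array dags and the node names A, B and C are made up for the example, and the code is not the package's internal implementation.

```r
# Toy stand-in for MCMC output: 4 sampled DAGs over 3 hypothetical nodes,
# stored as a 3 x 3 x 4 array (rows = children, columns = parents).
nodes <- c("A", "B", "C")
dags <- array(0, dim = c(3, 3, 4), dimnames = list(nodes, nodes, NULL))
dags["B", "A", c(1, 2, 3)] <- 1   # arc A -> B in samples 1 to 3
dags["C", "B", c(2, 3, 4)] <- 1   # arc B -> C in samples 2 to 4
dags["C", "A", 4] <- 1            # arc A -> C in sample 4

# Arc-wise frequencies: share of samples that contain each arc.
arc_freq <- apply(dags, c(1, 2), mean)

# Query: arc A -> B requested (1), arc A -> C excluded (-1).
q <- matrix(0, 3, 3, dimnames = list(nodes, nodes))
q["B", "A"] <- 1
q["C", "A"] <- -1

# A sample matches when every requested arc is present
# and every excluded arc is absent.
matches <- apply(dags, 3, function(d) {
  all(d[q == 1] == 1) && all(d[q == -1] == 0)
})
query_freq <- mean(matches)

arc_freq
query_freq
```

Here samples 1 to 3 match the query and sample 4 does not, so query_freq is 0.75.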

Author(s)

Gilles Kratzer

References

Kratzer, G., Furrer, R. (2019). "Is a single unique Bayesian network enough to accurately represent your data?". arXiv preprint arXiv:1902.06641.

Lauritzen, S., Spiegelhalter, D. (1988). "Local Computation with Probabilities on Graphical Structures and their Application to Expert Systems (with discussion)". Journal of the Royal Statistical Society: Series B, 50(2):157–224.

Scutari, M. (2010). "Learning Bayesian Networks with the bnlearn R Package". Journal of Statistical Software, 35(3):1–22. doi:10.18637/jss.v035.i03.

Examples

## Example from the asia dataset from Lauritzen and Spiegelhalter (1988)
## provided by Scutari (2010)
data("mcmc_run_asia")

## Return a named matrix with individual arc support
query(mcmcabn = mcmc.2par.asia)

## What is the probability of the LungCancer node being a child of the Smoking node?
query(mcmcabn = mcmc.2par.asia, formula = ~LungCancer|Smoking)

## What is the probability of the Smoking node being a parent of
## both the LungCancer and Bronchitis nodes?
query(mcmcabn = mcmc.2par.asia,
      formula = ~ LungCancer|Smoking+Bronchitis|Smoking)

## What is the probability of the previous statement when there
## is no arc from Smoking to Tuberculosis and none from Bronchitis to XRay?
query(mcmcabn = mcmc.2par.asia,
      formula = ~LungCancer|Smoking + Bronchitis|Smoking -
                  Tuberculosis|Smoking - XRay|Bronchitis)

mcmcabn documentation built on Sept. 28, 2023, 5:08 p.m.