knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(JAGStree)
In many real-world applications, individuals or subjects may appear in multiple data sets, setting the stage for the opportunity to synthesize evidence to use all available data for inference. These relational structure of these data sources often admits a graphical representation, where nodes may be represented by data sources and edges representing the relationship between them (e.g., directed edges for sub-groupings or representation of time). One such representation is as a tree (i.e., a cycle-free graph with directed edges). Performing inference on this structure can be achieved using MCMC methods to obtain estimates of posterior distributions of a Bayesian hierarchical model.
This package is geared towards one subset of cases in which this general framework holds [@flynncomp].
In particular, we assume a number of related data sources exist with a known relational structure described by a tree. We also assume that nodes are associated with integer counts, which are multinomially distributed from the parent node, and branches are associated with a probability, which are Dirichlet distributed among a given sibling group. Then, given a dataframe representing the tree structure, the makeJAGStree()
function will generate appropriate MCMC modeling code to be used with 'JAGS', to run a Bayesian hierarchical model.
The data frame must contain only two columns:
'from' (string, node label)
'to' (string, node label)
The 'from' and 'to' labels encode the tree structure, describing the directed edge between nodes.
This makeJAGStree
function has three parameters, two of which are required and one which is optional:
data
: This required argument must be a data frame with at least two columns: from and to. Together, these columns specify the tree structure. Other columns may also be included; the AutoWMM
package can additionally be used to render tree diagrams and perform root node population size estimation with the weighted multiplier method, if appropriate [@flynnmethods, @flynnapplication]
prior
: This optional argument specifies the prior chosen for the root node population size. The default is lognormal
, while a uniform
prior is also an option.
filename
: This required argument is used to name the 'JAGS' code output file. It must end in .mod
or .txt
for correct functionality.
The Bayesian model implied behind this 'JAGS' code is described at length elsewhere [@flynncomp], and is based on previously developed methodology [@flynnmethods]. A more extensive application to real-world data can be found in [@flynnapplication].
The functionality of the package is best demonstrated using simple trees. The AutoWMM
package can also be used to help render and visualize each of these trees:
data <- data.frame("from" = c("Z", "Z", "A", "A"), "to" = c("A", "B", "C", "D"), "Estimate" = c(4, 34, 9, 1), "Total" = c(11, 70, 10, 10), "Count" = c(NA, 500, NA, 50), "Population" = c(FALSE, FALSE, FALSE, FALSE), "Description" = c("First child of the root", "Second child of the root", "First grandchild", "Second grandchild")) # optional use of the AutoWMM package to show tree structure Sys.setenv("RGL_USE_NULL" = TRUE) library(AutoWMM) library(DiagrammeR) tree <- makeTree(data) drawTree(tree)
The function can be directly applied to the data, and will produce a .mod
or .txt
file, depending on which is specified. A uniform prior can also be chosen for the root node, with parameters that will be specified when running 'JAGS':
makeJAGStree(data1, filename=file.path(tempdir(), "data1_JAGSscript.mod")) makeJAGStree(data1, filename=file.path(tempdir(), "data1_JAGSscript.txt", prior="uniform"))
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.