BuildNodeManifest: Builds a table describing a set of Pnodes
In ralmond/Peanut: Parameterized Bayesian Networks, Abstract Classes

BuildNodeManifest

R Documentation

Description

A node manifest is a table where each line describes one state of a node in a Bayesian network. Because a node manifest may contain nodes from more than one network, the key for the table is the first two columns: “Model” and “NodeName”. Its primary purpose is to be given to a Node Warehouse, which uses it to create nodes on demand.

Usage

BuildNodeManifest(Pnodelist)

Arguments

Pnodelist

A list of Pnode objects from which the table will be built.

Details

A node manifest is a table (data frame) which describes a collection of nodes. It contains mostly meta-data about the nodes, not the information about the relationships between the nodes, which is contained in the Q-matrix (Pnet2Qmat) or the Ω-matrix (Pnet2Omega). The role of the node manifest is to be used to create a Node Warehouse, which is an argument to the Qmat2Pnet and Omega2Pnet commands and creates nodes as they are referenced. Hence it contains the information about the node which is not part of the Q or Ω matrix.

The Q-matrix can span multiple Bayesian networks. The same variable can appear with the same name but slightly different definitions in two different networks. Consequently, the key for this table is the “Model” and “NodeName” columns (usually the first two). The function WarehouseData, when applied to a node warehouse, should be given a key of length 2 (model and node name); it returns multiple rows, one for each state of the node.
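The two-column key can be illustrated with plain data-frame subsetting. The following is a minimal sketch in base R, not Peanut's implementation: the toy manifest, its model and node names, and the helper `lookup_node()` are hypothetical, and only a few of the manifest columns are shown.

```r
## Toy manifest: the same node name appears in two different models.
manifest <- data.frame(
  Model     = c("ModelA", "ModelA", "ModelB", "ModelB", "ModelB"),
  NodeName  = rep("Skill", 5),
  Nstates   = c(2L, 2L, 3L, 3L, 3L),
  StateName = c("Yes", "No", "High", "Medium", "Low"),
  stringsAsFactors = FALSE
)

## Hypothetical helper: return every row matching a length-2 key
## (model name, node name).
lookup_node <- function(manifest, key) {
  stopifnot(length(key) == 2L)
  manifest[manifest$Model == key[1] & manifest$NodeName == key[2], ,
           drop = FALSE]
}

## One row comes back for each state of the requested node.
skillB <- lookup_node(manifest, c("ModelB", "Skill"))
nrow(skillB)
skillB$StateName
```

Because the node name alone is ambiguous here, both parts of the key are needed to pick out a single node.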

The columns “ModelHub”, “NodeTitle”, “NodeDescription” and “NodeLabels” provide meta-data about the node. They may be missing or empty strings, indicating that meta-data is unavailable.

The columns “Nstates” and “StateName” are required. The number of states should be an integer (2 or greater) and there should be as many rows with this model and node name as there are states. Each row should have a unique value for “StateName”. The “StateTitle”, “StateDescription” and “StateValue” columns are optional, although if the variable is to be used as a parent variable, it is strongly recommended to set the state values.
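These requirements can be checked mechanically before a manifest is handed to a warehouse. The sketch below uses base R only; `check_manifest()` is a hypothetical helper written for this illustration and is not part of Peanut.

```r
## Hypothetical check (not part of Peanut): for every (Model, NodeName)
## key, Nstates must be constant and at least 2, the number of rows must
## equal Nstates, and the state names must be unique.
check_manifest <- function(manifest) {
  groups <- split(manifest,
                  list(manifest$Model, manifest$NodeName),
                  drop = TRUE)
  ok <- vapply(groups, function(rows) {
    n <- rows$Nstates[1]
    all(rows$Nstates == n) && n >= 2 &&
      nrow(rows) == n && anyDuplicated(rows$StateName) == 0L
  }, logical(1))
  all(ok)
}

good <- data.frame(
  Model     = rep("ModelA", 3),
  NodeName  = rep("Skill", 3),
  Nstates   = 3L,
  StateName = c("High", "Medium", "Low"),
  stringsAsFactors = FALSE
)
bad <- good[-3, ]   # Nstates still says 3, but only two rows remain

check_manifest(good)
check_manifest(bad)
```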

Value

An object of class data.frame with the following columns.

Node-level Key Fields:

`Model`	A character value giving the name of the Bayesian network to which this node belongs. Corresponds to the value of `PnodeNet`.
`NodeName`	A character value giving the name of the node. All rows with the same value in the model and node name columns are assumed to reference the same node. Corresponds to the value of `PnodeName`.

Node-level Fields:

`ModelHub`	If this is a spoke model (meant to be attached to a hub) then this is the name of the hub model (i.e., the name of the proficiency model corresponding to an evidence model). Corresponds to the value of `PnetHub(PnodeNet(node))`.
`NodeTitle`	A character value containing a slightly longer description of the node. Unlike the name, this is not generally restricted to variable name formats. Corresponds to the value of `PnodeTitle`.
`NodeDescription`	A character value describing the node, meant for human consumption (documentation). Corresponds to the value of `PnodeDescription`.
`NodeLabels`	A comma-separated list of identifiers of the sets to which this node belongs. Used to identify special subsets of nodes (e.g., high-level nodes or observable nodes). Corresponds to the value of `PnodeLabels`.

State-level Key Fields:

`Continuous`	A logical value. If true, the variable will be continuous, with states corresponding to ranges of values. If false, the variable will be discrete, with named states.
`Nstates`	The number of states. This should be an integer greater than or equal to 2. Corresponds to the value of `PnodeNumStates`.
`StateName`	The name of the state. This should be a string value and it should be different for every row within the subset of rows corresponding to a single node. Corresponds to the value of `PnodeStates`.

State-level Fields:

`StateTitle`	A longer name not subject to variable naming restrictions. Corresponds to the value of `PnodeStateTitles`.
`StateDescription`	A human readable description of the state (documentation). Corresponds to the value of `PnodeStateDescriptions`.
`StateValue`	A real numeric value assigned to this state. Corresponds to the value of `PnodeStateValues`. Note that this has a different meaning for discrete and continuous variables. For discrete variables, this associates a numeric value with each level, which is used in calculating the `PnodeEAP` and `PnodeSD` functions. In the continuous case, this value is ignored and the midpoint between the “LowerBound” and “UpperBound” is used instead.
`LowerBound`	This serves as the lower bound for each partition of the continuous variable. `-Inf` is a legal value for the first or last row.
`UpperBound`	This is only used for continuous variables, and the value is only needed for one of the states. This serves as the upper bound of the range of each state. Note that the upper bound needs to match the lower bound of the next state. `Inf` is a legal value for the first or last row.

Logging

BuildNodeManifest uses the flog.logger mechanism to log progress. To see progress messages, use flog.threshold(DEBUG) (or TRACE).

Continuous Variables

Peanut (following Netica) treats continuous variables as discrete variables whose states correspond to ranges of an underlying continuous variable. Unfortunately, this overloads the meaning of PnodeStateValues, and consequently the “StateValue” column.

Discrete Variables. The states of the discrete variables are defined by the “StateName” fields. If values are supplied in “StateValue”, then these values are used in calculating expected a posteriori statistics, PnodeEAP() and PnodeSD(). The “LowerBound” and “UpperBound” fields are ignored.
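To show how state values enter such statistics, the following sketch computes an expected value and a standard deviation from a set of state probabilities in base R. It shows the standard arithmetic only: the probabilities and state values are made up, and the code is an assumption about the kind of calculation involved, not a copy of PnodeEAP() or PnodeSD().

```r
## Made-up state values and posterior probabilities for a 3-state node.
state_value <- c(High = 1, Medium = 0, Low = -1)
posterior   <- c(High = 0.5, Medium = 0.3, Low = 0.2)

## Expected value: probability-weighted mean of the state values.
eap <- sum(posterior * state_value)

## Standard deviation: square root of the probability-weighted
## squared deviations from that mean.
post_sd <- sqrt(sum(posterior * (state_value - eap)^2))

eap
post_sd
```

Without numeric state values there is nothing to average, which is why the values matter for nodes that are summarized this way.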

Continuous Variables. The states of the continuous variable are defined by breaking the range up into a series of intervals. Right now the intervals must be adjacent (the upper bound of one must match the lower bound of the next) and cannot overlap. This is done by supplying a “LowerBound” and “UpperBound” for each state. If the upper and lower bounds do not match, then an error is signaled.
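The adjacency requirement can be sketched as a small check in base R. The helper `check_bounds()` is hypothetical and is not Peanut's code; it assumes the states are listed in order, so that each state's upper bound is compared with the lower bound of the next state.

```r
## Hypothetical helper (not Peanut's code): verify that the intervals are
## adjacent, i.e. each upper bound equals the next state's lower bound.
check_bounds <- function(lower, upper) {
  n <- length(lower)
  stopifnot(length(upper) == n, n >= 2)
  if (any(upper[-n] != lower[-1])) {
    stop("Upper and lower bounds of adjacent states do not match.")
  }
  invisible(TRUE)
}

## Three adjacent intervals covering the whole real line.
lower <- c(-Inf, 0, 10)
upper <- c(0, 10, Inf)
check_bounds(lower, upper)   # passes silently

## A gap between 5 and 10 triggers the error; capture its message.
mismatch <- tryCatch(check_bounds(lower, c(0, 5, Inf)),
                     error = function(e) conditionMessage(e))
mismatch
```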

Author(s)

Russell Almond

References

Almond, R. G. (presented 2017, August). Tabular views of Bayesian networks. In John-Mark Agosta and Tomas Singliar (Chairs), Bayesian Modeling Application Workshop 2017. Symposium conducted at the meeting of the Association for Uncertainty in Artificial Intelligence, Sydney, Australia. Retrieved from http://bmaw2017.azurewebsites.net/

Examples


## This expression provides an example Node manifest
nodeman1 <- read.csv(system.file("auxdata", "Mini-PP-Nodes.csv",
                                package="Peanut"),
                     row.names=1,stringsAsFactors=FALSE)

## Not run: 
library(PNetica) ## Requires PNetica
sess <- NeticaSession()
startSession(sess)

netpath <- system.file("testnets",package="PNetica")
netnames <- paste(c("miniPP-CM","PPcompEM","PPconjEM","PPtwostepEM",
                  "PPdurAttEM"),"dne",sep=".")

Nets <- ReadNetworks(file.path(netpath,netnames),
                     session=sess)

CM <- Nets[[1]]
EMs <- Nets[-1]

nodeman <- BuildNodeManifest(lapply(NetworkAllNodes(CM),as.Pnode))

for (n in seq_along(EMs)) {
  nodeman <- rbind(nodeman,
                    BuildNodeManifest(lapply(NetworkAllNodes(EMs[[n]]),
                                             as.Pnode)))
}

## Need to ensure that labels are in canonical order only for the
## purpose of testing
nodeman[,6] <- sapply(strsplit(nodeman[,6],","),
      function(l) paste(sort(l),collapse=","))
nodeman1[,6] <- sapply(strsplit(nodeman1[,6],","),
      function(l) paste(sort(l),collapse=","))

stopifnot(all.equal(nodeman,nodeman1))

## This is the node warehouse for PNetica
Nodehouse <- NNWarehouse(manifest=nodeman1,
                         key=c("Model","NodeName"),
                         session=sess)
phyd <- WarehouseData(Nodehouse,c("miniPP_CM","Physics"))
p3 <- MakePnode.NeticaNode(CM,"Physics",phyd)

attd <- WarehouseData(Nodehouse,c("PPdurAttEM","Attempts"))
att <- MakePnode.NeticaNode(Nets[[5]],"Attempts",attd)

durd <- WarehouseData(Nodehouse,c("PPdurAttEM","Duration"))
dur <- MakePnode.NeticaNode(Nets[[5]],"Duration",durd)


stopSession(sess)


## End(Not run)

ralmond/Peanut documentation built on Sept. 19, 2023, 8:27 a.m.