Omega2Pnet | R Documentation |
An Omega matrix (represented as a data frame) is a structure which
describes a Bayesian network as a series of regressions from the
parent nodes to the child nodes. It actually contains two matrixes,
one giving the structure and the other the regression coefficients.
A skeleton matrix can be constructed through the function Pnet2Omega.
Omega2Pnet(OmegaMat, pn, nodewarehouse, defaultRule = "Compensatory",
defaultLink = "normalLink", defaultAlpha = 1, defaultBeta = 0,
defaultLinkScale = 1, defaultPriorWeight=10, debug = FALSE, override =FALSE,
addTvals = TRUE)
OmegaMat |
A data frame containing an Omega matrix (see values
section of |
pn |
A (possibly empty) |
nodewarehouse |
A Node Warehouse which contains instructions for building nodes referenced in the Omega matrix but not in the network. |
defaultRule |
This should be a character scalar giving the name
of a CPTtools combination rule (see
|
defaultLink |
This should be a character scalar giving the name
of a CPTtools link function (see |
defaultAlpha |
A numeric scalar giving the default value for slope parameters. |
defaultBeta |
A numeric scalar giving the default value for difficulty (negative intercept) parameters. |
defaultLinkScale |
A positive number which gives the default value for the link scale parameter. |
defaultPriorWeight |
A positive number which gives the default value for the node prior weight hyper-parameter. |
debug |
A logical scalar. If true then
|
override |
A logical value. If false, differences between any existing structure in the graph and the Omega matrix will raise an error. If true, the graph will be modified to conform to the matrix. |
addTvals |
A logical value. If true, nodes which do not have
state values set will have those state values set using the
function |
Whittaker (1990) noted that a normal Bayesian network (one in which all nodes followed a standard normal distribution) could be described using the inverse of the covariance matrix, often denoted Omega. In particular, zeros in the inverse covariance matrix represented variables which were conditionally independent, and therefore reducing the matrix to one with positive and zero values could provide the structure for a graphical model. Almond (2010) proposed using this as the basis for specifying discrete Bayesian networks for the proficiency model in educational assessments (especially as correlation matrixes among latent variables are a possible output of a factor analysis).
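The link between zeros in the inverse covariance matrix and conditional independence can be checked numerically. The following is a minimal sketch in base R, assuming a made-up three-variable chain X1 -> X2 -> X3 of standard normal variables with correlation rho between neighbours; neither the variables nor the value of rho come from the package.

```r
## Assumed toy model: a chain X1 -> X2 -> X3 of standard normal variables.
## Neighbours have correlation rho, so cor(X1, X3) = rho^2.
rho <- 0.6
Sigma <- rho^abs(outer(1:3, 1:3, "-"))   # covariance (here also correlation) matrix

## Omega is the inverse covariance matrix.
Omega <- solve(Sigma)

## The [1, 3] entry is zero (up to rounding error) because X1 and X3 are
## conditionally independent given X2; the [1, 2] and [2, 3] entries are not.
print(zapsmall(Omega))
```

The zero pattern of Omega reproduces the edges of the chain, which is the property the Omega matrix representation borrows.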
The Omega matrix is represented with a data.frame object which contains two square submatrixes and a couple of auxiliary columns. The first column should be named “Node” and contains the names of the nodes. This defines the collection of nodes described by the Omega matrix. Let J be the number of nodes (rows in the data frame). The next J columns should have the names of the nodes. Their values give the structural component of the matrix. The following two columns are “Link” and “Rules”; these give the names of the link function and the combination rule to use for this row. Next follows another series of J “A” columns, each of which should have a name of the form “A.node”. This defines a matrix A containing regression coefficients. Finally, there should be two additional columns, “Intercept” and “PriorWeight”.
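As an illustration of this layout, the following base R sketch assembles a small data frame by hand. The node names (Skill1, Skill2, Skill3) and all numeric values are hypothetical and chosen only to show the column order; in practice a skeleton would normally come from Pnet2Omega.

```r
## Hypothetical three-node example; names and numbers are made up.
nodes <- c("Skill1", "Skill2", "Skill3")
J <- length(nodes)

## Structural part: Q[i, j] is 1 when node j is a parent of node i,
## with 1 on the diagonal by convention.
Q <- diag(J)
dimnames(Q) <- list(nodes, nodes)
Q["Skill2", "Skill1"] <- 1
Q["Skill3", c("Skill1", "Skill2")] <- 1

## Coefficient part: slopes off the diagonal, residual SDs on the diagonal.
A <- matrix(NA_real_, J, J, dimnames = list(nodes, paste0("A.", nodes)))
diag(A) <- c(1, 0.6, 0.5)
A["Skill2", "A.Skill1"] <- 0.8
A["Skill3", "A.Skill1"] <- 0.7
A["Skill3", "A.Skill2"] <- 0.5

## Assemble the columns in the documented order.
omega <- data.frame(Node = nodes, Q,
                    Link = "normalLink", Rules = "Compensatory",
                    A, Intercept = c(0, 0.5, -0.5), PriorWeight = 10,
                    stringsAsFactors = FALSE)
print(names(omega))
```

Entries of A that do not correspond to an edge are left as NA here; that is a choice made for this sketch, not a documented requirement.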
Let Q be the logical matrix formed by the J columns after the first and let A be the matrix of coefficients. The matrix Q gives the structure of the graph, with Q[i,j] being true when Node j is a parent of Node i. By convention, Q[j,j]=1. Note that unlike the inverse covariance matrix from which it gets its name, this matrix is not symmetric. It instead reflects the (possibly arbitrary) directions assigned to the edges. Except for the main diagonal, Q[i,j] and Q[j,i] will not both be 1. Note also that A[i,j] should be positive only when Q[i,j]=1. This provides an additional check that structures were correctly entered if the Omega matrix is being used for data entry.
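The consistency conditions just described are easy to test before handing a matrix to Omega2Pnet. The helper below is a hypothetical base R sketch (check_omega_structure is not a Peanut function); it assumes Q and A have already been extracted as J by J matrices with rows and columns in the same node order.

```r
## Hypothetical helper, not part of Peanut: returns a character vector of
## problems found, or character(0) if the three conventions hold.
check_omega_structure <- function(Q, A) {
  Q <- as.matrix(Q) != 0            # treat non-zero entries as TRUE
  A <- as.matrix(A)
  offdiag <- row(Q) != col(Q)
  problems <- character(0)
  if (!all(diag(Q)))
    problems <- c(problems, "diagonal of Q is not all 1")
  if (any(Q & t(Q) & offdiag))
    problems <- c(problems, "Q[i,j] and Q[j,i] are both 1 for some i != j")
  if (any(!is.na(A) & A > 0 & !Q))
    problems <- c(problems, "positive A[i,j] where Q[i,j] is 0")
  problems
}

## Node 1 is a parent of node 2: consistent.
Q_ok <- rbind(c(1, 0), c(1, 1))
A_ok <- rbind(c(1, 0), c(0.8, 0.6))
print(check_omega_structure(Q_ok, A_ok))

## Both directions marked: flagged.
Q_bad <- rbind(c(1, 1), c(1, 1))
print(check_omega_structure(Q_bad, A_ok))
```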
When the link function is set to normalLink and the rule is set to Compensatory, the model is described as a series of regressions. Consider Node j, which has K parents. Let \theta_j be a real value corresponding to that node, let \theta_k be a real (standard normal) value representing Parent Node k, and let a_k represent the corresponding coefficient from the A-table. Let \sigma_j = a_{j,j}, that is, the diagonal element of the A-table corresponding to the variable under consideration. Let b_j be the value of the intercept column for Node j. Then the model specifies that \theta_j has a normal distribution with mean
\frac{1}{\sqrt{K}}\sum a_k\theta_k + b_j,
and standard deviation \sigma_j. The regression is discretized to calculate the conditional probability table (see normalLink for details).
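To make the regression concrete, the sketch below evaluates the conditional mean and standard deviation for one node. The function node_regression is hypothetical (it is not how normalLink is implemented, and it skips the discretization step); the parent values and coefficients are made up.

```r
## Hypothetical helper: mean and SD of theta_j given its K parents,
## following the formula (1/sqrt(K)) * sum(a_k * theta_k) + b_j.
node_regression <- function(theta_parents, a, b, sigma) {
  K <- length(theta_parents)
  stopifnot(K >= 1, length(a) == K, sigma > 0)
  list(mean = sum(a * theta_parents) / sqrt(K) + b,
       sd   = sigma)
}

## Two parents, one a standard deviation above and one below average.
res <- node_regression(theta_parents = c(1, -1),
                       a = c(0.8, 0.6), b = 0.5, sigma = 0.6)
print(res)
```

With these inputs the mean works out to (0.8 - 0.6)/sqrt(2) + 0.5, about 0.641.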
Note that the parameters are deliberately chosen to look like a regression model. In particular, b_j is a normal intercept and not a difficulty parameter, so that in general PnodeBetas applied to the corresponding node will have the opposite sign. The 1/\sqrt{K} term is a variance stabilization parameter so that the variance of \theta_j will not be affected by the number of parents of Node j. The multiple R-squared for the regression model is
\frac{1/K \sum a_k^2}{ 1/K \sum a_k^2 + \sigma_j^2} .
This is often a more convenient parameter to elicit than \sigma_j.
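Because R-squared is often easier to elicit than \sigma_j, it can help to convert between the two. The two functions below are a hypothetical base R sketch of that conversion, obtained by solving the R-squared formula for \sigma_j; they are not part of Peanut.

```r
## Multiple R-squared implied by slopes a and residual SD sigma.
omega_r2 <- function(a, sigma) {
  explained <- mean(a^2)            # (1/K) * sum(a_k^2)
  explained / (explained + sigma^2)
}

## Residual SD that yields a target R-squared for given slopes.
sigma_from_r2 <- function(a, r2) {
  stopifnot(r2 > 0, r2 < 1)
  sqrt(mean(a^2) * (1 - r2) / r2)
}

a <- c(0.8, 0.6)
r2 <- omega_r2(a, sigma = 0.5)      # 0.5 / (0.5 + 0.25) = 2/3
print(r2)
print(sigma_from_r2(a, r2))         # recovers 0.5
```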
The function Omega2Pnet attempts to make adjustments to its pn argument (referred to as pnet in what follows), which should be a Pnet, so that it conforms to the information given in the Omega matrix. Nodes are created as necessary using information in the nodewarehouse argument, which should be a Warehouse object whose manifest includes instructions for building the nodes in the network. The warehouse supply function should either return an existing node in pnet or create a new node in pnet. The structure of the graph is adjusted to correspond to the Q-matrix (structural part of the data frame). If the value of the override argument is false, an error is raised if there is existing structure with a different topology. If override is true, then the pnet is destructively altered to conform to the structural information in the Omega matrix.
The “Link” and “Rules” columns are used to set the values of PnodeLink(node) and PnodeRules(node). The off-diagonal elements of the A-matrix are used to set PnodeAlphas(node) and the diagonal elements to set PnodeLinkScale(node). The values in the “Intercept” column are the negatives of the values of PnodeBetas(node). Finally, the values in the “PriorWeight” column correspond to the values of PnodePriorWeight(node). In any of these cases, if the value in the Omega matrix is missing, then the default value will be supplied instead.
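The mapping from one row of the Omega matrix to node parameters can be sketched in base R as follows. The function row_to_params is hypothetical and only returns a list; it does not call any Pnode setters, and its treatment of missing values (substitute the corresponding default) is one reading of the description above rather than the package's actual code. Node names and values are made up.

```r
## Hypothetical helper: derive node parameters from a one-row data frame.
row_to_params <- function(row, nodes, defaultAlpha = 1, defaultBeta = 0,
                          defaultLinkScale = 1, defaultPriorWeight = 10) {
  me <- row$Node
  is_parent <- unlist(row[nodes]) %in% 1
  parents <- setdiff(nodes[is_parent], me)

  ## Off-diagonal A entries become the slopes (alphas).
  alphas <- as.numeric(unlist(row[paste0("A.", parents)]))
  alphas[is.na(alphas)] <- defaultAlpha
  names(alphas) <- parents

  or_default <- function(x, default) if (is.na(x)) default else x
  intercept <- row$Intercept
  list(parents     = parents,
       alphas      = alphas,
       ## The diagonal A entry is the link scale (residual SD).
       linkScale   = or_default(row[[paste0("A.", me)]], defaultLinkScale),
       ## Betas are difficulties, i.e. negative intercepts.
       betas       = if (is.na(intercept)) defaultBeta else -intercept,
       priorWeight = or_default(row$PriorWeight, defaultPriorWeight))
}

nodes <- c("Skill1", "Skill2", "Skill3")
row3 <- data.frame(Node = "Skill3", Skill1 = 1, Skill2 = 1, Skill3 = 1,
                   Link = "normalLink", Rules = "Compensatory",
                   A.Skill1 = 0.7, A.Skill2 = NA, A.Skill3 = 0.5,
                   Intercept = -0.5, PriorWeight = NA,
                   stringsAsFactors = FALSE)
p <- row_to_params(row3, nodes)
str(p)
```

Here the missing A.Skill2 and PriorWeight entries fall back to defaultAlpha and defaultPriorWeight, and the intercept of -0.5 becomes a beta of 0.5.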
One challenge is setting up a matrix with the correct structure. If the nodes have been defined, then the Pnet2Omega function can be used to create a blank matrix with the proper format, which can then be edited.
The network pnet is returned. Note that it is destructively modified by the commands to conform to the Omega matrix.
An Omega Matrix should be an object of class data.frame with number of rows equal to the number of nodes. Throughout let node stand for the name of a node.
Node |
The name of the node described in this row. |
node |
One column for each node. The value in this column should be 1 if the node in the column is regarded as a parent of the node referenced in the row. |
Link |
The name of a link function. Currently, “normalLink” is the only value supported. |
Rules |
The name of the combination rule to use. Currently, “Compensatory” is recommended. |
A.node |
One column for each node. This should be a positive value if the corresponding node column has a 1. This gives the regression coefficient. If node corresponds to the current row, this is the residual standard deviation rather than a regression coefficient. See details. |
Intercept |
A numeric value giving the change in prevalence for the two variables (see details). |
PriorWeight |
The amount of weight which should be given to the current values when learning conditional probability tables. See PnodePriorWeight. |
As of version 0.6-2, the meaning of the debug argument is changed. In the new version, the flog.logger mechanism is used for progress reports and error reporting. In particular, setting flog.threshold(DEBUG) (or TRACE) will cause progress reports to be sent to the logging output.
The debug argument has been repurposed. It now calls recover when the error occurs, so that the problem can be debugged.
This function destructively modifies pnet and nodes referenced in the Omega matrix and supplied by the warehouses.
Note that unlike typical R implementations, this is not necessarily safe. In particular, if the Omega matrix references 10 nodes, and an error is raised when trying to modify the 5th node, the first 4 nodes will be modified, the last 5 will not be and the 5th node may be partially modified. This is different from most R functions where changes are not committed unless the function returns successfully.
While the Omega matrix allows the user to specify both link function and combination rule, the description of the Bayesian network as a series of regressions only really makes sense when the link function is normalLink and the combination rule is Compensatory. These are included for future expansion.
The representation, using a single row of the data frame for each node in the graph, only works well with the normal link function. In particular, both the partial credit and graded response links require the ability to specify different intercepts for different states of the variable, something which is not supported in the Omega matrix. Furthermore, the OffsetConjunctive rule requires multiple intercepts. Presumably the Conjunctive rule could be used, but the interpretation of the slope parameters is then unclear. If the variables need a model other than the compensatory normal model, it might be better to use a Q-matrix (see Pnet2Qmat) to describe the variable.
Russell Almond
Whittaker, J. (1990). Graphical Models in Applied Multivariate Statistics. Wiley.
Almond, R. G. (2010). ‘I can name that Bayesian network in two matrixes.’ International Journal of Approximate Reasoning. 51, 167-178.
Almond, R. G. (presented 2017, August). Tabular views of Bayesian networks. In John-Mark Agosta and Tomas Singliar (Chair), Bayesian Modeling Application Workshop 2017. Symposium conducted at the meeting of Association for Uncertainty in Artificial Intelligence, Sydney, Australia. (International) Retrieved from http://bmaw2017.azurewebsites.net/
The inverse operation is Pnet2Omega.
See Warehouse for description of the node warehouse argument.
See normalLink and Compensatory for more information about the mathematical model.
The node attributes set from the Omega matrix include: PnodeParents(node), PnodeLink(node), PnodeLinkScale(node), PnodeRules(node), PnodeAlphas(node), PnodeBetas(node), and PnodePriorWeight(node)
## Sample Omega matrix.
omegamat <- read.csv(system.file("auxdata", "miniPP-omega.csv",
                                 package="Peanut"),
                     row.names=1, stringsAsFactors=FALSE)
## Not run:
library(PNetica) ## Needs PNetica
sess <- NeticaSession()
startSession(sess)
curd <- getwd()
netman1 <- read.csv(system.file("auxdata", "Mini-PP-Nets.csv",
                                package="Peanut"),
                    row.names=1, stringsAsFactors=FALSE)
nodeman1 <- read.csv(system.file("auxdata", "Mini-PP-Nodes.csv",
                                 package="Peanut"),
                     stringsAsFactors=FALSE)
## Ensures we are building nets from scratch
setwd(tempdir())
## Network and node warehouse, to create networks and nodes on demand.
Nethouse <- BNWarehouse(manifest=netman1, session=sess, key="Name")
Nodehouse <- NNWarehouse(manifest=nodeman1,
                         key=c("Model","NodeName"),
                         session=sess)
## Build the network from the Omega matrix, then convert it back.
CM <- WarehouseSupply(Nethouse, "miniPP_CM")
CM1 <- Omega2Pnet(omegamat, CM, Nodehouse, override=TRUE, debug=TRUE)
Om2 <- Pnet2Omega(CM1, NetworkAllNodes(CM1))
## Clean up: delete the network, stop Netica, restore the working directory.
DeleteNetwork(CM)
stopSession(sess)
setwd(curd)
## End(Not run)