Description Usage Arguments Details Value Logic Sampling Likelihood Weighting Author(s) References Examples
Perform conditional probability queries (CPQs).
fitted |
an object of class |
x |
an object of class |
event, evidence |
see below. |
nodes |
a vector of character strings, the labels of the nodes whose conditional distribution we are interested in. |
cluster |
an optional cluster object from package parallel.
See |
method |
a character string, the method used to perform the conditional
probability query. Currently only logic sampling ( |
... |
additional tuning parameters. |
debug |
a boolean value. If |
cpquery estimates the conditional probability of event given evidence using the method specified in the method argument.
cpdist generates random observations conditional on the evidence using the method specified in the method argument.
mutilated constructs the mutilated network used for sampling in likelihood weighting.
cpquery returns a numeric value, the probability of event conditional on evidence.
cpdist returns a data frame containing the observations generated from the conditional distribution of the nodes conditional on evidence. The data frame has class c("bn.cpdist", "data.frame"), and a method attribute storing the value of the method argument. In the case of likelihood weighting, the weights are also attached as an attribute called weights.
mutilated returns a bn or bn.fit object, depending on the class of x.
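The return values described above can be inspected directly. The following is a minimal sketch, assuming bnlearn is installed and using the learning.test data set that ships with it; the evidence list(C = "c") is chosen only for illustration.

```r
library(bnlearn)

data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)

# observations for A generated with likelihood weighting, given C == "c".
sim = cpdist(fitted, nodes = "A", evidence = list(C = "c"), method = "lw")
class(sim)
# the sampling method is stored as an attribute...
attr(sim, "method")
# ... and so are the weights, one per generated observation.
head(attr(sim, "weights"))

# mutilated network for the same evidence; fitted is a bn.fit object,
# so a bn.fit object is returned.
mut = mutilated(fitted, evidence = list(C = "c"))
class(mut)
```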
The event and evidence arguments must be two expressions describing the event of interest and the conditioning evidence in a format such that, if we denote with data the data set the network was learned from, data[evidence, ] and data[event, ] return the correct observations.
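To make this format concrete, the sketch below evaluates two such expressions against a small made-up data frame (toy is hypothetical, not a bnlearn data set) and computes the corresponding empirical frequency. It only illustrates how the expressions are read as logical conditions on the columns; it is not meant to reproduce how cpquery works internally.

```r
# hypothetical toy data set with two discrete variables.
toy = data.frame(
  A = factor(c("a", "a", "b", "b", "a", "b", "a", "a")),
  B = factor(c("b", "c", "b", "b", "b", "c", "c", "b"))
)

# evidence and event are logical conditions on the columns.
evidence = with(toy, A == "a")
event = with(toy, B == "b")

# the rows matching the evidence.
toy[evidence, ]

# empirical frequency of B == "b" among the rows with A == "a":
# 3 of the 5 matching rows, that is 0.6.
p.hat = sum(event & evidence) / sum(evidence)
p.hat
```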
If either event or evidence is set to TRUE an unconditional probability query is performed with respect to that argument.
Three tuning parameters are available:
n: a positive integer number, the number of random observations to generate from fitted. Defaults to 5000 * log10(nparams.fitted(fitted)) for discrete networks and 500 * nparams.fitted(fitted) for Gaussian networks.
batch: a positive integer number, the size of each batch of random observations. Defaults to 10^4.
query.nodes: a vector of character strings, the labels of the nodes involved in event and evidence. Simple queries do not require generating observations from all the nodes in the network, so cpquery and cpdist try to identify which nodes are used in event and evidence and reduce the network to their upper closure. query.nodes may be used to manually specify these nodes when automatic identification fails; there is no reason to use it otherwise.
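As a sketch of how these tuning parameters are passed, the call below raises the number of generated observations and the batch size. It assumes bnlearn is installed; the values 10^6 and 10^5 are arbitrary and chosen only for illustration.

```r
library(bnlearn)

data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)

# generate 10^6 observations in batches of 10^5 instead of using the
# defaults; more observations reduce the variability of the estimate.
p = cpquery(fitted, event = (B == "b"), evidence = (A == "a"),
            n = 10^6, batch = 10^5)
p
```

A larger n trades running time for precision, while batch mainly affects how much memory is used at any one time.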
Note that the number of observations returned by cpdist is generally smaller than n, because logic sampling is a form of rejection sampling. Only the observations matching evidence (out of the n that are generated) are returned, and their number depends on the probability of evidence.
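The rejection step can be sketched in base R on a hypothetical two-node network A -> B. The probabilities below are made up for illustration, and the code is a simplified stand-in for the idea, not bnlearn's implementation.

```r
set.seed(42)
n = 10000

# hypothetical network A -> B with made-up probabilities.
p.A = c(a = 0.3, b = 0.7)
p.B.given.A = rbind(a = c(b = 0.8, c = 0.2),
                    b = c(b = 0.4, c = 0.6))

# forward sampling: draw A first, then B given the sampled value of A.
A = sample(names(p.A), n, replace = TRUE, prob = p.A)
B = ifelse(runif(n) < unname(p.B.given.A[A, "b"]), "b", "c")
samples = data.frame(A = A, B = B)

# rejection step: keep only the observations matching the evidence A == "a".
kept = samples[samples$A == "a", ]

# the number of retained rows is roughly n * P(A == "a") = 3000.
nrow(kept)

# estimate of P(B == "b" | A == "a"); the true value in this toy network is 0.8.
mean(kept$B == "b")
```

The rarer the evidence, the fewer observations survive the rejection step, which is why n may need to be increased for unlikely evidence.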
The event argument must be an expression describing the event of interest, as in logic sampling. The evidence argument must be a named list:
Each element corresponds to one node in the network and must contain the value that node will be set to when sampling.
In the case of a continuous node, two values can also be provided. In that case, the value for that node will be sampled from a uniform distribution on the interval delimited by the specified values.
In the case of a discrete or ordinal node, two or more values can also be provided. In that case, the value for that node will be sampled with uniform probability from the set of specified values.
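The three forms of evidence listed above might look as follows in practice. This is a sketch assuming bnlearn is installed; the nodes, values and interval bounds are chosen arbitrarily for illustration.

```r
library(bnlearn)

## discrete network: evidence as a named list.
data(learning.test)
dfit = bn.fit(hc(learning.test), learning.test)

# a single value: A is set to "a" in every generated observation.
p1 = cpquery(dfit, event = (B == "b"), evidence = list(A = "a"),
             method = "lw")

# several values: A is drawn with uniform probability from {"a", "b"}.
p2 = cpquery(dfit, event = (B == "b"), evidence = list(A = c("a", "b")),
             method = "lw")

## Gaussian network: an interval for a continuous node.
data(gaussian.test)
gfit = bn.fit(hc(gaussian.test), gaussian.test)

# two values: C is drawn uniformly from the interval [0, 5].
p3 = cpquery(gfit, event = (A >= 0), evidence = list(C = c(0, 5)),
             method = "lw")

c(p1, p2, p3)
```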
If either event or evidence is set to TRUE an unconditional probability query is performed with respect to that argument.
Tuning parameters are the same as for logic sampling: n, batch and query.nodes.
Note that the observations returned by cpdist are generated from the mutilated network, and need to be weighted appropriately when computing summary statistics (for more details, see the references below). cpquery does that automatically when computing the final conditional probability. Also note that the batch argument is ignored in cpdist for speed and memory efficiency.
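Why the weights matter can be seen in a base R sketch of likelihood weighting on a hypothetical two-node network A -> B with evidence B == "b". The probabilities are made up, and the code is a simplified illustration rather than bnlearn's implementation; the same kind of weighted average would apply to the weights attribute attached to the output of cpdist.

```r
set.seed(42)
n = 10000

# hypothetical network A -> B with made-up probabilities.
p.A = c(a = 0.3, b = 0.7)
p.b.given.A = c(a = 0.8, b = 0.4)   # P(B == "b" | A)

# in the mutilated network B is fixed to "b", so only A is sampled,
# from its unconditional distribution.
A = sample(names(p.A), n, replace = TRUE, prob = p.A)

# each observation is weighted by the probability of the evidence
# given the sampled value of its parent.
w = unname(p.b.given.A[A])

# the unweighted proportion estimates P(A == "a") = 0.3, not the
# conditional probability.
mean(A == "a")

# the weighted proportion estimates P(A == "a" | B == "b"),
# whose true value here is 0.24 / 0.52, about 0.46.
sum(w * (A == "a")) / sum(w)
```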
Marco Scutari
Koller D, Friedman N (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press.
Korb K, Nicholson AE (2010). Bayesian Artificial Intelligence. Chapman & Hall/CRC, 2nd edition.
## discrete Bayesian network (it is the same with ordinal nodes).
library(bnlearn)
data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)
# the result should be around 0.025.
cpquery(fitted, (B == "b"), (A == "a"))
# for a single observation, predict the value of a single
# variable conditional on the others.
var = names(learning.test)
obs = 2
# evidence: all the variables except the third, set to their observed values.
str = paste("(", var[-3], " == '",
        sapply(learning.test[obs, -3], as.character), "')",
        sep = "", collapse = " & ")
str
# event: the third variable, set to its observed value.
str2 = paste("(", var[3], " == '",
         as.character(learning.test[obs, 3]), "')", sep = "")
str2
cpquery(fitted, eval(parse(text = str2)), eval(parse(text = str)))
# do the same with likelihood weighting.
cpquery(fitted, event = eval(parse(text = str2)),
  evidence = as.list(learning.test[obs, -3]), method = "lw")
# conditional distribution of A given C == "c".
table(cpdist(fitted, "A", (C == "c")))
## Gaussian Bayesian network.
data(gaussian.test)
fitted = bn.fit(hc(gaussian.test), gaussian.test)
# the result should be around 0.04.
cpquery(fitted,
  event = ((A >= 0) & (A <= 1)) & ((B >= 0) & (B <= 3)),
  evidence = (C + D < 10))