View source: R/frontend-bootstrap.R
bn.cv  R Documentation 
Perform k-fold or hold-out cross-validation for a learning algorithm or a fixed network structure.
bn.cv(data, bn, loss = NULL, ..., algorithm.args = list(),
loss.args = list(), fit, fit.args = list(), method = "kfold",
cluster, debug = FALSE)
## S3 method for class 'bn.kcv'
plot(x, ..., main, xlab, ylab, connect = FALSE)
## S3 method for class 'bn.kcv.list'
plot(x, ..., main, xlab, ylab, connect = FALSE)
loss(x)
data 
a data frame containing the variables in the model. 
bn 
either a character string (the label of the learning algorithm to be applied to the training data in each iteration) or an object of class bn (a fixed network structure). 
loss 
a character string, the label of a loss function. If none is specified, the default loss function is the Classification Error for Bayesian network classifiers; otherwise, the Log-Likelihood Loss for both discrete and continuous data sets. See below for additional details. 
algorithm.args 
a list of extra arguments to be passed to the learning algorithm. 
loss.args 
a list of extra arguments to be passed to the loss function specified by loss. 
fit 
a character string, the label of the method used to fit the parameters of the network. See bn.fit for details. 
fit.args 
additional arguments for the parameter estimation procedure; see again bn.fit for details. 
method 
a character string, either k-fold, custom-folds or hold-out. See below for details. 
cluster 
an optional cluster object from package parallel. 
debug 
a boolean value. If TRUE a lot of debugging output is printed; otherwise the function is completely silent. 
x 
an object of class bn.kcv or bn.kcv.list. 
... 
additional objects of class bn.kcv or bn.kcv.list to plot alongside the first. 
main, xlab, ylab 
the title of the plot, an array of labels for the boxplot, the label for the y axis. 
connect 
a logical value. If TRUE, the loss estimates of matching cross-validation runs are connected by lines in the plot. 
bn.cv() returns an object of class bn.kcv.list if runs is at least 2, or an object of class bn.kcv if runs is equal to 1.
loss() returns a numeric vector of length equal to runs.
The following cross-validation methods are implemented:
k-fold: the data are split into k subsets of equal size. For each subset in turn, bn is fitted (and possibly learned as well) on the other k - 1 subsets and the loss function is then computed using that subset. Loss estimates for each of the k subsets are then combined to give an overall loss for data.
custom-folds: the data are manually partitioned by the user into subsets, which are then used as in k-fold cross-validation. Subsets are not constrained to have the same size, and every observation must be assigned to one subset.
hold-out: k subsamples of size m are sampled independently without replacement from the data. For each subsample, bn is fitted (and possibly learned) on the remaining nrow(data) - m samples and the loss function is computed on the m observations in the subsample. The overall loss estimate is the average of the k loss estimates from the subsamples.
If cross-validation is used with multiple runs, the overall loss is the average of the loss estimates from the different runs.
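The hold-out scheme described above can be sketched in base R. This is a generic illustration of the resampling logic (here scoring a linear model with a mean squared error loss), not bnlearn's own implementation; the data frame and variable names are made up for the example.

```r
# Generic hold-out cross-validation sketch: k subsamples of size m are
# drawn without replacement; the model is fitted on the remaining
# nrow(d) - m rows and scored on the m held-out rows.
set.seed(7)
d <- data.frame(x = rnorm(200))
d$y <- 2 * d$x + rnorm(200)
k <- 5
m <- 50
losses <- replicate(k, {
  test.idx <- sample(nrow(d), m)             # m rows, without replacement
  fit <- lm(y ~ x, data = d[-test.idx, ])    # fit on the remaining rows
  mean((d$y[test.idx] - predict(fit, d[test.idx, ]))^2)  # test-set MSE
})
overall.loss <- mean(losses)                 # average over the k subsamples
```

The overall loss is the average of the k per-subsample losses, mirroring how bn.cv combines its per-fold estimates.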
Cross-validation methods accept the following optional arguments:
k: a positive integer number, the number of groups into which the data will be split (in k-fold cross-validation) or the number of times the data will be split in training and test samples (in hold-out cross-validation).
m: a positive integer number, the size of the test set in hold-out cross-validation.
runs: a positive integer number, the number of times k-fold or hold-out cross-validation will be run.
folds: a list in which each element corresponds to one fold and contains the indices for the observations that are included in that fold; or a list with an element for each run, in which each element is itself a list of the folds to be used for that run.
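Both shapes of the folds argument can be built with base R alone. A minimal sketch (the row count and fold counts here are illustrative, not defaults of bnlearn):

```r
# Building the two accepted shapes of the 'folds' argument by hand.
n <- 100                                   # pretend the data has 100 rows

# Single run: a list with one element per fold, each holding row indices.
folds <- split(sample(n), rep(1:5, length.out = n))

# Multiple runs: a list with one element per run, each itself a list of folds.
folds.multi <- replicate(3, split(sample(n), rep(1:5, length.out = n)),
                         simplify = FALSE)
```

Every observation is assigned to exactly one fold within each run, as required by the custom-folds method.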
The following loss functions are implemented:
Log-Likelihood Loss (logl): also known as negative entropy or negentropy, it is the negated expected log-likelihood of the test set for the Bayesian network fitted from the training set. Lower values are better.
Gaussian Log-Likelihood Loss (logl-g): the negated expected log-likelihood for Gaussian Bayesian networks. Lower values are better.
Classification Error (pred
): the prediction error
for a single node in a discrete network. Frequentist predictions are used,
so the values of the target node are predicted using only the information
present in its local distribution (from its parents). Lower values are
better.
Posterior Classification Error (pred-lw and pred-lw-cg): similar to the above, but predictions are computed from an arbitrary set of nodes using likelihood weighting to obtain Bayesian posterior estimates. pred-lw applies to discrete Bayesian networks, pred-lw-cg to (discrete nodes in) hybrid networks. Lower values are better.
Exact Classification Error (pred-exact): closed-form exact posterior predictions are available for Bayesian network classifiers. Lower values are better.
Predictive Correlation (cor
): the correlation
between the observed and the predicted values for a single node in a
Gaussian Bayesian network. Higher values are better.
Posterior Predictive Correlation (cor-lw and cor-lw-cg): similar to the above, but predictions are computed from an arbitrary set of nodes using likelihood weighting to obtain Bayesian posterior estimates. cor-lw applies to Gaussian networks and cor-lw-cg to (continuous nodes in) hybrid networks. Higher values are better.
Mean Squared Error (mse
): the mean squared error
between the observed and the predicted values for a single node in a
Gaussian Bayesian network. Lower values are better.
Posterior Mean Squared Error (mse-lw and mse-lw-cg): similar to the above, but predictions are computed from an arbitrary set of nodes using likelihood weighting to obtain Bayesian posterior estimates. mse-lw applies to Gaussian networks and mse-lw-cg to (continuous nodes in) hybrid networks. Lower values are better.
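As a concrete illustration of a log-likelihood loss, the following base-R sketch computes the negated mean log-likelihood of a held-out test set under a single Gaussian node fitted on a training set. This is only the one-node analogue of what bn.cv computes over a whole fitted network; the data are simulated for the example.

```r
# Negated mean log-likelihood of a test set under a Gaussian model
# estimated from the training set. Lower values are better.
set.seed(1)
train <- rnorm(80, mean = 2, sd = 1)   # simulated training sample
test  <- rnorm(20, mean = 2, sd = 1)   # simulated held-out sample
mu    <- mean(train)                   # fitted mean
sigma <- sd(train)                     # fitted standard deviation
logl.loss <- -mean(dnorm(test, mean = mu, sd = sigma, log = TRUE))
```

A model that assigns higher density to the held-out observations yields a lower loss, which is why the log-likelihood losses above are all "lower is better".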
Optional arguments that can be specified in loss.args are:
target: a character string, the label of the target node for prediction in all loss functions but logl, logl-g and logl-cg.
from: a vector of character strings, the labels of the nodes used to predict the target node in pred-lw, pred-lw-cg, cor-lw, cor-lw-cg, mse-lw and mse-lw-cg. The default is to use all the other nodes in the network. Loss functions pred, cor and mse implicitly predict only from the parents of the target node.
n: a positive integer, the number of particles used by likelihood weighting for pred-lw, pred-lw-cg, cor-lw, cor-lw-cg, mse-lw and mse-lw-cg. The default value is 500.
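The role of the n particles in likelihood weighting can be sketched in base R. This is a toy one-node posterior-mean estimate under made-up prior and evidence parameters, not bnlearn's implementation:

```r
# Toy likelihood weighting: estimate E[X | evidence] by drawing n
# particles from a prior and weighting each by the evidence likelihood.
set.seed(42)
n <- 500                                    # bnlearn's default particle count
particles <- rnorm(n, mean = 0, sd = 2)     # draws from a N(0, 4) prior
evidence  <- 1.5                            # an observed child value
weights   <- dnorm(evidence, mean = particles, sd = 1)  # likelihood weights
posterior.mean <- sum(weights * particles) / sum(weights)
```

More particles reduce the Monte Carlo error of such weighted estimates, at the cost of more computation per prediction.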
Note that if bn is a Bayesian network classifier, pred and pred-lw both give exact posterior predictions computed using the closed-form formulas for naive Bayes and TAN.
Both plot methods accept any combination of objects of class bn.kcv or bn.kcv.list (the first as the x argument, the remaining as the ... argument) and plot the respective expected loss values side by side. For a bn.kcv object, this means a single point; for a bn.kcv.list object this means a boxplot.
Marco Scutari
Koller D, Friedman N (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press.
bn.boot, rbn, bn.kcv class.
bn.cv(learning.test, 'hc', loss = "pred", loss.args = list(target = "F"))
folds = list(1:2000, 2001:3000, 3001:5000)
bn.cv(learning.test, 'hc', loss = "logl", method = "custom-folds",
folds = folds)
xval = bn.cv(gaussian.test, 'mmhc', method = "holdout",
k = 5, m = 50, runs = 2)
xval
loss(xval)
## Not run:
# comparing algorithms with multiple runs of cross-validation.
gaussian.subset = gaussian.test[1:50, ]
cv.gs = bn.cv(gaussian.subset, 'gs', runs = 10)
cv.iamb = bn.cv(gaussian.subset, 'iamb', runs = 10)
cv.inter = bn.cv(gaussian.subset, 'inter.iamb', runs = 10)
plot(cv.gs, cv.iamb, cv.inter,
xlab = c("Grow-Shrink", "IAMB", "Inter-IAMB"), connect = TRUE)
# use custom folds.
folds = split(sample(nrow(gaussian.subset)), seq(5))
bn.cv(gaussian.subset, "hc", method = "custom-folds", folds = folds)
# multiple runs, with custom folds.
folds = replicate(5, split(sample(nrow(gaussian.subset)), seq(5)),
simplify = FALSE)
bn.cv(gaussian.subset, "hc", method = "custom-folds", folds = folds)
## End(Not run)