cnSearchOrder: Network Search for Given Node Order
In sdnet: Soft-Discretization-Based Bayesian Network Inference


The function implements an MLE-based algorithm that searches for optimal networks consistent with a given node order. It returns a list of networks, with complexities up to a specified maximum, that best fit the data.

cnSearchOrder(data, pert=NULL, 
	maxParentSet=0, parentSizes=NULL, maxComplexity=0, 
	nodeOrder=NULL, 
	nodeCats=NULL, parentsPool=NULL, fixedParents=NULL, edgeProb=NULL,
	echo=FALSE, softmode=FALSE, dagsOnly = FALSE, classes = NULL, clsdist=1)

`data`	a `matrix` in row-nodes format or a `data.frame` in column-nodes format
`pert`	a binary matrix with the dimensions of `data`. A value of 1 marks the node in the corresponding sample as perturbed
`maxParentSet`	an `integer`, maximal number of parents for all nodes
`parentSizes`	an `integer` vector, maximal number of parents per node
`maxComplexity`	an `integer`, the maximal network complexity for the search
`nodeOrder`	a `vector` specifying a node order; the search is among the networks consistent with this topological order
`nodeCats`	a `list` of node categories
`parentsPool`	a `list` of parent sets to choose from
`fixedParents`	a `list` of parent sets that are always included
`edgeProb`	a square `matrix`, with as many rows and columns as there are nodes, specifying prior edge probabilities
`echo`	a `logical`, turns on/off some progress information
`softmode`	a `logical`, turns on/off the soft quantization mode
`dagsOnly`	a `logical`, selects between catNetwork and DAG only search
`classes`	a binary matrix with the dimensions of `data` that assigns a class to each node-observation
`clsdist`	class separation distance function; currently 1 (‘chisq’) and 2 (‘kl’) are supported

The data can be a matrix of character categories with rows specifying the node-variables and columns assumed to be independent samples from an unknown network, or a data.frame with columns specifying the nodes and rows being the samples.
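The two layouts carry the same information and differ only by a transpose. A minimal base-R sketch with made-up node names and categories (nothing here calls the package; `mat` and `df` are illustrative names):

```r
# Hypothetical toy data: 3 nodes, 4 samples, character categories.
# Matrix layout: rows are nodes, columns are samples.
mat <- matrix(c("a",  "b",  "a",  "b",
                "x",  "x",  "y",  "y",
                "lo", "hi", "hi", "lo"),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("N1", "N2", "N3"), NULL))

# data.frame layout: columns are nodes, rows are samples.
df <- as.data.frame(t(mat), stringsAsFactors = FALSE)

dim(mat)  # nodes x samples
dim(df)   # samples x nodes
```

Either object describes the same four samples of the three nodes.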

The number of categories of each node is obtained from the sample. If nodeCats is given, it is used as the list of categories instead. In that case, nodeCats should include all node categories present in the data.

The function returns a list of networks, one for each admissible complexity within the specified range. The networks in the list are the Maximum Likelihood estimates in the class of networks having the given topological order of the nodes and the given complexity. When maxComplexity is not given, and is thus zero, its value is reset to the maximum possible complexity for the given parent set size. When nodeOrder is not given or is NULL, the order of the nodes in the data is used: 1, 2, and so on.

The parameters parentsPool and fixedParents allow the user to put exclusion/inclusion constraints on the possible parents of each node. They should be given as lists of index vectors, one for each node.
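As an illustration of the "list of index vectors, one for each node" shape, the base-R sketch below builds both lists for a hypothetical 4-node problem. Two points are assumptions and not taken from this page: that a `NULL` entry leaves a node unconstrained, and that a fixed parent should also appear in that node's pool when a pool is supplied.

```r
# Hypothetical constraints for 4 nodes; indices are node positions in the data.
numnodes <- 4
parentsPool  <- vector("list", numnodes)  # all entries start as NULL
fixedParents <- vector("list", numnodes)

parentsPool[[3]]  <- c(1, 2)     # node 3 may take parents only from nodes 1 and 2
parentsPool[[4]]  <- c(1, 2, 3)  # node 4 may take parents from nodes 1, 2 and 3
fixedParents[[4]] <- 3           # node 3 is always a parent of node 4

# Sanity check (an assumption, see above): fixed parents lie inside the pool
# wherever a pool is given.
ok <- all(vapply(seq_len(numnodes), function(i) {
  is.null(parentsPool[[i]]) || all(fixedParents[[i]] %in% parentsPool[[i]])
}, logical(1)))
ok
```

Lists built this way have one entry per node and can be passed as the `parentsPool` and `fixedParents` arguments.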

The rows in edgeProb correspond to the nodes in the sample. The [i,j]-th element of edgeProb specifies the prior probability that the j-th node is a parent of the i-th one. In calculating the prior probability of a network, all edges are assumed to be independent Bernoulli random variables. The elements of edgeProb are clipped to the range [0,1], so that a probability of zero effectively excludes the corresponding edge, while a probability of one forces it.
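The independent-Bernoulli prior described above can be sketched in base R as follows. This is an illustration of the formula only, not the package's internal code; the helper `log_edge_prior` and the adjacency matrices are hypothetical names introduced here.

```r
# Hypothetical 3-node example; rows index the child, columns the parent.
edgeProb <- matrix(0.5, nrow = 3, ncol = 3)
edgeProb[2, 1] <- 1.2   # out of range, clipped to 1: edge 1 -> 2 is forced
edgeProb[3, 1] <- -0.1  # out of range, clipped to 0: edge 1 -> 3 is excluded

# Clip all entries to [0, 1].
edgeProb <- pmin(pmax(edgeProb, 0), 1)

# Log prior of a network whose edges are independent Bernoulli variables.
# adj[i, j] == 1 means node j is a parent of node i.
log_edge_prior <- function(adj, prob) {
  off <- row(adj) != col(adj)  # ignore the diagonal (no self-loops)
  terms <- ifelse(adj == 1, log(prob), log(1 - prob))
  sum(terms[off])
}

adj_ok <- matrix(0, 3, 3)
adj_ok[2, 1] <- 1              # uses the forced edge
adj_ok[3, 2] <- 1
adj_bad <- adj_ok
adj_bad[3, 1] <- 1             # uses the excluded edge

log_edge_prior(adj_ok, edgeProb)   # finite
log_edge_prior(adj_bad, edgeProb)  # -Inf, the network has zero prior probability
```

A network that contains an edge with prior probability zero, or omits one with prior probability one, receives a log prior of `-Inf`, which is why such entries act as hard constraints.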

A catNetworkEvaluate object

N. Balov

  cnet <- cnRandomCatnet(numnodes=12, maxpars=3, numcats=2)
  psamples <- cnSamples(object=cnet, numsamples=100)
  nodeOrder <- sample(1:12)
  nets <- cnSearchOrder(data=psamples, pert=NULL, 
		maxParentSet=2, maxComplexity=36, nodeOrder=nodeOrder)
  ## next we find the network with the same complexity as the original one
  cc <- cnComplexity(object=cnet)
  cnFind(object=nets, complx=cc)