tcherry_step: Determines a third order t-cherry tree from data
In nvihrs14/tcherry: Learning the structure of tcherry trees


Determines the structure of a third order t-cherry tree from data based on a greedy stepwise approach.

tcherry_step(data, ...)

`data`	The data the tree structure should be based on.
`...`	Additional arguments passed to `MI2` and `MI3`.

This function is mainly kept for historical purposes. It is recommended to use k_tcherry_step with k = 3 instead, because that function runs faster.

The algorithm for constructing the third order t-cherry tree from data is based on an attempt to minimize the Kullback-Leibler divergence. The first cherry is chosen as the triplet with the highest mutual information. This is the preliminary third order t-cherry tree. Then each possible new cherry is added in turn to this tree and the weight

∑ MI3(clique) - ∑ MI2(separator)

is calculated. The first sum is over the cliques and the second over the separators of the junction tree of the preliminary third order t-cherry tree. The candidate with the highest weight is chosen as the new preliminary third order t-cherry tree, and the procedure is repeated until all variables have been added.
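
The first-cherry choice and the weight described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helpers `entropy_sketch`, `mi_sketch`, `first_cherry_sketch` and `weight_sketch` are hypothetical names introduced here, the mutual information is assumed to be the empirical multi-information (sum of marginal entropies minus joint entropy) without the smoothing that `MI2` and `MI3` accept, and the toy data are made up for illustration.

```r
# Hypothetical helper: empirical entropy (natural log) of the joint
# distribution of the columns named in `vars`.
entropy_sketch <- function(data, vars) {
  counts <- table(data[, vars, drop = FALSE])
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Hypothetical stand-in for MI2/MI3 (assumption: multi-information,
# no smoothing): sum of marginal entropies minus the joint entropy.
mi_sketch <- function(data, vars) {
  marginals <- sum(vapply(vars, function(v) entropy_sketch(data, v),
                          numeric(1)))
  marginals - entropy_sketch(data, vars)
}

# First cherry: the triplet of variables with the highest mutual information.
first_cherry_sketch <- function(data) {
  triplets <- combn(names(data), 3, simplify = FALSE)
  scores <- vapply(triplets, function(tr) mi_sketch(data, tr), numeric(1))
  triplets[[which.max(scores)]]
}

# Weight: sum over cliques minus sum over separators.
weight_sketch <- function(data, cliques, separators) {
  sum(vapply(cliques, function(cl) mi_sketch(data, cl), numeric(1))) -
    sum(vapply(separators, function(s) mi_sketch(data, s), numeric(1)))
}

# Toy data (made up): a, b and c are identical, d is independent of them.
toy <- data.frame(a = rep(c("1", "2"), each = 4),
                  b = rep(c("1", "2"), each = 4),
                  c = rep(c("1", "2"), each = 4),
                  d = rep(c("1", "2"), times = 4))

first <- first_cherry_sketch(toy)

# Weight of a candidate tree with cliques {a, b, c} and {b, c, d},
# which share the separator {b, c}.
w <- weight_sketch(toy,
                   cliques = list(c("a", "b", "c"), c("b", "c", "d")),
                   separators = list(c("b", "c")))
```

In a greedy step, a weight like `w` would be computed for every candidate cherry that attaches one new variable to an existing pair, and the candidate with the largest weight would be kept.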

A list containing the following components:

`adj_matrix`	The adjacency matrix for the third order t-cherry tree.
`weight`	The weight of the final third order t-cherry tree.
`cliques`	A list containing the cliques (cherries) of the third order t-cherry tree.
`separators`	A list containing the separators of a junction tree for the third order t-cherry tree.

Katrine Kirkeby, enir_tak@hotmail.com

Maria Knudsen, mariaknudsen@hotmail.dk

Ninna Vihrs, ninnavihrs@hotmail.dk

k_tcherry_step for a better implementation, MI2 and MI3 for mutual information of two and three variables respectively.

library(tcherry)

set.seed(43)
var1 <- c(sample(c(1, 2), 100, replace = TRUE))
var2 <- var1 + c(sample(c(1, 2), 100, replace = TRUE))
var3 <- var1 + c(sample(c(0, 1), 100, replace = TRUE,
                        prob = c(0.9, 0.1)))
var4 <- c(sample(c(1, 2), 100, replace = TRUE))
var5 <- var2 + var3
var6 <- var1 - var4 + c(sample(c(1, 2), 100, replace = TRUE))
var7 <- c(sample(c(1, 2), 100, replace = TRUE))

data <- data.frame("var1" = as.character(var1),
                   "var2" = as.character(var2),
                   "var3" = as.character(var3),
                   "var4" = as.character(var4),
                   "var5" = as.character(var5),
                   "var6" = as.character(var6),
                   "var7" = as.character(var7))

# smooth used in both MI2 and MI3
(tch <- tcherry_step(data, smooth = 0.1))