SEMtree | R Documentation |
Implements four tree-based structure learning methods, using graph-based and data-driven algorithms.
SEMtree(
graph,
data,
seed,
type = "ST",
eweight = NULL,
alpha = 0.05,
verbose = FALSE,
...
)
graph |
An igraph object. |
data |
A matrix or data.frame. Rows correspond to subjects, and columns to graph nodes (variables). |
seed |
A vector of seed nodes. |
type |
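Tree-based structure learning method. Four algorithms are available: "ST" (Steiner Tree, the default), "MST" (Minimum Spanning Tree), "CAT" (Causal Additive Trees), and "CPDAG" (Completed Partially Directed Acyclic Graph). |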
eweight |
Edge weight type for igraph object can be externally derived
using
|
alpha |
Threshold for rejecting the hypothesis that a pair of nodes is independent in the
"CPDAG" algorithm. The latter implements a natural v-structure identification
procedure by thresholding the pairwise sample correlations over all adjacent
pairs of edges with some appropriate threshold. By default, alpha = 0.05.
|
verbose |
If TRUE, it shows the output tree (not recommended for large graphs). |
... |
Currently ignored. |
A tree is a connected acyclic graph with p vertices and p-1 edges. The graph method
refers to the Steiner Tree (ST), a tree from an undirected graph that connects the "seed"
nodes with additional nodes in the "most compact" way possible. The data-driven methods
are fast and scalable procedures based on the Chu-Liu-Edmonds algorithm (CLE), which
recover a tree from a full graph. The first method, called Causal Additive Trees (CAT),
uses pairwise mutual weights as input for the CLE algorithm to recover a directed tree
(an "arborescence"). The second one applies the CLE algorithm for skeleton recovery and
extends the skeleton to a tree (a "polytree") represented by a Completed Partially
Directed Acyclic Graph (CPDAG). Finally, the Minimum Spanning Tree (MST), which connects
the nodes of an undirected graph with minimal total edge weight, can be identified.
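As a rough illustration of recovering a tree from a full graph, the sketch below builds a complete undirected graph from simulated data and extracts its minimum spanning tree with igraph's mst(). The simulated chain, the variable names, and the 1 - |r| edge weight are assumptions made for this example only. They are not the weights SEMtree() uses internally, and the undirected mst() call merely stands in for the directed CLE step.

```r
library(igraph)

# Simulated chain x1 -> x2 -> x3 -> x4 (illustrative data, not alsData)
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n)
x3 <- 0.8 * x2 + rnorm(n)
x4 <- 0.8 * x3 + rnorm(n)
X  <- cbind(x1, x2, x3, x4)

# Assumed edge weight: 1 - |correlation|, so strong associations are "short"
W <- 1 - abs(cor(X))
diag(W) <- 0  # zero entries are read as "no edge", so no self-loops

# Full undirected graph on the p = 4 variables, built from the upper triangle
full <- graph_from_adjacency_matrix(W, mode = "upper", weighted = TRUE)

# Minimum spanning tree; mst() picks up the "weight" edge attribute
tree <- mst(full)

# A tree on p vertices has exactly p - 1 edges
c(vertices = vcount(tree), edges = ecount(tree))
as_edgelist(tree)
```

The result keeps all four variables and three of the six candidate edges, which is the p vertices and p-1 edges property stated above.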
Note that if the input graph is directed, the undirected ST and MST trees are
converted into directed trees using the orientEdges function.
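The v-structure step controlled by alpha can be pictured with a small base R sketch. For two skeleton edges x - z and z - y that share the node z, the end nodes x and y are marginally independent when z is a collider (x -> z <- y) and dependent otherwise. The helper is_v_structure() below is hypothetical, and comparing the cor.test() p-value with alpha is an assumption about how the threshold is applied, not necessarily what SEMtree() does internally.

```r
# Hypothetical helper: TRUE when independence of the two end nodes x and y
# cannot be rejected at level alpha, so the shared neighbour is treated
# as a collider (x -> z <- y).
is_v_structure <- function(x, y, alpha = 0.05) {
  p <- cor.test(x, y)$p.value  # test of zero Pearson correlation
  p > alpha
}

set.seed(1)
n <- 500

# Chain x -> z -> y: x and y are correlated through z
x <- rnorm(n)
z <- 0.8 * x + rnorm(n)
y <- 0.8 * z + rnorm(n)
chain <- is_v_structure(x, y)

# Collider u -> w <- v: u and v are generated independently
u <- rnorm(n)
v <- rnorm(n)
w <- u + v + rnorm(n)
collider <- is_v_structure(u, v)

c(chain = chain, collider = collider)
```

For the chain, the end nodes are strongly correlated, so the helper should not flag a v-structure. For the collider, independence is falsely rejected with probability of about alpha, so a v-structure is flagged in most samples but not in all of them.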
An igraph object. If type = "ST", seed nodes are colored in "aquamarine" and
connectors in "white". If type = "ST" or type = "MST", edges are colored in "green"
if not present in the input graph. If type = "CPDAG", bidirected edges are colored
in "black" (if the algorithm is not able to establish the direction of the
relationship between x and y).
Mario Grassi mario.grassi@unipv.it
Grassi M, Tarantino B (2023). SEMtree: tree-based structure learning methods with structural equation models. Bioinformatics, 39 (6), 4829–4830 <https://doi.org/10.1093/bioinformatics/btad377>
Kou, L., Markowsky, G., Berman, L. (1981). A fast algorithm for Steiner trees. Acta Informatica 15, 141–145. <https://doi.org/10.1007/BF00288961>
Prim, R.C. (1957). Shortest connection networks and some generalizations. Bell System Technical Journal, 36, 1389–1401.
Chow, C.K. and Liu, C. (1968). Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 14(3):462–467.
Meek, C. (1995). Causal inference and causal explanation with background knowledge. In Proceedings of the Eleventh conference on Uncertainty in artificial intelligence, 403–410.
Jakobsen, M., Shah, R., Bühlmann, P., Peters, J. (2022). Structure Learning for Directed Trees. arXiv: <https://doi.org/10.48550/arxiv.2108.08871>
Lou, X., Hu, Y., Li, X. (2022). Linear Polytree Structural Equation Models: Structural Learning and Inverse Correlation Estimation. arXiv: <https://doi.org/10.48550/arxiv.2107.10955>
library(SEMgraph)
library(igraph)

# Nonparanormal (npn) transformation
als.npn <- transformData(alsData$exprs)$data

# graph-based trees
graph <- alsData$graph
seed <- V(graph)$name[sample(1:vcount(graph), 10)]  # 10 random seed nodes
tree1 <- SEMtree(graph, als.npn, seed=seed, type="ST", verbose=TRUE)
tree2 <- SEMtree(graph, als.npn, seed=NULL, type="MST", verbose=TRUE)

# data-driven trees (data columns that are also graph nodes)
nodes <- colnames(als.npn)[colnames(als.npn) %in% V(graph)$name]
tree3 <- SEMtree(NULL, als.npn, seed=nodes, type="CAT", verbose=TRUE)
tree4 <- SEMtree(NULL, als.npn, seed=nodes, type="CPDAG", alpha=0.05, verbose=TRUE)