https://cran.r-project.org/web/packages/erhcv/index.html
Assesses the statistical significance of clusters for a given dataset through bootstrapping and hypothesis testing of a given matrix of empirical Spearman's rho, based on the technique of S. Gaiser et al. (2010).
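As a rough, minimal sketch of the underlying idea (this is not erhcv's internal implementation, and all names below are illustrative), the empirical Spearman's rho matrix of a dataset can be computed with `cor()` and its sampling variability assessed by bootstrapping the observations:

    ## Toy data: 100 observations of 5 variables (illustrative only)
    set.seed(1)
    X <- matrix(rnorm(500), ncol = 5)

    ## Empirical Spearman's rho matrix
    rhoHat <- cor(X, method = "spearman")

    ## Bootstrap the pairwise rho between variables 1 and 2
    B <- 200
    boot <- replicate(B, {
      idx <- sample(nrow(X), replace = TRUE)   # resample observations with replacement
      cor(X[idx, ], method = "spearman")[1, 2]
    })
    quantile(boot, c(0.025, 0.975))            # bootstrap interval for rho(1, 2)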
In this package, a tree structure is represented by nested lists, which is a natural encoding. To illustrate the concept, consider the following example, where the tree structure is constructed as nested lists.
    treeStructure <- list(list(list(7, 8), 6, 5, list(9, 10)), 2, 1, list(3, 4))
Note that one can transform this tree into a data.tree object using tree2plot.
    suppressMessages({
      library(erhcv)
      library(nCopula)
    })
    tree2plot(treeStructure, structure = TRUE)  # data.tree object
As a general guideline, a tree structure must be a list. Nodes are distinguished from leaves by their type: nodes are lists, while leaves are integers. The construction is analogous to the structure construction of the nCopula package.
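As a minimal illustration of this convention (the tiny tree below is a made-up example, not taken from the package documentation), a root with one node grouping leaves 1 and 2 next to a single leaf 3 is written as a list containing a list and an integer:

    ## Toy tree: root -> (node with leaves 1 and 2, leaf 3)
    toyTree <- list(list(1, 2), 3)

    is.list(toyTree[[1]])      # TRUE: the first child is a node
    is.numeric(toyTree[[2]])   # TRUE: the second child is a leaf
    # erhcv::tree2plot(toyTree, structure = TRUE)  # would display it as a data.tree object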
| Function | Description |
|----------|-------------|
| VerifyTree | Main function of the package, used to clean raw hclust clustering |
| hclust2tree | Transforms a hclust object into a tree structure (under the convention of this package) |
| tree2plot | Illustration of a tree (or data.tree structure) |
For a proper demonstration of the package, we use nCopula to sample hierarchical data.
    # install.packages("erhcv"); install.packages("nCopula")

    ## Build structure
    structure <- GEO(0.5, 1:2, list(GAMMA(1/2, 3:4, NULL),
                                    GEO(0.3, 5:6, list(GAMMA(1/3, 7:8, NULL),
                                                       GAMMA(1/3, 9:10, NULL)))))

    ## Sample from the structure
    set.seed(123)
    U.. <- rCompCop(1000, structure)

    ## Compute the Spearman correlation matrix
    Spearman <- cor(U.., method = "sp")

    ## Cluster the Spearman matrix
    distance <- dist(Spearman, method = "maximum")
    clustering <- hclust(distance, method = "average")

    ## Transform the clustering into nested lists
    tree <- erhcv::hclust2tree(clustering)
    erhcv::tree2plot(tree, structure = TRUE)  # data.tree object
We quickly see that the obtained structure is far from the original one. We then use tools from erhcv to eliminate unnecessary nodes, based on a (subjective) simplification level alpha.

Here, we use VerifyTree to prune nodes from the clustering obtained earlier.
    ## Clean the tree
    alpha <- 1  # Severe simplification parameter
    cleanedTree <- erhcv::VerifyTree(U.., alpha = alpha,
                                     distance.method = "maximum",
                                     hclust.method = "average")$Tree

    ## Visualize output
    erhcv::tree2plot(cleanedTree, structure = TRUE)
Below is the associated tree structure.
    erhcv::tree2plot(cleanedTree, structure = FALSE)
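Since the simplification level alpha is subjective, it can be worth rerunning the cleaning step with a different value and comparing the resulting trees; the alternative value below is purely illustrative.

    ## Sketch: repeat the cleaning with another (illustrative) value of alpha
    alphaAlt <- 0.05
    cleanedTreeAlt <- erhcv::VerifyTree(U.., alpha = alphaAlt,
                                        distance.method = "maximum",
                                        hclust.method = "average")$Tree
    erhcv::tree2plot(cleanedTreeAlt, structure = TRUE)

    ## Inspect the nested-list representation directly
    str(cleanedTreeAlt)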