Build a hierarchical tree based on hierarchical clustering of the variables.
x: a matrix or list of matrices for multiple data sets. The matrix or
matrices have to be of type numeric and are required to have column names
(variable names). The rows and the columns represent the observations and
the variables, respectively. Either the argument x or the argument d has
to be specified.
d: a dissimilarity matrix. This can be either a symmetric matrix of
type numeric with column and row names or an object of class dist.
block: a data frame or matrix specifying the second level of the hierarchical tree. The first column is required to contain the variable names and to be of type character. The second column is required to contain the group assignment and to be a vector of type character or numeric. If not supplied, the second level is built based on the data.
method: the agglomeration method to be used for the hierarchical clustering.
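use: the method to be used for computing covariances in the presence
of missing values. This is important for multiple data sets which do not measure
exactly the same variables. If data is specified using the argument x, the
dissimilarity matrix for the hierarchical clustering is calculated from the
empirical correlation. See the 'Details' section.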
sort.parallel: a logical indicating whether the values are sorted with respect to the size of the block. This can reduce the run time for parallel computation.
parallel: type of parallel computation to be used. See the 'Details' section.
ncpus: number of processes to be run in parallel.
cl: an optional parallel or snow cluster used if parallel = "snow".
The hierarchical tree is built by hierarchical clustering of the variables.
Either the data (using the argument x) or a dissimilarity matrix
(using the argument d) can be specified.
If one or multiple data sets are defined using the argument x,
the dissimilarity matrix is calculated as one minus the squared empirical
correlation. In the case of multiple data sets, a single hierarchical
tree is jointly estimated using hierarchical clustering. The argument
use is important because missing values are introduced if the
data sets do not measure exactly the same variables. The argument
use determines how the empirical correlation is calculated.
Alternatively, it is possible to specify a user-defined dissimilarity
matrix using the argument d.
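The dissimilarity described above can be sketched in base R. This is only an illustration of "one minus squared empirical correlation" followed by hierarchical clustering, not the package's internal code: the toy data, the value "pairwise.complete.obs" for use, and the "average" linkage are assumptions chosen for the example.

```r
# Toy data (hypothetical): 50 observations of 6 named variables.
set.seed(1)
x <- matrix(rnorm(50 * 6), nrow = 50,
            dimnames = list(NULL, paste0("Var", 1:6)))
x[1:5, 2] <- NA  # pretend Var2 was not measured for some observations

# Empirical correlation between the variables (columns).
# `use` controls how missing values are handled by cor().
r <- cor(x, use = "pairwise.complete.obs")

# Dissimilarity: one minus squared empirical correlation.
d <- as.dist(1 - r^2)

# Hierarchical clustering of the variables; the linkage is an
# illustrative choice, not necessarily what the package uses.
tree <- hclust(d, method = "average")
dendr <- as.dendrogram(tree)
```

An object such as d could also be passed to the function directly as a user-defined dissimilarity, since it is of class dist and carries the variable names as labels.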
If the arguments x and block are supplied, i.e. block defines the second
level of the hierarchical tree, the function can be run in parallel across
the different blocks by specifying the arguments parallel and
ncpus. There is an optional argument cl if
parallel = "snow". There are three possibilities to set the
argument parallel:
parallel = "no" for serial evaluation,
parallel = "multicore" for parallel evaluation
using forking, and
parallel = "snow" for parallel evaluation
using a parallel socket cluster. It is recommended to select
RNGkind("L'Ecuyer-CMRG") and set a seed to ensure that
the parallel computing of the package
hierinf is reproducible.
This way each processor gets a different substream of the pseudo random
number generator stream, which makes the results reproducible if the arguments
(such as ncpus) remain unchanged. See the vignette
or the reference for more details.
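A minimal sketch of the recommended setup for a parallel run follows. The toy data, the seed value, and the choice of two processes are assumptions made for illustration. The call is guarded so that it is only attempted if the package hierinf is installed.

```r
# Use the L'Ecuyer-CMRG generator and fix a seed before any parallel call.
RNGkind("L'Ecuyer-CMRG")
set.seed(3)

# Toy data (hypothetical): 100 observations of 20 variables in two blocks.
x <- matrix(rnorm(100 * 20), nrow = 100,
            dimnames = list(NULL, paste0("Var", 1:20)))
block <- data.frame(var.name = colnames(x),
                    block = rep(c(1, 2), each = 10),
                    stringsAsFactors = FALSE)

# Forking is not available on Windows, so fall back to a socket cluster there.
par_type <- if (.Platform$OS.type == "windows") "snow" else "multicore"

# Cluster the variables within each block, using two processes.
if (requireNamespace("hierinf", quietly = TRUE)) {
  dendr <- hierinf::cluster_var(x = x, block = block,
                                parallel = par_type, ncpus = 2)
}
```

Keeping the seed and ncpus fixed between runs is what makes repeated runs comparable.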
The returned value is an object of class "hierD",
consisting of two elements, the argument
"block" and the hierarchical tree "res.tree".
The element "block" defines the second level of the hierarchical
tree if supplied.
The element "res.tree" contains a dendrogram
for each of the blocks defined in the argument block.
If the argument block is
NULL (i.e. not supplied),
the element contains only one dendrogram.
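The returned object can be inspected with ordinary list tools. The sketch below assumes only the element names "block" and "res.tree" stated above. The toy data are hypothetical, and the calls are guarded so that they run only if hierinf is installed.

```r
# Toy data (hypothetical): 100 observations of 10 variables in two blocks.
set.seed(2)
x <- matrix(rnorm(100 * 10), nrow = 100,
            dimnames = list(NULL, paste0("Var", 1:10)))
block <- data.frame(var.name = colnames(x),
                    block = rep(c(1, 2), each = 5),
                    stringsAsFactors = FALSE)

if (requireNamespace("hierinf", quietly = TRUE)) {
  # With a block argument: one tree per block.
  dendr <- hierinf::cluster_var(x = x, block = block)
  print(class(dendr))
  print(names(dendr))
  print(length(dendr$res.tree))

  # Without a block argument: a single tree over all variables.
  dendr_all <- hierinf::cluster_var(x = x)
  print(length(dendr_all$res.tree))
}
```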
Renaux, C. et al. (2018), Hierarchical inference for genome-wide association studies: a view on methodology with software. (arXiv:1805.02988)
library(MASS)
library(hierinf)

x <- mvrnorm(200, mu = rep(0, 500), Sigma = diag(500))
colnames(x) <- paste0("Var", 1:500)
dendr1 <- cluster_var(x = x)

# The column names of the data frame block are optional.
block <- data.frame("var.name" = paste0("Var", 1:500),
                    "block" = rep(c(1, 2), each = 250),
                    stringsAsFactors = FALSE)
dendr2 <- cluster_var(x = x, block = block)

# The matrix x is first transposed because the function dist calculates
# distances between the rows.
d <- dist(t(x))
dendr3 <- cluster_var(d = d, method = "single")