The 'covariance explained' by each predictor is the reduction in covariance between each pair of outcomes due to splitting on each predictor over all trees ($covex).
To aid in the interpretability of the covariance explained matrix, this function clusters the rows (pairs of outcomes) and the columns (predictors) of object$covex so that groups of predictors that explain similar pairs of covariances are closer together.
This function can also be used to cluster the relative influence matrix. In this case, the rows (usually outcomes) and columns (usually predictors) with similar values will
be clustered together.
mvtb.cluster(x, clust.method = "complete", dist.method = "euclidean",
             plot = FALSE, ...)
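x: Any matrix, such as object$covex or a relative influence matrix.
clust.method: Clustering method for rows and columns. This should be (an unambiguous abbreviation of) one of the methods accepted by hclust, e.g. "ward.D" or "complete" (see ?hclust).
dist.method: Method for computing the distance between two lower triangular covariance matrices. This must be one of the methods accepted by dist, e.g. "euclidean" or "manhattan" (see ?dist).
plot: If TRUE, produces a heatmap of the covariance explained matrix. See mvtb.heat.
...: Arguments passed to mvtb.heat.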
The covariance explained by each predictor is only unambiguous if the predictors are uncorrelated and interaction.depth = 1. If predictors are not independent, the decomposition of covariance explained is only approximate (like the decomposition of R^2 by each predictor in a linear model). If interaction.depth > 1, the following heuristic is used: the covariance explained by the tree is assigned to the predictor with the largest influence in each tree.
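The heuristic for interaction.depth > 1 can be illustrated with a small base R sketch. It is not the package's code: the per-tree influences, the per-tree covariance reductions, and every object name below are hypothetical values made up for illustration.

```r
# Hypothetical influence of each predictor within each of 3 trees.
tree_influence <- matrix(c(5, 1, 0,
                           2, 6, 1,
                           4, 0, 3),
                         nrow = 3, byrow = TRUE,
                         dimnames = list(paste0("tree", 1:3),
                                         c("x1", "x2", "x3")))

# Hypothetical covariance explained by each tree for one pair of outcomes.
tree_cov_reduction <- c(0.30, 0.20, 0.10)

# Assign each tree's covariance explained to its most influential predictor.
winner <- colnames(tree_influence)[max.col(tree_influence, ties.method = "first")]
winner <- factor(winner, levels = colnames(tree_influence))

# Sum the assigned amounts per predictor; predictors that never win get 0.
covex_by_pred <- tapply(tree_cov_reduction, winner, sum)
covex_by_pred[is.na(covex_by_pred)] <- 0
print(covex_by_pred)
```

In this toy example x3 has nonzero influence in two trees but receives no covariance explained, which shows why the decomposition is only a heuristic when trees contain interactions.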
Note that different distance measures (e.g. "manhattan", "euclidean") provide different ways to measure (dis)similarities between the covariance explained patterns for each predictor. See ?dist for further details.
After the distances have been computed, hclust is used to form clusters.
Different clustering methods (e.g. "ward.D", "complete") generally group rows and columns differently (see ?hclust for further details).
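The two steps described above (distances, then hierarchical clustering) can be sketched in base R. This is a minimal illustration rather than the package's implementation: the matrix `m`, its dimnames, and the helper `cluster_matrix` are hypothetical, and only `dist` and `hclust` from the stats package are used.

```r
# Hypothetical stand-in for a covariance explained matrix:
# rows are pairs of outcomes, columns are predictors.
m <- matrix(c(0.9, 0.1, 0.8, 0.0,
              0.1, 0.7, 0.2, 0.6,
              0.8, 0.0, 0.9, 0.1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("y1-y1", "y1-y2", "y2-y2"),
                            c("x1", "x2", "x3", "x4")))

# Hypothetical helper: cluster rows and columns separately,
# then reorder the matrix by the two dendrogram orderings.
cluster_matrix <- function(x, clust.method = "complete",
                           dist.method = "euclidean") {
  row_order <- hclust(dist(x, method = dist.method),
                      method = clust.method)$order
  col_order <- hclust(dist(t(x), method = dist.method),
                      method = clust.method)$order
  x[row_order, col_order, drop = FALSE]
}

clustered <- cluster_matrix(m)
print(clustered)
```

In the reordered matrix, predictors with similar columns (here x1 and x3, and x2 and x4) end up next to each other.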
It is suggested to try different distance measures and clustering methods to obtain the most interpretable solution.
The defaults are "euclidean" distances and "complete" clustering.
Transposing the rows and columns may also lead to different results.
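One way to follow this advice is to loop over a few combinations of distance and clustering methods and compare the resulting orderings. The sketch below does this in base R on a hypothetical random matrix `m`; the object names are illustrative and the code only approximates what the function does internally.

```r
set.seed(1)

# Hypothetical matrix: 4 pairs of outcomes by 5 predictors.
m <- matrix(runif(20), nrow = 4,
            dimnames = list(paste0("pair", 1:4), paste0("x", 1:5)))

# Combinations of distance measure and clustering method to compare.
settings <- expand.grid(dist.method = c("euclidean", "manhattan"),
                        clust.method = c("complete", "ward.D"),
                        stringsAsFactors = FALSE)

# Column (predictor) ordering produced by each combination.
col_orders <- lapply(seq_len(nrow(settings)), function(i) {
  d <- dist(t(m), method = settings$dist.method[i])
  colnames(m)[hclust(d, method = settings$clust.method[i])$order]
})
names(col_orders) <- paste(settings$dist.method, settings$clust.method,
                           sep = "/")
print(col_orders)
```

With the package itself, the same comparison would vary the dist.method and clust.method arguments of mvtb.cluster.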
A simple heatmap of the clustered matrix can be obtained by setting plot=TRUE. Details of the plotting procedure are available via mvtb.heat.
covex values smaller than getOption("digits") are truncated to 0. Note that it is possible to obtain negative variance explained due to sampling fluctuation. These can be truncated or ignored.
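A possible way to apply both kinds of truncation by hand is sketched below. It assumes that the digits-based truncation behaves like base R's `zapsmall`, which rounds relative to the largest value using getOption("digits"); whether the package uses `zapsmall` internally is an assumption, and the values in `covex_vals` are hypothetical.

```r
# Hypothetical covariance explained values, including a negligible
# value and a small negative value.
covex_vals <- c(0.25, 3e-12, -0.004, 0.1)

# Zero out values that are negligible relative to the largest value
# (assumed to mirror the digits-based truncation described above).
zapped <- zapsmall(covex_vals)

# Optionally truncate negative values at 0.
nonneg <- pmax(zapped, 0)
print(nonneg)
```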
Returns the clustered covariance matrix, with re-ordered rows and columns.