| fit_ultrametric_target | R Documentation |
Find the ultrametric from a target equivalence class of hierarchies which minimizes weighted Euclidean or Manhattan dissimilarity to a given dissimilarity object.
ls_fit_ultrametric_target(x, y, weights = 1)
l1_fit_ultrametric_target(x, y, weights = 1)
x | a dissimilarity object inheriting from class
y | a target hierarchy.
weights | a numeric vector or matrix with non-negative weights for obtaining a weighted fit. If a matrix, its numbers of rows and columns must be the same as the number of objects in
The target equivalence class consists of all dendrograms for which the
corresponding n-trees are the same as the one corresponding to
y. That is, all splits are the same as for y, and
optimization is over the heights of the splits.
The criterion function to be optimized over all ultrametrics from the
equivalence class is \sum w_{ij} |x_{ij} - u_{ij}|^p, where
p = 2 in the Euclidean and p = 1 in the Manhattan case,
respectively.
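As an illustration of the criterion, here is a minimal base R sketch. The helper name fit_criterion is hypothetical and not part of the package; it assumes that x and u are given in "dist" layout (each unordered pair stored once) and that weights is a scalar or a vector in the same layout.

```r
## Hypothetical helper (not part of the package): evaluate
## sum w_ij |x_ij - u_ij|^p over all pairs i < j.
## Assumes x and u are "dist" objects or plain numeric vectors in "dist"
## order, and w is a scalar or a vector in the same order.
fit_criterion <- function(x, u, w = 1, p = 2) {
    sum(w * abs(as.vector(x) - as.vector(u))^p)
}

## Three objects on a line at 0, 1, 3: the pairwise distances in "dist"
## order are d_12 = 1, d_13 = 3, d_23 = 2.
x <- dist(c(0, 1, 3))
## A candidate ultrametric in the same order: u_12 = 1, u_13 = u_23 = 2.5.
u <- c(1, 2.5, 2.5)

fit_criterion(x, u, p = 2)  # Euclidean case
fit_criterion(x, u, p = 1)  # Manhattan case
```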
The optimum can be computed as follows. Suppose split s joins
object classes A and B. As the ultrametric
dissimilarities of all objects in A to all objects in B
must be the same value, say, u_{A,B} = u_s, the contribution
from the split to the criterion function is of the form
f_s(u_s) = \sum_{i \in A, j \in B} w_{ij} |x_{ij} - u_s|^p.
We need to minimize \sum_s f_s(u_s) under the constraint that
the u_s form a non-decreasing sequence. This is accomplished by
the Pool Adjacent Violators Algorithm (PAVA), with the
weighted mean (p = 2) or weighted median (p = 1) used to
solve the blockwise optimization problems.
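The pooling step can be sketched in base R for the least squares case (p = 2). This is an illustrative sketch, not the package's implementation. The function name pava_weighted_mean and its arguments are hypothetical. It assumes the splits are already arranged in the order in which the heights must be non-decreasing, that m[s] is the weighted mean of the x_{ij} over i in A and j in B for split s, and that w[s] is the corresponding total weight (assumed positive).

```r
## Hypothetical helper (not part of the package): weighted PAVA for a
## non-decreasing least squares fit.
##   m: unconstrained blockwise optima (weighted means), one per split
##   w: total weight of each split, assumed positive
pava_weighted_mean <- function(m, w) {
    n <- length(m)
    val <- numeric(n)   # value of each pooled block
    wt  <- numeric(n)   # total weight of each pooled block
    len <- integer(n)   # number of splits in each pooled block
    k <- 0L             # current number of pooled blocks
    for (i in seq_len(n)) {
        ## Start a new block holding split i only.
        k <- k + 1L
        val[k] <- m[i]
        wt[k] <- w[i]
        len[k] <- 1L
        ## Pool backwards while the last two blocks violate monotonicity.
        while (k > 1L && val[k - 1L] > val[k]) {
            tot <- wt[k - 1L] + wt[k]
            val[k - 1L] <- (wt[k - 1L] * val[k - 1L] + wt[k] * val[k]) / tot
            wt[k - 1L] <- tot
            len[k - 1L] <- len[k - 1L] + len[k]
            k <- k - 1L
        }
    }
    ## Expand the block values back to one height per split.
    rep(val[seq_len(k)], len[seq_len(k)])
}

## The second and third splits violate the ordering and are pooled.
pava_weighted_mean(c(1, 3, 2, 4), c(1, 1, 1, 1))
```

For p = 1 the weighted means of two blocks cannot simply be combined. The pooled value would have to be recomputed as a weighted median of the pooled data, which this sketch does not cover.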
An object of class "cl_ultrametric" containing the
optimal ultrametric distances.
ls_fit_ultrametric for finding the ultrametric
minimizing Euclidean dissimilarity (without fixing the splits).
data("Phonemes")
## Note that the Phonemes data set has the consonant misclassification
## probabilities, i.e., the similarities between the phonemes.
d <- as.dist(1 - Phonemes)
## Find the maximal dominated and minimal dominating ultrametrics by
## hclust() with single and complete linkage:
y1 <- hclust(d, "single")
y2 <- hclust(d, "complete")
## Note that these are quite different:
cl_dissimilarity(y1, y2, "gamma")
## Now find the L2 optimal members of the respective dendrogram
## equivalence classes.
u1 <- ls_fit_ultrametric_target(d, y1)
u2 <- ls_fit_ultrametric_target(d, y2)
## Compute the L2 optimal ultrametric approximation to d.
u <- ls_fit_ultrametric(d)
## And compare ...
cl_dissimilarity(cl_ensemble(Opt = u, Single = u1, Complete = u2), d)
## The solution obtained via complete linkage is quite close:
cl_agreement(u2, u, "cophenetic")