fit_ultrametric_target | R Documentation |
Find the ultrametric from a target equivalence class of hierarchies which minimizes weighted Euclidean or Manhattan dissimilarity to a given dissimilarity object.
ls_fit_ultrametric_target(x, y, weights = 1)
l1_fit_ultrametric_target(x, y, weights = 1)
x |
a dissimilarity object inheriting from class
|
y |
a target hierarchy. |
weights |
a numeric vector or matrix with non-negative weights
for obtaining a weighted fit. If a matrix, its numbers of rows and
columns must be the same as the number of objects in |
The target equivalence class consists of all dendrograms for which the
corresponding n
-trees are the same as the one corresponding to
y
. I.e., all splits are the same as for y
, and
optimization is over the height of the splits.
The criterion function to be optimized over all ultrametrics from the
equivalence class is \sum w_{ij} |x_{ij} - u_{ij}|^p
, where
p = 2
in the Euclidean and p = 1
in the Manhattan case,
respectively.
The optimum can be computed as follows. Suppose split s
joins
object classes A
and B
. As the ultrametric
dissimilarities of all objects in A
to all objects in B
must be the same value, say, u_{A,B} = u_s
, the contribution
from the split to the criterion function is of the form
f_s(u_s) = \sum_{i \in A, j \in B} w_{ij} |x_{ij} - u_s|^p
.
We need to minimize \sum_s f_s(u_s)
under the constraint that
the u_s
form a non-decreasing sequence, which is accomplished by
using the Pool Adjacent Violator Algorithm (PAVA) using the
weighted mean (p = 2
) or weighted median (p = 1
) for
solving the blockwise optimization problems.
An object of class "cl_ultrametric"
containing the
optimal ultrametric distances.
ls_fit_ultrametric
for finding the ultrametric
minimizing Euclidean dissimilarity (without fixing the splits).
data("Phonemes")
## Note that the Phonemes data set has the consonant misclassification
## probabilities, i.e., the similarities between the phonemes.
d <- as.dist(1 - Phonemes)
## Find the maximal dominated and miminal dominating ultrametrics by
## hclust() with single and complete linkage:
y1 <- hclust(d, "single")
y2 <- hclust(d, "complete")
## Note that these are quite different:
cl_dissimilarity(y1, y2, "gamma")
## Now find the L2 optimal members of the respective dendrogram
## equivalence classes.
u1 <- ls_fit_ultrametric_target(d, y1)
u2 <- ls_fit_ultrametric_target(d, y2)
## Compute the L2 optimal ultrametric approximation to d.
u <- ls_fit_ultrametric(d)
## And compare ...
cl_dissimilarity(cl_ensemble(Opt = u, Single = u1, Complete = u2), d)
## The solution obtained via complete linkage is quite close:
cl_agreement(u2, u, "cophenetic")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.