lattice: Cluster Lattices

latticeR Documentation

Cluster Lattices

Description

Computations on the lattice of all (hard) partitions, or the lattice of all dendrograms, or the meet semilattice of all hierarchies (n-trees) of/on a set of objects: meet, join, and comparisons.

Usage

cl_meet(x, y)
cl_join(x, y)

Arguments

x

an ensemble of partitions or dendrograms or hierarchies, or an R object representing a partition or dendrogram or hierarchy.

y

an R object representing a partition or dendrogram or hierarchy. Ignored if x is an ensemble.

Details

For a given finite set of objects X, the set H(X) of all (hard) partitions of X can be partially ordered by defining a partition P to be “finer” than a partition Q, i.e., P ≤ Q, if each class of P is contained in some class of Q. With this partial order, H(X) becomes a bounded lattice, with intersection and union of two elements given by their greatest lower bound (meet) and their least upper bound (join), respectively.

Specifically, the meet of two partitions computed by cl_meet is the partition obtained by intersecting the classes of the partitions; the classes of the join computed by cl_join are obtained by joining all elements in the same class in at least one of the partitions. Obviously, the least and greatest elements of the partition lattice are the partitions where each object is in a single class (sometimes referred to as the “splitter” partition) or in the same class (the “lumper” partition), respectively. Meet and join of an arbitrary number of partitions can be defined recursively.

In addition to computing the meet and join, the comparison operations corresponding to the above partial order as well as min, max, and range are available at least for R objects representing partitions inheriting from "cl_partition". The summary methods give the meet and join of the given partitions (for min and max), or a partition ensemble with the meet and join (for range).

If the partitions specified by x and y are soft partitions, the corresponding nearest hard partitions are used. Future versions may optionally provide suitable “soft” (fuzzy) extensions for computing meets and joins.

The set of all dendrograms on X can be ordered using pointwise inequality of the associated ultrametric dissimilarities: i.e., if D and E are the dendrograms with ultrametrics u and v, respectively, then D ≤ E if u_{ij} ≤ v_{ij} for all pairs (i, j) of objects. This again yields a lattice (of dendrograms). The join of D and E is the dendrogram with ultrametrics given by \max(u_{ij}, v_{ij}) (as this gives an ultrametric); the meet is the dendrogram with the maximal ultrametric dominated by \min(u_{ij}, v_{ij}), and can be obtained by applying single linkage hierarchical clustering to the minima.

The set of all hierarchies on X can be ordered by set-wise inclusion of the classes: i.e., if H and G are two hierarchies, then H ≤ G if all classes of H are also classes of G. This yields a meet semilattice, with meet given by the classes contained in both hierarchies. The join only exists if the union of the classes is a hierarchy.

In each case, a modular semilattice is obtained, which allows for a natural metrization via least element (semi)lattice move distances, see Barthélémy, Leclerc and Monjardet (1981). These latticial metrics are given by the BA/C (partitions), Manhattan (dendrograms), and symdiff (hierarchies) dissimilarities, respectively (see cl_dissimilarity).

Value

For cl_meet and cl_join, an object of class "cl_partition" or "cl_dendrogram" with the class ids or ultrametric dissimilarities of the meet and join of the partitions or dendrograms, respectively.

References

J.-P. Barthélémy, B. Leclerc and B. Monjardet (1981). On the use of ordered sets in problems of comparison and consensus of classification. Journal of Classification, 3, 187–224. doi: 10.1007/BF01894188.

Examples

## Two simple partitions of 7 objects.
A <- as.cl_partition(c(1, 1, 2, 3, 3, 5, 5))
B <- as.cl_partition(c(1, 2, 2, 3, 4, 5, 5))
## These disagree on objects 1-3, A splits objects 4 and 5 into
## separate classes.  Objects 6 and 7 are always in the same class.
(A <= B) || (B <= A)
## (Neither partition is finer than the other.)
cl_meet(A, B)
cl_join(A, B)
## Meeting with the lumper (greatest) or joining with the splitter
## (least) partition does not make a difference: 
C_lumper <- as.cl_partition(rep(1, n_of_objects(A)))
cl_meet(cl_ensemble(A, B, C_lumper))
C_splitter <- as.cl_partition(seq_len(n_of_objects(A)))
cl_join(cl_ensemble(A, B, C_splitter))
## Another way of computing the join:
range(A, B, C_splitter)$max

clue documentation built on Nov. 19, 2022, 5:05 p.m.