consClust: Crisp Consensus Clustering
In WeightedCluster: Clustering of Weighted Data and Robust Clustering

Description

Computes consensus clusterings for different numbers of clusters (Monti et al., 2003). The function also computes cluster quality and consensus agreement measures.

Usage

consClust(diss,
          base.clust = "pam", 
          R = 100, 
          kvals = 2:15,
          cons.method = "SE", 
          membership = "crisp",
          k.fixed = TRUE, 
          agg.method = "cRand",
          keep.ensemble = TRUE,
          parallel = FALSE,
          progressbar = TRUE)

Arguments

`diss`	A dissimilarity matrix or a `dist` object.
`base.clust`	Character. Clustering algorithms used to compute the ensemble of partitions and hierarchies. May be a combination of `"pam", "single", "complete", "average", "mcquitty", "ward.D", "ward.D2", "centroid", "median"`.
`R`	Numeric. The number of partitions or hierarchies to compute for consensus clustering.
`kvals`	Numeric vector. The numbers of clusters to compute. Defaults to `2:15`.
`cons.method`	Character. The consensus clustering method to use, can be one of `"SE"` (default), `"HE", "SM", "HM", "GV1", "DWH", "GV3", "soft/symdiff", "hard/symdiff"`. See `cl_consensus` for details on the methods.
`membership`	Character. If `"crisp"`, the consensus clustering is returned as vectors of crisp cluster labels. If `"fuzzy"`, the function returns fuzzy membership matrices.
`k.fixed`	Logical. If `TRUE` (default), the number of clusters obtained from the consensus cannot exceed the number of clusters in the partition ensemble.
`agg.method`	Character. The consensus agreement measures to compute, may be a combination of `"cRand"` (default), `"Rand", "euclidean", "manhattan", "NMI", "KP", "angle", "diag", "FM", "Jaccard", "purity", "PS"`. See `cl_agreement` for details on the methods.
`keep.ensemble`	Logical. If `TRUE` (default) partitions and/or hierarchies are returned by the function. Setting `keep.ensemble = FALSE` saves memory.
`parallel`	Logical. Whether to initialize the parallel processing of the `future` package using the default `multisession` strategy. If `FALSE` (default), the current `plan` is used. If `TRUE`, a `multisession` `plan` is initialized using default values.
`progressbar`	Logical. Whether to initialize a progress bar using the `future` package. If `FALSE`, the current progress bar `handlers` are used. If `TRUE` (default), a new global progress bar `handlers` is initialized.
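To give a sense of what the default `agg.method = "cRand"` quantifies, the following base-R sketch computes the corrected (adjusted) Rand index between two crisp partitions. The helper `adjusted_rand` is hypothetical and written for illustration only; it is not part of WeightedCluster, which relies on `cl_agreement` for these measures.

```r
# Hypothetical helper (illustration only): corrected Rand index
# between two crisp partitions given as label vectors of equal length.
adjusted_rand <- function(a, b) {
  tab <- table(a, b)                              # contingency table of the two partitions
  n_pairs <- function(m) m * (m - 1) / 2          # number of pairs among m objects
  together_both <- sum(n_pairs(tab))              # pairs grouped together in both partitions
  together_a <- sum(n_pairs(rowSums(tab)))        # pairs grouped together in a
  together_b <- sum(n_pairs(colSums(tab)))        # pairs grouped together in b
  expected <- together_a * together_b / n_pairs(sum(tab))  # value expected by chance
  maximum <- (together_a + together_b) / 2
  (together_both - expected) / (maximum - expected)
}

p1 <- c(1, 1, 1, 2, 2, 2)
p2 <- c(2, 2, 2, 1, 1, 1)   # same partition as p1, labels swapped
p3 <- c(1, 1, 2, 2, 3, 3)   # a different partition in three groups

adjusted_rand(p1, p2)       # identical partitions up to relabelling
adjusted_rand(p1, p3)       # partial agreement
```

The index does not depend on how clusters are labelled, which is why it is suited to comparing a consensus with the members of an ensemble. The sketch does not handle the degenerate case where both partitions consist of a single cluster.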

Details

consClust relies on cl_consensus to compute a consensus clustering among several internally computed hierarchies and partitions. The algorithm works as follows:

1. An ensemble of R clusterings into a fixed number of groups k (taken from kvals) is computed on subsamples of the data. To reflect potential data peculiarities, the clusterings are obtained on weighted subsamples using Bayesian resampling.
2. A consensus among the clusterings obtained in step 1 is computed. The number of clusters in the consensus may exceed that of the clusterings in the ensemble. Setting k.fixed to TRUE caps the number of clusters in the consensus at the number of clusters k in the ensemble.
3. Cluster quality indices are computed for the resulting consensus.
4. Steps 1 to 3 are repeated for each number of groups specified in kvals.
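The first two steps above can be sketched in base R for a single value of k. This is a simplified stand-in under stated assumptions, not the package's implementation: the toy data are made up, the Bayesian resampling is assumed to be a Bayesian bootstrap with Dirichlet weights passed to `hclust` through its `members` argument, and the consensus is built from a co-association matrix in the spirit of Monti et al. (2003) rather than with `cl_consensus`. Cluster quality indices (step 3) are not reproduced.

```r
set.seed(1)

# Toy data (illustrative only): two well-separated groups of 15 points each
x <- rbind(matrix(rnorm(30, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(30, mean = 5, sd = 0.3), ncol = 2))
diss <- dist(x)
n <- nrow(x)
R <- 20   # size of the ensemble
k <- 2    # number of groups (one value of kvals)

# Step 1: R partitions, each computed with Bayesian bootstrap weights
ensemble <- sapply(seq_len(R), function(r) {
  w <- rexp(n)               # Exp(1) draws ...
  w <- n * w / sum(w)        # ... rescaled into Dirichlet(1, ..., 1) weights summing to n
  hc <- hclust(diss, method = "average", members = w)
  cutree(hc, k = k)          # crisp partition in k groups
})

# Step 2 (stand-in): share of partitions in which two cases are grouped together
coassoc <- matrix(0, n, n)
for (r in seq_len(R)) {
  coassoc <- coassoc + outer(ensemble[, r], ensemble[, r], "==")
}
coassoc <- coassoc / R

# Consensus partition: cluster the cases on 1 - co-association
consensus <- cutree(hclust(as.dist(1 - coassoc), method = "average"), k = k)
table(consensus)
```

Repeating this for every k in kvals corresponds to step 4. In consClust itself, the choice of consensus method is controlled by `cons.method`.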

Value

A consClust object with the following components:

`clustering`	The retained clustering for each number of groups.
`stats`	A `matrix` containing the clustering statistics of each cluster solution.
`kvals`	The numbers of clusters for which a consensus was computed.
`call`	The function call used.
`ensemblePartitions`	A list containing the partitions or hierarchies used to obtain the consensus.

References

Monti, S., Tamayo, P., Mesirov, J., Golub, T. (2003). Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data. Machine Learning, 52(1). doi:10.1023/A:1023949509487

Unterlerchner, L., Studer, M. (2026). What are We Looking For? A Comparative Review of Clustering Algorithms and Cluster Quality Indices For Sequence Analysis. LIVES Working Papers 108. doi:10.12682/lives.2296-1658.2026.108

Examples

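# Loading the packages (TraMineR provides mvad, seqdef, seqdist and seqdplot)
library(TraMineR)
library(WeightedCluster)

# Loading illustrative data
data(mvad)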

# Creating state sequence object
mvad.seq <- seqdef(mvad[1:200, 17:86])

# Computing dissimilarities using LCS measure
diss <- seqdist(mvad.seq, method="LCS")

## Computing consensus clustering using PAM and Ward (D)

pamWardConsClust <- consClust(diss,
                              kvals = 2:6, 
                              base.clust = c("pam", "ward.D"),
                              R = 20,
                              k.fixed = TRUE,
                              agg.method = "cRand")

## Showing the cluster quality measures. 
pamWardConsClust

## Plotting normalized values for easier identification 
## of minimum and maximum values.
plot(pamWardConsClust, norm="range")


# Plotting sequences in 6 groups
par(mar = c(2.5,2,1.8,1.2))

seqdplot(mvad.seq, 
         group = pamWardConsClust$clustering$cluster6, 
         border = NA)

WeightedCluster documentation built on April 27, 2026, 3:04 a.m.