| consClust | R Documentation |
Compute consensus clustering for different numbers of clusters (Monti et al., 2003). The function also computes cluster quality and consensus agreement measures.
consClust(diss,
base.clust = "pam",
R = 100,
kvals = 2:15,
cons.method = "SE",
membership = "crisp",
k.fixed = TRUE,
agg.method = "cRand",
keep.ensemble = TRUE,
parallel = FALSE,
progressbar = TRUE)
diss |
A dissimilarity matrix or a |
base.clust |
Character. Clustering algorithms used to compute the ensemble of partitions and hierarchies. May be a combination of |
R |
Numeric. The number of partitions or hierarchies to compute for consensus clustering. |
kvals |
Numeric vector. The number of clusters to compute, default |
cons.method |
Character. The consensus clustering method to use, can be one of |
membership |
Character. If |
k.fixed |
Logical. If |
agg.method |
Character. The consensus agreement measures to compute, may be a combination of |
keep.ensemble |
Logical. If |
parallel |
Logical. Whether to initialize the parallel processing of the |
progressbar |
Logical. Whether to initialize a progress bar using the |
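As an illustration of what an agreement measure such as "cRand" quantifies, the sketch below computes the Rand index corrected for chance (the adjusted Rand index of Hubert and Arabie) between two crisp partitions in base R. The helper name adjusted_rand and the toy partitions are hypothetical, and this is an assumption about what the measure reflects, not the code consClust actually runs.

```r
# Illustrative helper (hypothetical name, not part of consClust):
# Rand index corrected for chance between two crisp partitions.
adjusted_rand <- function(a, b) {
  tab <- table(a, b)                          # contingency table of labels
  n <- sum(tab)
  pairs <- function(m) sum(choose(m, 2))      # number of pairs within counts
  sum_cells <- pairs(tab)                     # pairs grouped together in both
  sum_rows <- pairs(rowSums(tab))             # pairs grouped together in a
  sum_cols <- pairs(colSums(tab))             # pairs grouped together in b
  expected <- sum_rows * sum_cols / choose(n, 2)
  max_index <- (sum_rows + sum_cols) / 2
  # Undefined (0/0) when both partitions put all cases in a single group
  (sum_cells - expected) / (max_index - expected)
}

p1 <- c(1, 1, 1, 2, 2, 2)
p2 <- c(2, 2, 2, 1, 1, 1)   # same grouping as p1, labels swapped
p3 <- c(1, 1, 2, 2, 3, 3)   # a different grouping

ari_same <- adjusted_rand(p1, p2)
ari_diff <- adjusted_rand(p1, p3)
```

Because the index only looks at which pairs of cases are grouped together, relabelling the clusters (p1 versus p2) does not lower the agreement, whereas a genuinely different grouping (p3) does.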
consClust relies on cl_consensus to compute a consensus clustering among several internally computed hierarchies and partitions. The algorithm works as follows:
1. An ensemble of R clusterings, each in a fixed number of groups k taken from kvals, is computed on subsamples of the data. To reflect potential data peculiarities, the clusterings are obtained on weighted subsamples using Bayesian resampling.
2. A consensus among the clusterings obtained in step 1 is computed. The number of clusters in the consensus may exceed the number in the ensemble of clusterings computed in step 1. Setting k.fixed to TRUE sets the maximum number of clusters in the consensus to the number of clusters k in the ensemble.
3. Cluster quality indices are computed for the resulting consensus.
4. Steps 1 to 3 are repeated for each number of groups specified in kvals.
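The resampling and consensus steps above can be sketched in base R for a single value of k. This is a simplified illustration under stated assumptions, not the consClust implementation: it swaps in a Monti-style co-association (consensus) matrix instead of cl_consensus, draws a weighted subsample from Bayesian bootstrap weights rather than clustering with case weights, and uses toy data and hypothetical object names throughout.

```r
# Illustrative sketch only: NOT the consClust implementation.
set.seed(1)

# Toy data (hypothetical): two well-separated groups of 15 points each
x <- rbind(matrix(rnorm(30, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(30, mean = 5, sd = 0.3), ncol = 2))
d <- as.matrix(dist(x))
n <- nrow(d)
R <- 50   # size of the ensemble
k <- 2    # number of groups in each base clustering

together <- matrix(0, n, n)  # times a pair was put in the same cluster
sampled  <- matrix(0, n, n)  # times a pair appeared in the same subsample

for (r in seq_len(R)) {
  # Bayesian bootstrap weights: normalized exponentials, i.e. Dirichlet(1, ..., 1)
  w <- rexp(n)
  w <- w / sum(w)
  # Weighted subsample: keep the distinct cases drawn with these weights
  idx <- unique(sample.int(n, size = n, replace = TRUE, prob = w))
  # Base clustering of the subsample (Ward linkage as a stand-in)
  cl <- cutree(hclust(as.dist(d[idx, idx]), method = "ward.D"), k = k)
  sampled[idx, idx]  <- sampled[idx, idx] + 1
  together[idx, idx] <- together[idx, idx] + outer(cl, cl, "==")
}

# Consensus matrix: share of joint subsamples in which a pair co-clustered
consensus <- ifelse(sampled > 0, together / sampled, 0)
diag(consensus) <- 1

# Consensus partition: cluster the cases on 1 - consensus
final <- cutree(hclust(as.dist(1 - consensus), method = "average"), k = k)
```

Repeating this loop for every k in kvals, and computing quality indices on each resulting partition, would mirror the remaining steps.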
A consClust object with the following components:
clustering |
The retained clustering for each number of groups. |
stats |
A |
kvals |
The number of computed clusters. |
call |
The function call used. |
ensemblePartitions |
A list containing the partitions or hierarchies used to obtain the consensus. |
Monti, S., Tamayo, P., Mesirov, J., Golub, T. (2003). Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data. Machine Learning, 52, 1. doi:10.1023/A:1023949509487
Unterlerchner, L., Studer, M. (2026). What are We Looking For? A Comparative Review of Clustering Algorithms and Cluster Quality Indices For Sequence Analysis. LIVES Working Papers 108. doi:10.12682/lives.2296-1658.2026.108
# Loading TraMineR, which provides mvad, seqdef, seqdist and seqdplot
library(TraMineR)
# Loading illustrative data
data(mvad)
# Creating state sequence object
mvad.seq <- seqdef(mvad[1:200, 17:86])
# Computing dissimilarities using LCS measure
diss <- seqdist(mvad.seq, method="LCS")
## Computing consensus clustering using PAM and Ward (D)
pamWardConsClust <- consClust(diss,
kvals = 2:6,
base.clust = c("pam", "ward.D"),
R = 20,
k.fixed = TRUE,
agg.method = "cRand")
## Showing the cluster quality measures.
pamWardConsClust
## Plotting normalized values for easier identification
## of minimum and maximum values.
plot(pamWardConsClust, norm="range")
# Plotting sequences in 6 groups
par(mar = c(2.5,2,1.8,1.2))
seqdplot(mvad.seq,
group = pamWardConsClust$clustering$cluster6,
border = NA)