cca: K-Median Cluster Component Analysis




K-Median Cluster Component Analysis, a distribution-free soft-clustering method for preference rankings.


cca(X, k, control = ccacontrol(...), ...)



X: An n by m data matrix of preference rankings, with n judges (rows) and m objects to be judged (columns). Each row is one judge's ranking of the objects.


k: The number of cluster components.


control: A list of options, built by the function ccacontrol, that controls the details of the cca algorithm. The options set the maximum number of cca iterations (default, itercca=1), the algorithm used to compute the median ranking (default, "quick"), and other settings of the consrank algorithm, which is called by cca.


...: Arguments passed directly to cca, bypassing ccacontrol.
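
The two ways of supplying an option described above can be sketched as follows. This is an illustration only: it assumes the ConsRankClass package is installed and that the Irish data used in the example on this page is available once the package is attached. The object names res_control and res_direct are hypothetical, and itercca = 2 is chosen only to keep the run short.

```r
# Sketch only: runs when ConsRankClass is installed; skipped otherwise.
if (requireNamespace("ConsRankClass", quietly = TRUE)) {
  library(ConsRankClass)

  set.seed(135)
  # Option supplied through the control list built by ccacontrol()
  res_control <- cca(Irish$rankings, 4, control = ccacontrol(itercca = 2))

  set.seed(135)
  # Same option supplied directly through '...', bypassing ccacontrol()
  res_direct <- cca(Irish$rankings, 4, itercca = 2)
}
```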


Any algorithm implemented in the consrank function of the ConsRank package can be used to compute the median rankings. All of them accept the option 'full=TRUE', which restricts the search for the median ranking(s) to the space of permutations instead of the unconstrained universe of rankings of the m objects, including all possible ties.

There are two classification uncertainty measures, "Us" and "Uprods". "Us" is the geometric mean of the membership probabilities of each individual, normalized so that Us=1 in the case of maximum uncertainty. "Ucca" is the average of all the "Us" values. "Uprods" is the product of the membership probabilities of each individual, normalized so that Uprods=1 in the case of maximum uncertainty. "Uprodscca" is the average of all the "Uprods" values.
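
A minimal sketch of the two uncertainty measures on a toy membership matrix may help. The normalizing constants are an assumption, not taken from the package source: with k components, maximum uncertainty is taken to mean that every membership probability equals 1/k, so the geometric mean is scaled by k and the product by k^k. The toy matrix pk is hypothetical.

```r
# Toy membership probabilities: 3 individuals (rows), k = 2 components (columns)
pk <- rbind(c(0.5, 0.5),   # maximum uncertainty
            c(0.9, 0.1),
            c(1.0, 0.0))   # no uncertainty
k <- ncol(pk)

# Assumed normalization: at maximum uncertainty every probability is 1/k,
# so the geometric mean equals 1/k and the product equals (1/k)^k.
row_prod <- apply(pk, 1, prod)
Us       <- k * row_prod^(1 / k)   # normalized geometric mean per individual
Uprods   <- k^k * row_prod         # normalized product per individual

# Global measures: averages over individuals
Ucca      <- mean(Us)
Uprodscca <- mean(Uprods)
```

Under this assumption both measures equal 1 for the first row and 0 for the last one, and Uprods decreases faster than Us as one membership probability starts to dominate.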


An object of the class "cca". It contains:

pk: the membership probability matrix
clc: the cluster centers
oclc: the cluster centers in terms of orderings
idc: the crisp partition: id of the cluster component with the highest membership probability for each individual
Hcca: global homogeneity measure (tau_X rank correlation coefficient)
hk: homogeneity within each cluster
props: estimated proportion of cases within each cluster
Us: per-individual uncertainty measure based on the geometric mean of the membership probabilities (see details)
Ucca: global uncertainty measure, the average of Us
Uprods: per-individual uncertainty measure based on the product of the membership probabilities (see details)
Uprodscca: global uncertainty measure, the average of Uprods
consrankout: complete output of the rank aggregation algorithm, possibly containing multiple median rankings
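
The relation between the membership probability matrix and the crisp partition can be sketched in base R. The matrix below is a hypothetical toy example, not output of cca, and the handling of ties inside the package is not documented here (which.max simply takes the first maximum).

```r
# Toy membership probabilities: 4 individuals (rows), 3 components (columns)
pk <- rbind(c(0.7, 0.2, 0.1),
            c(0.1, 0.3, 0.6),
            c(0.2, 0.5, 0.3),
            c(0.8, 0.1, 0.1))

# Crisp partition: component with the highest membership probability per row
idc <- apply(pk, 1, which.max)

# Number of individuals assigned to each component
table(idc)
```

For a fitted object such as ccares from the example on this page, the corresponding elements would be accessed as ccares$pk and ccares$idc.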


Antonio D'Ambrosio


D'Ambrosio, A. and Heiser, W.J. (2019). A Distribution-free Soft Clustering Method for Preference Rankings. Behaviormetrika, vol. 46(2), pp. 333-351. DOI: 10.1007/s41237-018-0069-5

Heiser, W.J. and D'Ambrosio, A. (2013). Clustering and Prediction of Rankings within a Kemeny Distance Framework. In Lausen, B., Van den Poel, D., Ultsch, A. (eds). Algorithms from and for Nature and Life, pp. 19-31. Springer International. DOI: 10.1007/978-3-319-00035-0_2

Ben-Israel, A. and Iyigun, C. (2008). Probabilistic d-clustering. Journal of Classification, vol. 25(1), pp. 5-26. DOI: 10.1007/s00357-008-9002-z





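library(ConsRankClass) # provides cca and the Irish data used below
set.seed(135) # for reproducibility
# CCA with four components; itercca is passed directly, bypassing ccacontrol
ccares <- cca(Irish$rankings, 4, itercca=10)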

ConsRankClass documentation built on Sept. 28, 2021, 5:10 p.m.