K-Medoids Clustering


Compute a k-medoids partition of a dissimilarity object.


kmedoids(x, k)



a dissimilarity object inheriting from class "dist", or a square matrix of pairwise object-to-object dissimilarity values.


an integer giving the number of classes to be used in the partition.


Let d denote the pairwise object-to-object dissimilarity matrix corresponding to x. A k-medoids partition of x is defined as a partition of the numbers from 1 to n, the number of objects in x, into k classes C_1, …, C_k such that the criterion function L = ∑_l \min_{j \in C_l} ∑_{i \in C_l} d_{ij} is minimized.

This is an NP-hard optimization problem. PAM (Partitioning Around Medoids, see Kaufman & Rousseeuw (1990), Chapter 2) is a very popular heuristic for obtaining optimal k-medoids partitions, and provided by pam in package cluster.

kmedoids is an exact algorithm based on a binary linear programming formulation of the optimization problem (e.g., Gordon & Vichi (1998), [P4']), using lp from package lpSolve as solver. Depending on available hardware resources (the number of constraints of the program is of the order n^2), it may not be possible to obtain a solution.


An object of class "kmedoids" representing the obtained partition, which is a list with the following components.


the class ids of the partition.


the indices of the medoids.


the value of the criterion function of the partition.


L. Kaufman and P. J. Rousseeuw (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.

A. D. Gordon and M. Vichi (1998). Partitions of partitions. Journal of Classification, 15, 265–285.

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker. Vote for new features on Trello.

comments powered by Disqus