Implements methods for clustering mixed-type data, specifically data that combine continuous and nominal (categorical) variables. Special attention is paid to the often-overlooked problem of equitably balancing the contributions of the continuous and categorical variables. This package implements KAMILA clustering, a novel method for clustering mixed-type data in the spirit of k-means clustering. It does not require dummy coding of variables and is efficient enough to scale to large data sets. Also implemented is Modha-Spangler clustering, which uses a brute-force strategy to maximize cluster separation in the continuous and categorical variables simultaneously.
Author: Alexander Foss [aut, cre], Marianthi Markatou [aut]
Date of publication: 2016-08-19 00:46:49
Maintainer: Alexander Foss <email@example.com>
classifyKamila: Classify new data into existing KAMILA clusters
dummyCodeFactorDf: Dummy coding of a data frame of factor variables
genMixedData: Generate simulated mixed-type data with cluster structure
gmsClust: A general implementation of Modha-Spangler clustering for...
kamila: KAMILA clustering of mixed-type data
kamila-package: Clustering for mixed continuous and categorical data sets
wkmeans: Weighted k-means for mixed-type data
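
The balancing problem mentioned in the description can be illustrated with a small, language-neutral sketch. It is written in plain Python and does not use this package's API. It is not KAMILA (which, per the description, avoids dummy coding) and is not claimed to match the package's wkmeans implementation. It only assumes the common approach of dummy coding a categorical variable, scaling the 0/1 indicators by a weight, and measuring squared Euclidean distance. All names here (dummy_code, mixed_sq_distance, nearest, cat_weight) are hypothetical and chosen for illustration.

```python
def dummy_code(level, levels):
    """Return a 0/1 indicator vector marking `level` among `levels`."""
    return [1.0 if level == candidate else 0.0 for candidate in levels]


def mixed_sq_distance(a, b, levels, cat_weight):
    """Squared Euclidean distance between two mixed-type records.

    Each record is a pair (continuous_values, categorical_level).
    The categorical level is dummy coded and scaled by `cat_weight`
    (an assumed, user-chosen weight) before the distance is taken.
    """
    con_a, cat_a = a
    con_b, cat_b = b
    con_part = sum((x - y) ** 2 for x, y in zip(con_a, con_b))
    cat_part = sum(
        (cat_weight * (u - v)) ** 2
        for u, v in zip(dummy_code(cat_a, levels), dummy_code(cat_b, levels))
    )
    return con_part + cat_part


def nearest(query, candidates, levels, cat_weight):
    """Return the name of the candidate record closest to `query`."""
    return min(
        candidates,
        key=lambda name: mixed_sq_distance(query, candidates[name], levels, cat_weight),
    )


levels = ["red", "green", "blue"]
query = ([0.0, 0.0], "red")
candidates = {
    "same_category": ([1.0, 1.0], "red"),     # farther in space, same level
    "close_in_space": ([0.5, 0.5], "blue"),   # nearer in space, different level
}

# The nearest record depends entirely on the categorical weight.
print(nearest(query, candidates, levels, cat_weight=0.5))  # close_in_space
print(nearest(query, candidates, levels, cat_weight=1.0))  # same_category
```

With a weight of 0.5 the continuous variables dominate, and with a weight of 1.0 the categorical variable dominates, so the same data yield different nearest neighbors. Any distance-based clustering built on such a weighted coding inherits this sensitivity, which is why the choice of weight matters.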