hartiganwong | R Documentation |
Initializes the cluster prototypes matrix using the Hartigan-Wong algorithm (Hartigan & Wong, 1979).
hartiganwong(x, k)
x | a numeric vector, data frame or matrix. |
k | an integer specifying the number of clusters. |
First, the algorithm computes the center of gravity of the data and the distance of each data object to this center. It then sorts the data objects by these distances. The prototype of cluster i is the object at position 1 + (i-1)(n/k) in the sorted order, where i is the index of the cluster (i = 1, ..., k) and n is the number of data rows. The sorting step adds to the computational cost of the algorithm, since sorting has a complexity of O(n log(n)) (Celebi et al., 2013).
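The steps described above can be sketched in a few lines of base R. This is only an illustrative sketch, not the package's implementation: the helper name hw_sketch is hypothetical, and the Euclidean distance, the ascending sort order, and the use of floor() for non-integer positions are assumptions that the description does not state.

```r
# Hypothetical helper illustrating the described steps; not the package's code.
hw_sketch <- function(x, k) {
  x <- as.matrix(x)
  n <- nrow(x)
  # Center of gravity: the column-wise mean of the data
  center <- colMeans(x)
  # Euclidean distance of each row to the center (distance type is an assumption)
  d <- sqrt(rowSums(sweep(x, 2, center)^2))
  # Row indices sorted by ascending distance (sort direction is an assumption)
  ord <- order(d)
  # Positions 1 + (i-1)(n/k) for i = 1..k; floor() is an assumed rounding rule
  pos <- floor(1 + (seq_len(k) - 1) * (n / k))
  # Return the selected data objects as the prototype matrix
  x[ord[pos], , drop = FALSE]
}

v <- hw_sketch(iris[, 1:4], k = 5)
print(v)
```

With n = 150 and k = 5, the selected positions in the sorted order are 1, 31, 61, 91 and 121, so the prototypes are actual data objects spread evenly over the distance ranking.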
an object of class ‘inaparc’, which is a list consisting of the following items:
v | a numeric matrix containing the initial cluster prototypes. |
ctype | a string giving the type of centroid used to determine the cluster prototypes. It is ‘obj’ for this function because the generated prototype matrix contains selected data objects. |
call | a string containing the matched function call that generated this ‘inaparc’ object. |
Zeynel Cebeci, Cagatay Cebeci
Hartigan, J.A. & Wong, M.A. (1979). Algorithm AS 136: A K-means clustering algorithm, Journal of the Royal Statistical Society, Series C, 28 (1): 100-108.
Celebi, M.E., Kingravi, H.A. & Vela, P.A. (2013). A comparative study of efficient initialization methods for the K-means clustering algorithm, Expert Systems with Applications, 40 (1): 200-210. arXiv: https://arxiv.org/pdf/1209.1960.pdf
aldaoud, ballhall, crsamp, firstk, forgy, inofrep, inscsf, insdev, kkz, kmpp, ksegments, ksteps, lastk, lhsmaximin, lhsrandom, maximin, mscseek, rsamp, rsegment, scseek, scseek2, spaeth, ssamp, topbottom, uniquek, ursamp
data(iris)
res <- hartiganwong(iris[,1:4], k=5)
v <- res$v
print(v)