firstk: Initialization of cluster prototypes using the first k...

firstk R Documentation

Initialization of cluster prototypes using the first k objects

Description

Initializes the cluster prototypes matrix using the first k objects at the top of the data set.

Usage

firstk(x, k)

Arguments

x

a numeric vector, data frame or matrix.

k

an integer specifying the number of clusters.

Details

This technique, known as the first method of MacQueen (MacQueen, 1967), simply selects the first k objects as the initial centroids. It is sensitive to the order of the data (Celebi et al., 2013). If the data set is already sorted, the method may yield poor initial prototypes, because the objects at the top of a sorted data set are close to each other. Shuffling the data set as a pre-processing step may therefore improve the quality of the prototypes obtained with this initialization technique.
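The effect of data order can be illustrated with a minimal base-R sketch. It is not the inaparc implementation: first_k_sketch is a hypothetical helper written only for this illustration, and the seed value is arbitrary. The sketch assumes nothing beyond the built-in iris data set, which is stored in species order.

```r
# Minimal base-R sketch of first-k initialization; not the inaparc source.
# first_k_sketch is a hypothetical helper name used only for illustration.
first_k_sketch <- function(x, k) {
  x <- as.matrix(x)
  stopifnot(k >= 1, k <= nrow(x))
  # Take the first k rows as the initial prototypes
  x[seq_len(k), , drop = FALSE]
}

data(iris)
x <- iris[, 1:4]

# iris is ordered by species, so the first 3 rows all belong to setosa.
v_sorted <- first_k_sketch(x, k = 3)
species_sorted <- iris$Species[1:3]

# Shuffle the rows first (the seed is arbitrary), then take the first k.
set.seed(42)
idx <- sample(nrow(x))
v_shuffled <- first_k_sketch(x[idx, ], k = 3)
species_shuffled <- iris$Species[idx[1:3]]

print(species_sorted)
print(species_shuffled)
```

The same pre-processing step should carry over to the package function, for example by passing the shuffled rows x[idx, ] as the x argument of firstk.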

Value

an object of class ‘inaparc’, which is a list consisting of the following items:

v

a numeric matrix containing the initial cluster prototypes.

ctype

a string representing the type of centroid used to build the prototype matrix. Its value is ‘obj’ for this function because it returns the selected objects.

call

a string containing the matched function call that generates the object.

Author(s)

Zeynel Cebeci, Cagatay Cebeci

References

MacQueen, J.B. (1967). Some methods for classification and analysis of multivariate observations, in Proc. of 5th Berkeley Symp. on Mathematical Statistics and Probability, Berkeley, University of California Press, 1: 281-297. url:http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.308.8619&rep=rep1&type=pdf

Celebi, M.E., Kingravi, H.A. & Vela, P.A. (2013). A comparative study of efficient initialization methods for the K-means clustering algorithm, Expert Systems with Applications, 40 (1): 200-210. arXiv:https://arxiv.org/pdf/1209.1960.pdf

See Also

aldaoud, ballhall, crsamp, forgy, hartiganwong, inofrep, inscsf, insdev, kkz, kmpp, ksegments, ksteps, lastk, lhsmaximin, lhsrandom, maximin, mscseek, rsamp, rsegment, scseek, scseek2, spaeth, ssamp, topbottom, uniquek, ursamp

Examples

library(inaparc)

data(iris)
# Use the first 5 rows of the four numeric columns as initial prototypes
res <- firstk(x=iris[,1:4], k=5)
v <- res$v
print(v)

inaparc documentation built on June 16, 2022, 5:09 p.m.