ballhall | R Documentation |
Initializes the cluster prototypes using the cluster seeding algorithm proposed by Ball & Hall (1967).
ballhall(x, k, tv)
x |
a numeric vector, data frame or matrix. |
k |
an integer specifying the number of clusters. |
tv |
a number to be used as T, a threshold distance value. It is directly input by the user. Also it is possible to compute T with the following options of
|
In Ball and Hall's algorithm (Ball & Hall, 1967), the center of gravity of the data is assigned as the prototype of the first cluster. The algorithm then passes through the data objects in arbitrary order and takes an object as the next prototype if it is at least T units away from all previously selected prototypes. The purpose of the distance threshold T is to keep the cluster prototypes at least T units apart from each other. Ball & Hall's method may be sensitive to the order of the data, and choosing an appropriate value of T is also difficult (Celebi et al., 2013). As a remedy for the latter problem, the function ballhall in this package computes a T value using some distance measures if it is not specified by the user (for details, see the section ‘Arguments’ above).
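The seeding procedure described above can be sketched in a few lines of R. This is only an illustrative sketch, not the code used inside the package: the function name ballhall_sketch is hypothetical, Euclidean distance is an assumption, the objects are visited in row order, and the threshold tv must be supplied because the package's internal computation of T is not reproduced here.

```r
# Illustrative sketch of Ball & Hall seeding; NOT the inaparc implementation.
# Assumptions: Euclidean distance, objects visited in row order, tv given.
ballhall_sketch <- function(x, k, tv) {
  x <- as.matrix(x)
  # First prototype: the center of gravity (column means) of the data.
  v <- matrix(colMeans(x), nrow = 1, dimnames = list(NULL, colnames(x)))
  for (i in seq_len(nrow(x))) {
    if (nrow(v) >= k) break
    # Euclidean distances from object i to every prototype chosen so far.
    xi <- matrix(x[i, ], nrow(v), ncol(x), byrow = TRUE)
    d <- sqrt(rowSums((v - xi)^2))
    # Accept the object only if it is at least tv away from all of them.
    if (all(d >= tv)) v <- rbind(v, x[i, ])
  }
  v
}

v <- ballhall_sketch(iris[, 1:4], k = 3, tv = 0.6)
print(v)
```

If tv is too large for the data, this sketch may return fewer than k prototypes, which illustrates why choosing T is difficult. How the package function handles that case is not shown here.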
an object of class ‘inaparc’, which is a list consisting of the following items:
v |
a numeric matrix containing the initial cluster prototypes. |
ctype |
a string indicating the type of centroid used. It is ‘obj’ for this function because the resulting cluster prototype matrix contains the selected objects. |
call |
a string containing the matched function call that generates this ‘inaparc’ object. |
Zeynel Cebeci, Cagatay Cebeci
Ball, G.H. & Hall, D.J. (1967). A clustering technique for summarizing multivariate data, Systems Res. & Behavioral Sci., 12 (2): 153-155.
Celebi, M.E., Kingravi, H.A. & Vela, P.A. (2013). A comparative study of efficient initialization methods for the K-means clustering algorithm, Expert Systems with Applications, 40 (1): 200-210. arXiv:https://arxiv.org/pdf/1209.1960.pdf
aldaoud, crsamp, firstk, hartiganwong, inofrep, inscsf, insdev, kkz, kmpp, ksegments, ksteps, lastk, lhsmaximin, lhsrandom, maximin, mscseek, rsamp, rsegment, scseek, scseek2, spaeth, ssamp, topbottom, uniquek, ursamp
data(iris)

# Run with a user-specified threshold value
v1 <- ballhall(x=iris[,1:4], k=5, tv=0.6)$v
print(v1)

# Run with the internally computed default threshold value
v2 <- ballhall(x=iris[,1:4], k=5)$v
print(v2)