crsamp: Initialization of cluster prototypes using the centers of...


crsamp R Documentation

Initialization of cluster prototypes using the centers of random samples

Description

Initializes the cluster prototypes matrix using the centers of r randomly sampled data objects. The center can be the mean or the median of the sampled objects, or the data object nearest to the mean of the sampled objects.

Usage

crsamp(x, k, r, ctype)

Arguments

x

a numeric vector, data frame or matrix.

k

an integer specifying the number of clusters.

r

an integer for the number of objects to be sampled from the data set. If missing, the default value is 2.

ctype

a string for the type of centroids to be computed. The options are ‘avg’ for average, ‘med’ for median or ‘obj’ for the object nearest to the average. The default is ‘obj’.

Details

Instead of sampling only one random object as the function rsamp does, the function crsamp randomly samples r data objects and then computes the mean and median of these sampled objects. The data object nearest to the mean of the sampled objects is also found. If ctype is avg, the mean of the r sampled objects is assigned as the prototype of the first cluster. If ctype is med, the median of the r sampled objects is assigned as the prototype of the first cluster. If ctype is obj, the object nearest to the mean of the r sampled objects is assigned as the prototype of the first cluster. The same process is repeated for each of the remaining clusters. The logic behind this novel technique is to avoid selecting outliers in the data set, which may occur when only one object is randomly sampled.
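The procedure described above can be sketched in base R roughly as follows. This is an illustrative re-implementation, not the inaparc source: the name crsamp_sketch is hypothetical, and the sketch assumes that objects are sampled without replacement, that the median is taken column by column, and that the "nearest object" is searched among all rows of x using Euclidean distance.

```r
# Hypothetical sketch of the sampling scheme; NOT the inaparc implementation.
crsamp_sketch <- function(x, k, r = 2, ctype = c("obj", "avg", "med")) {
  ctype <- match.arg(ctype)          # defaults to "obj"
  x <- as.matrix(x)
  n <- nrow(x)
  stopifnot(k >= 1, r >= 1, r <= n)

  # One prototype (row) per cluster
  v <- matrix(NA_real_, nrow = k, ncol = ncol(x),
              dimnames = list(paste0("Cl.", seq_len(k)), colnames(x)))

  for (i in seq_len(k)) {
    # Assumption: r objects are drawn without replacement for each cluster
    idx <- sample.int(n, r)
    s <- x[idx, , drop = FALSE]
    avg <- colMeans(s)

    v[i, ] <- switch(ctype,
      avg = avg,
      # Assumption: column-wise median of the sampled objects
      med = apply(s, 2, median),
      obj = {
        # Assumption: nearest object is searched among all rows of x
        d <- rowSums(sweep(x, 2, avg)^2)   # squared Euclidean distances
        x[which.min(d), ]
      })
  }
  list(v = v, ctype = ctype)
}

# Usage on the iris measurements
set.seed(123)
sketch <- crsamp_sketch(iris[, 1:4], k = 3, r = 5, ctype = "med")
print(sketch$v)
```

With ctype set to obj, every prototype is an actual data object; with avg or med, a prototype may be a point that does not occur in the data set.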

Value

an object of class ‘inaparc’, which is a list consisting of the following items:

v

a numeric matrix containing the initial cluster prototypes.

ctype

a string for the type of centroid used to build the cluster prototypes matrix.

call

a string containing the matched function call that generated this ‘inaparc’ object.

Author(s)

Zeynel Cebeci, Cagatay Cebeci

See Also

aldaoud, ballhall, firstk, forgy, hartiganwong, inofrep, inscsf, insdev, kkz, kmpp, ksegments, ksteps, lastk, lhsmaximin, lhsrandom, maximin, mscseek, rsamp, rsegment, scseek, scseek2, spaeth, ssamp, topbottom, uniquek, ursamp

Examples

data(iris)
# Prototypes are the objects nearest to the mean of
# five randomly sampled objects for each cluster
res <- crsamp(iris[,1:4], k=5, r=5, ctype="obj")
v <- res$v
print(v)

inaparc documentation built on June 16, 2022, 5:09 p.m.