lhsmaximin: Initialization of cluster prototypes using Maximin LHS


lhsmaximin R Documentation

Initialization of cluster prototypes using Maximin LHS

Description

Initializes the cluster prototypes matrix using the Maximin version of Latin Hypercube Sampling (LHS). A square grid of possible sample points is a Latin Square (LS) if there is exactly one sample in each row and each column. LHS generalizes the LS to an arbitrary number of dimensions and was developed to generate collections of parameter values from a multidimensional distribution. LHS yields more efficient estimates of the desired parameters than simple Monte Carlo sampling (Carnell, 2016).
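The Latin Square property described above can be illustrated with a few lines of base R. This is only a sketch for intuition, not code from the package; the names k, cols and grid are chosen for the illustration.

```r
# Sketch: mark k sample points on a k x k grid so that every row and
# every column holds exactly one point (the Latin Square property).
set.seed(1)
k <- 5
cols <- sample(k)                       # a random column for each row 1..k
grid <- matrix(0L, nrow = k, ncol = k)
grid[cbind(seq_len(k), cols)] <- 1L     # place one point per (row, column) pair
rowSums(grid)                           # one point in each row
colSums(grid)                           # one point in each column
```

Because cols is a permutation of 1..k, no two points share a row or a column.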

Usage

lhsmaximin(x, k, ncp)

Arguments

x

a numeric vector, data frame or matrix.

k

an integer specifying the number of clusters.

ncp

an integer determining the number of candidate points used in the search by the maximin LHS algorithm.

Details

LHS aims at initial cluster centers whose coordinates are well spread out in the individual dimensions (Borgelt, 2005). It is the generalization of Latin Square for an arbitrary number of dimensions (features). When sampling a function of p features, the range of each feature is divided into k equally probable intervals. k samples are then drawn such that a Latin Hypercube is created.
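The stratification described above can be sketched in base R as follows. This shows plain (random) LHS on the unit cube, not the maximin variant that the function relies on; lhs_unit is a hypothetical helper written for this illustration and is not part of ‘inaparc’ or ‘lhs’.

```r
# Sketch of plain Latin Hypercube Sampling on the unit cube [0, 1]^p.
# lhs_unit is a hypothetical helper, not a function of any package.
lhs_unit <- function(k, p) {
  u <- matrix(NA_real_, nrow = k, ncol = p)
  for (j in seq_len(p)) {
    # one uniform draw inside each of the k equal-width intervals,
    # then shuffle so intervals are paired at random across features
    u[, j] <- sample((seq_len(k) - 1 + runif(k)) / k)
  }
  u
}

set.seed(42)
u <- lhs_unit(k = 5, p = 4)
# interval index of every point, sorted within each feature:
# each column should list the intervals 1..5 exactly once
apply(u, 2, function(col) sort(floor(col * 5) + 1))
```

On the unit cube, intervals of equal width are also equally probable under the uniform distribution, which is why the draws can later be mapped to another distribution through its quantile function.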

The current version of lhsmaximin uses the results of the maximinLHS function from the ‘lhs’ package by Carnell (2016). Once maximinLHS has created the uniform samples, they are transformed to normally distributed samples by using quantile functions. However, not all features in a data set are necessarily normally distributed; they may follow different distributions. In such cases, the transformation for each feature should be specific to its distribution. Determining the distribution types of the features is planned for future versions of ‘lhsmaximin’.
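The transformation step can be sketched as below. This is not the package's source code: taking each feature's sample mean and standard deviation as the parameters of the normal quantile function is an assumption made for the illustration, and the base R fallback (used only when ‘lhs’ is not installed) produces a plain random LHS rather than a maximin one.

```r
# Sketch: map a uniform Latin hypercube onto the scale of the data
# with normal quantile functions. Using the sample mean and standard
# deviation of each feature is an assumption of this illustration.
x <- as.matrix(iris[, 1:4])
k <- 5
p <- ncol(x)

set.seed(123)
if (requireNamespace("lhs", quietly = TRUE)) {
  u <- lhs::maximinLHS(k, p)   # k points in p dimensions, values in (0, 1)
} else {
  # fallback: plain random LHS in base R when 'lhs' is not available
  u <- sapply(seq_len(p), function(j) sample((seq_len(k) - 1 + runif(k)) / k))
}

# transform each uniform column with the normal quantile function
v <- sapply(seq_len(p), function(j) {
  qnorm(u[, j], mean = mean(x[, j]), sd = sd(x[, j]))
})
colnames(v) <- colnames(x)
v
```

For a feature that is believed to follow another distribution, the corresponding quantile function (for example qlnorm or qexp from the stats package) could replace qnorm in the same position.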

Value

an object of class ‘inaparc’, which is a list consisting of the following items:

v

a numeric matrix containing the initial cluster prototypes.

ctype

a string indicating the type of centroid used to determine the cluster prototypes. It is ‘obj’ with this function.

call

a string containing the matched function call that generates this ‘inaparc’ object.

Author(s)

Zeynel Cebeci, Cagatay Cebeci

References

Borgelt, C. (2005). Prototype-based classification and clustering. Habilitationsschrift zur Erlangung der Venia legendi fuer Informatik, vorgelegt der Fakultaet fuer Informatik der Otto-von-Guericke-Universitaet Magdeburg, Magdeburg, 22 June 2005. https://borgelt.net/habil/pbcc.pdf

Carnell, R. (2016). lhs: Latin Hypercube Samples. R package version 0.14. https://CRAN.R-project.org/package=lhs

See Also

aldaoud, ballhall, crsamp, firstk, forgy, hartiganwong, inofrep, inscsf, insdev, kkz, kmpp, ksegments, ksteps, lastk, lhsrandom, maximin, mscseek, rsamp, rsegment, scseek, scseek2, spaeth, ssamp, topbottom, uniquek, ursamp

Examples

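# Load the iris data set
data(iris)
# Generate 5 initial prototypes from the four numeric features
res <- lhsmaximin(iris[,1:4], k=5)
# Extract and print the prototypes matrix
v <- res$v
print(v)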

inaparc documentation built on June 16, 2022, 5:09 p.m.