
KmH    R Documentation

Merging K-means with hierarchical clustering for identifying general-shaped groups

Description

A hybrid non-parametric clustering approach that combines two methods (K-means and hierarchical clustering) to identify general-shaped clusters and that can be applied to larger datasets.
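To give a feel for what "merging K-means with hierarchical clustering" can mean, here is a deliberately simplified sketch in base R. It is not the procedure implemented by kmH(): it over-clusters with k-means and then merges the resulting groups by single-linkage on their centers, which is a swapped-in stand-in for the merging step described in the referenced paper. The toy data, the choice of 12 k-means groups, and the cut at 2 merged groups are all assumptions made for illustration.

```r
## Illustration only: NOT the kmH() algorithm.
set.seed(1)

## hypothetical toy data: a noisy ring around a central blob
n <- 300
theta <- runif(n, 0, 2 * pi)
ring <- cbind(3 * cos(theta), 3 * sin(theta)) +
  matrix(rnorm(2 * n, sd = 0.1), ncol = 2)
blob <- matrix(rnorm(2 * 100, sd = 0.3), ncol = 2)
x <- rbind(ring, blob)

## step 1: over-cluster with k-means (many small, roughly spherical groups)
km <- kmeans(x, centers = 12, nstart = 25, iter.max = 50)

## step 2: build a hierarchy on the k-means centers (single linkage)
hc <- hclust(dist(km$centers), method = "single")

## step 3: cut the tree and map each observation to its merged group
merged <- cutree(hc, k = 2)            # merged label for each k-means group
final <- unname(merged[km$cluster])    # merged label for each observation
table(final)
```

Whether the merged groups recover the ring and the blob depends on the k-means fit; the sketch only shows the two-stage structure, not the quality that kmH() aims for.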

Usage

kmH(x, kmns.results = NULL, nstart = prod(dim(x)), B = 100,
    M = min(round(0.1 * sqrt(prod(dim(x)))), 10), L = 3, verbose = FALSE,
    maxclus = max(sqrt(nrow(x)), 50),
    hmap.pdf.file = NULL, desired.ncores = 2, ...)
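Several defaults in the signature are expressions in the dimensions of x. As a small worked example, the sketch below evaluates them by hand for a hypothetical 400 x 2 matrix (the data are made up; only the expressions come from the usage above).

```r
## hypothetical data: 400 observations in 2 dimensions
x <- matrix(rnorm(400 * 2), ncol = 2)

nstart  <- prod(dim(x))                              # 400 * 2 = 800
M       <- min(round(0.1 * sqrt(prod(dim(x)))), 10)  # min(round(2.83), 10) = 3
maxclus <- max(sqrt(nrow(x)), 50)                    # max(20, 50) = 50
```

Because nstart grows with the number of entries in x, it may be worth setting it explicitly for large datasets.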

Arguments

x

vector or matrix

kmns.results

Default is NULL. If k-means results are provided, they are used; otherwise k-means is run for up to maxclus groups and the number of groups is estimated.

nstart

Number of initializations to use in k-means.

B

number of samples (see paper)

M

number of K_0s tried (see paper, default seems reasonable)

L

number of K_*s tried (see paper, default 3)

verbose

Show progress. Default is FALSE.

maxclus

Maximum number of clusters K_max to consider. If not provided, K_max = max(50, sqrt(n)), where n is the number of rows of x.

hmap.pdf.file

File in which to store the PDF of the heatmap. Default is NULL, in which case the heatmap is skipped.

desired.ncores

Number of desired cores. The number used is min(detectCores(), desired.ncores); the default for desired.ncores is 2.

...

Additional arguments to be passed to the heatmap.
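The core count described for desired.ncores can be reproduced by hand, which may be useful for checking what a given machine will use. This is a sketch of the stated rule only, assuming detectCores() is the one from the parallel package; it is not taken from the package source.

```r
library(parallel)  # ships with R; provides detectCores()

desired.ncores <- 2
## detectCores() can return NA on some platforms, in which case ncores is NA
ncores <- min(detectCores(), desired.ncores)
ncores
```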

Author(s)

Israel Almodovar-Rivera and Ranjan Maitra.

References

Almodovar-Rivera, I., & Maitra, R. (2018). Kernel-estimated Nonparametric Overlap-Based Syncytial Clustering. arXiv preprint arXiv:1805.09505.

Peterson, A. D., Ghosh, A. P., & Maitra, R. (2018). Merging K-means with hierarchical clustering for identifying general‐shaped groups. Stat, 7(1), e172.

Examples


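library(SynClustR)  # package that provides kmH() and the Bullseye data

set.seed(787)
data("Bullseye")
## run K-mH on Bullseye without its third column
kk <- kmH(x = Bullseye[-3], maxclus = sqrt(nrow(Bullseye)), verbose = TRUE)
## pick the k-means fit with the largest KL value
Khat <- which.max(unlist(kk$kmns.kmH$kmeans.results$KL))
## extract k-means solution
km <- kk$kmns.kmH$kmeans.results$kmns.results[[Khat]]
Bullseye$IdsKmeans <- km$cluster
Bullseye$IdsKmH <- kk$final.partition

## compare the true labels with the k-means and K-mH partitions
par(mfrow = c(1, 3))
with(Bullseye, plot(x = x, y = y, col = Ids, main = "True"))
with(Bullseye, plot(x = x, y = y, col = IdsKmeans, main = "k-means"))
with(Bullseye, plot(x = x, y = y, col = IdsKmH, main = "K-mH"))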



ialmodovar/SynClustR documentation built on July 7, 2023, 12:18 a.m.