K-mH: Merging K-means with hierarchical clustering for identifying...


Description

A hybrid non-parametric clustering approach that combines two methods (K-means and hierarchical clustering) to identify general-shaped clusters and that can be applied to larger datasets.

Usage

kmH <- function(x, kmns.results = NULL, nstart = prod(dim(x)), B = 100,
                M = min(round(0.1 * sqrt(prod(dim(x)))), 10), L = 3,
                verbose = FALSE, maxclus = max(sqrt(nrow(x)), 50),
                hmap.pdf.file = NULL, desired.ncores = 2, ...)

Arguments

x

vector or matrix

kmns.results

Default is NULL. If k-means results are provided, they are used; otherwise k-means is run for up to Kmax groups and the number of groups is estimated.

nstart

Number of initializations to be used in k-means.

B

number of samples (see paper)

M

number of K_0s tried (see paper, default seems reasonable)

L

number of K_*s tried (see paper, default 3)

verbose

Show progress. Default is FALSE.

maxclus

Maximum number of clusters, Kmax, to consider. If not provided, Kmax = max(50, sqrt(n)), where n is the number of rows of x.

hmap.pdf.file

file to store the pdf of the heatmap (NULL, skipped by default)

desired.ncores

Number of desired cores. The number of cores used is min(detectCores(), desired.ncores); by default desired.ncores = 2.
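
The default expressions listed under Usage depend on the shape of x, so it can help to evaluate them before calling kmH. The following is a minimal sketch in base R, not package code: the matrix x is a made-up input, the *_default names are hypothetical, and the core-count line reflects an assumed reading of the desired.ncores description above.

```r
# Hypothetical input: 1000 observations in 2 dimensions (not from the package).
x <- matrix(rnorm(2000), nrow = 1000, ncol = 2)

# Default expressions copied from the Usage section.
nstart_default  <- prod(dim(x))                              # rows * columns
M_default       <- min(round(0.1 * sqrt(prod(dim(x)))), 10)  # capped at 10
maxclus_default <- max(sqrt(nrow(x)), 50)                    # at least 50

# Assumed reading of the desired.ncores description: never request more
# cores than the machine reports. na.rm guards against detectCores()
# returning NA on platforms where the count is unknown.
desired.ncores <- 2
ncores <- min(parallel::detectCores(), desired.ncores, na.rm = TRUE)

print(c(nstart = nstart_default, M = M_default, maxclus = maxclus_default))
```

Because the default nstart equals the product of the dimensions of x, it grows quickly with the data size; the example in the Examples section passes nstart = 50 explicitly instead.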

Author(s)

Israel Almodóvar-Rivera and Ranjan Maitra.

References

Almodóvar-Rivera, I., & Maitra, R. (2018). Kernel-estimated Nonparametric Overlap-Based Syncytial Clustering. arXiv preprint arXiv:1805.09505.

Peterson, A. D., Ghosh, A. P., & Maitra, R. (2018). Merging K-means with hierarchical clustering for identifying general-shaped groups. Stat, 7(1), e172.

Examples

set.seed(787)
data("Bullseye")
## run K-mH on all columns except the third
kk <- kmH(x = Bullseye[-3], maxclus = sqrt(nrow(Bullseye)), verbose = TRUE, nstart = 50)
## index of the k-means solution with the largest KL value
Khat <- which.max(unlist(kk$kmns.kmH$kmeans.results$KL))
## extract k-means solution
km <- kk$kmns.kmH$kmeans.results$kmns.results[[Khat]]
Bullseye$IdsKmeans <- km$cluster
Bullseye$IdsKmH <- kk$final.partition

## plot the true labels next to the k-means and K-mH partitions
par(mfrow = c(1, 3))
with(Bullseye, plot(x = x, y = y, col = Ids, main = "True"))
with(Bullseye, plot(x = x, y = y, col = IdsKmeans, main = "k-means"))
with(Bullseye, plot(x = x, y = y, col = IdsKmH, main = "K-mH"))

ialmodovar/RSynC documentation built on Jan. 25, 2020, 8:41 p.m.