Hmeans: Perform parallel hierarchical clustering on a data matrix.

View source: R/clusternor.R

Description

A recursive partitioning of the data into two disjoint sets at every level (implemented iteratively rather than with actual recursion), as described at https://en.wikipedia.org/wiki/Hierarchical_clustering
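
The idea of repeated two-way splits without recursion can be illustrated with a minimal sketch in base R. This is not the clusternor implementation: the helper name `bisect_sketch`, the use of `stats::kmeans` for each split, and the rule of always splitting the largest remaining cluster are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of clusternor: bisecting k-means written as a loop.
bisect_sketch <- function(x, kmax, min.clust.size = 1, iter.max = 20) {
  cluster <- rep(1L, nrow(x))      # every point starts in a single cluster
  frozen <- integer(0)             # labels of clusters that could not be split
  k <- 1L
  while (k < kmax) {
    sizes <- tabulate(cluster, nbins = k)
    sizes[frozen] <- 0L            # skip clusters already marked unsplittable
    target <- which.max(sizes)     # assumption: split the largest cluster next
    if (sizes[target] < 2L) break  # nothing left that can be split
    idx <- which(cluster == target)
    sub <- x[idx, , drop = FALSE]
    if (nrow(unique(sub)) < 2L) {  # kmeans needs at least 2 distinct points
      frozen <- c(frozen, target)
      next
    }
    fit <- kmeans(sub, centers = 2, iter.max = iter.max)
    if (min(fit$size) < min.clust.size) {
      frozen <- c(frozen, target)  # reject the split, keep the cluster whole
      next
    }
    k <- k + 1L
    cluster[idx[fit$cluster == 2L]] <- k  # second half receives a new label
  }
  list(cluster = cluster, size = tabulate(cluster, nbins = k))
}

set.seed(1)                        # kmeans starts from random centers
iris.mat <- as.matrix(iris[, 1:4])
res <- bisect_sketch(iris.mat, kmax = 3)
```

The loop keeps one flat label vector and relabels half of a cluster on each accepted split, which is one way to avoid recursion; the real package may track the hierarchy differently.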

Usage

Hmeans(
  data,
  kmax,
  nrow = -1,
  ncol = -1,
  iter.max = 20,
  nthread = -1,
  init = c("forgy"),
  tolerance = 1e-06,
  dist.type = c("eucl", "cos", "sqeucl", "taxi"),
  min.clust.size = 1
)

Arguments

data

Name of a data file on disk (NUMA-optimized) or an in-memory data matrix

kmax

The maximum number of centers

nrow

The number of samples in the dataset

ncol

The number of features in the dataset

iter.max

The maximum number of k-means iterations to perform

nthread

The number of parallel threads to run

init

The type of initialization to use, c("forgy"), or the initial centers themselves

tolerance

The convergence tolerance for k-means at each hierarchical split

dist.type

The dissimilarity metric to use

min.clust.size

The minimum cluster size; a cluster that has reached this size cannot be split further
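
The four `dist.type` options can be illustrated with a small sketch. It assumes the abbreviations stand for Euclidean, cosine, squared Euclidean, and taxicab (Manhattan) dissimilarity, and that the cosine variant is one minus the cosine similarity; neither is confirmed by this page. The helper `dissim_sketch` is hypothetical and not part of clusternor.

```r
# Hypothetical helper, NOT part of clusternor: dissimilarity between two vectors.
# The meaning of each abbreviation is an assumption based on its name.
dissim_sketch <- function(a, b,
                          dist.type = c("eucl", "cos", "sqeucl", "taxi")) {
  dist.type <- match.arg(dist.type)  # with no argument, the first choice is used
  switch(dist.type,
    eucl   = sqrt(sum((a - b)^2)),   # straight-line distance
    sqeucl = sum((a - b)^2),         # squared Euclidean, no square root
    taxi   = sum(abs(a - b)),        # sum of absolute coordinate differences
    cos    = 1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
  )
}

a <- c(3, 0)
b <- c(0, 4)
d <- sapply(c("eucl", "cos", "sqeucl", "taxi"),
            function(type) dissim_sketch(a, b, type))
```

If `Hmeans` resolves its vector-valued defaults the way `match.arg` does, `"eucl"` would be the default metric, but that is an inference from the Usage signature rather than something stated here.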

Value

A list of lists containing the attributes of the output:

cluster

A vector of integers (from 1:k) indicating the cluster to which each point is allocated.

centers

A matrix of cluster centres.

size

The number of points in each cluster.

iter

The number of (outer) iterations.

Author(s)

Disa Mhembere <disa@cs.jhu.edu>

Examples

library(clusternor)
iris.mat <- as.matrix(iris[, 1:4])           # Numeric feature columns only
kmax <- length(unique(iris[, dim(iris)[2]])) # Number of unique classes
kms <- Hmeans(iris.mat, kmax)

clusternor documentation built on March 26, 2020, 7:31 p.m.