# dpmeans: DP-Means Clustering In T4cluster: Tools for Cluster Analysis

## Description

DP-means is a nonparametric clustering method motivated by the Dirichlet process (DP) mixture model. Rather than being fixed in advance, the number of clusters is determined by a parameter λ: the larger λ is, the fewer clusters are obtained. Beyond the algorithm in the original paper, this implementation adds an option to randomly permute the order in which each observation's membership is updated, a common heuristic in the cluster analysis literature.
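To make the role of λ concrete, here is a minimal base-R sketch of the DP-means iteration as it is usually described: a point opens a new cluster when its squared Euclidean distance to every current center exceeds λ. This is an illustration only, not the T4cluster source. The name `dpmeans_sketch` is hypothetical, the comparison against the squared distance is an assumption about the convention, and the `eps` stopping rule is replaced by a simple "labels stopped changing" check.

```r
# Illustrative DP-means sketch in base R (hypothetical helper, not T4cluster code).
dpmeans_sketch <- function(X, lambda, maxiter = 10, permute = FALSE) {
  n <- nrow(X)
  centers <- matrix(colMeans(X), nrow = 1)  # start with one cluster at the global mean
  labels  <- rep(1L, n)
  for (it in seq_len(maxiter)) {
    old <- labels
    # optionally visit observations in random order
    ord <- if (permute) sample(n) else seq_len(n)
    for (i in ord) {
      # squared Euclidean distance from observation i to every current center
      d2 <- rowSums(sweep(centers, 2, X[i, ])^2)
      if (min(d2) > lambda) {
        # too far from all centers: open a new cluster at this observation
        centers   <- rbind(centers, X[i, ])
        labels[i] <- nrow(centers)
      } else {
        labels[i] <- which.min(d2)
      }
    }
    # drop empty clusters, relabel as 1:k, and recompute centers as cluster means
    labels  <- as.integer(factor(labels))
    centers <- rowsum(X, labels) / as.vector(table(labels))
    if (identical(labels, old)) break
  }
  list(cluster = labels, centers = centers)
}

# Two well-separated groups of 20 points each
set.seed(1)
X_demo <- rbind(matrix(rnorm(40, 0, 0.1), ncol = 2),
                matrix(rnorm(40, 10, 0.1), ncol = 2))
fit_small <- dpmeans_sketch(X_demo, lambda = 4)    # small threshold: groups split
fit_large <- dpmeans_sketch(X_demo, lambda = 1e6)  # huge threshold: nothing splits
```

Comparing `length(unique(fit_small$cluster))` with `length(unique(fit_large$cluster))` shows the inverse relationship between λ and the number of clusters.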

## Usage

    dpmeans(data, lambda = 0.1, ...)

## Arguments

data

an $(n\times p)$ matrix of row-stacked observations.

lambda

a threshold to define a new cluster (default: 0.1).

...

extra parameters including

- maxiter: the maximum number of iterations (default: 10).
- eps: the stopping criterion for iterations (default: 1e-5).
- permute: a logical; TRUE if a random order is used for the updates, FALSE otherwise (default).

## Value

a named list of S3 class T4cluster containing

cluster

a length-n vector of class labels (from 1:k).

algorithm

name of the algorithm.
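As a brief illustration of how the optional arguments and the returned list fit together, the snippet below passes `maxiter` and `permute` through `...` and then inspects the two documented components. It assumes the T4cluster package is installed (the guard skips the call otherwise); the seed and the particular argument values are arbitrary choices for the example.

```r
# Hedged usage sketch: runs only if T4cluster is available.
if (requireNamespace("T4cluster", quietly = TRUE)) {
  X <- as.matrix(iris[, 1:4])
  set.seed(42)  # permute = TRUE makes the update order random
  fit <- T4cluster::dpmeans(X, lambda = 5, maxiter = 50, permute = TRUE)

  print(class(fit))           # S3 class of the returned list
  print(table(fit$cluster))   # cluster sizes; labels run from 1 to k
  print(fit$algorithm)        # name of the algorithm
}
```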

## References

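Kulis B, Jordan MI (2012). "Revisiting k-means: New Algorithms via Bayesian Nonparametrics." In *Proceedings of the 29th International Conference on Machine Learning (ICML 2012)*.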

## Examples

    # -------------------------------------------------------------
    #            clustering with 'iris' dataset
    # -------------------------------------------------------------
    library(T4cluster)

    ## PREPARE
    data(iris)
    X   = as.matrix(iris[,1:4])             # four numeric measurements
    lab = as.integer(as.factor(iris[,5]))   # species as integer labels

    ## EMBEDDING WITH PCA
    X2d = Rdimtools::do.pca(X, ndim=2)$Y

    ## CLUSTERING WITH DIFFERENT LAMBDA VALUES
    dpm1 = dpmeans(X, lambda=1)$cluster
    dpm2 = dpmeans(X, lambda=5)$cluster
    dpm3 = dpmeans(X, lambda=25)$cluster

    ## VISUALIZATION
    opar <- par(no.readonly=TRUE)
    par(mfrow=c(1,4), pty="s")
    plot(X2d, col=lab,  pch=19, main="true label")
    plot(X2d, col=dpm1, pch=19, main="dpmeans: lambda=1")
    plot(X2d, col=dpm2, pch=19, main="dpmeans: lambda=5")
    plot(X2d, col=dpm3, pch=19, main="dpmeans: lambda=25")
    par(opar)

T4cluster documentation built on Aug. 16, 2021, 9:07 a.m.