msClustering: Function to perform clustering using the mean shift...

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/meanshift.R

Description

This function implements the mean shift algorithm. The algorithm locates the modes of a kernel density estimator and associates each data point to exactly one of the modes, thus effectively clustering the data.

Usage

1
2
msClustering(X, h = NULL, kernel = "epanechnikovKernel",
tol.stop = 1e-06, tol.epsilon = 0.001, multi.core = FALSE)

Arguments

X

a p \times n matrix containing n ≥ 1 p-dimensional numeric vectors stored as columns. Each column of X represents a sample point.

h

a strictly positive bandwidth parameter.

kernel

a kernel function (as a character string). The following kernels are supported:

  • Epanechnikov: K(x) = \frac{3}{2}(1-x^2)I_{[0,1]}(x) ; kernel="epanechnikovKernel"

  • cubic: K(x) = 4(1-x)^3I_{[0,1]}(x) ; kernel="cubicKernel"

  • Gaussian: K(x) = √{\frac{2}{π}}e^{-\frac{x^2}{2}}I_{[0,∞)}(x) ; kernel="gaussianKernel"

  • exponential K(x) = e^{-x}I_{[0,∞)}(x) ; kernel="exponentialKernel".

tol.stop

a strictly positive tolerance parameter. The algorithm stops when all of the updates generate steps of length smaller than tol.stop. tol.stop should be considerably smaller than tol.epsilon.

tol.epsilon

a strictly positive tolerance parameter. Points that are less than tol.epsilon- separated are grouped in the same cluster once the algorithm stops.

multi.core

logical. If TRUE, the mean shift algorithm is parallelized.

Details

It is generally recommended to standardize X so that each variable has unit variance prior to running the algorithm on the data.

Roughly speaking, larger values of h produce a coarser clustering (i.e. few and large clusters). For sufficiently large values of h, the algorithm produces a unique cluster containing all the data points. Smaller values of h produce a finer clustering (i.e. many small clusters). For sufficiently small values of h, each cluster that is identified by the algorithm will contain exactly one data point.

If h is not specified in the function call, then h is by default set to the 30th percentile of the empirical distribution of distances between the columns of X, i.e. h=quantile( dist( t( X ) ), 0.3 ).

In their implementation, gaussianKernel and exponentialKernel are rescaled to assign probability of at least 0.99 to the unit interval [0,1]. This ensures that all the kernels are roughly on the same scale.

To specify the number of cores when multi.core=TRUE, the option mc.cores needs to be set with options( mc.cores=n.cores ), where n.cores is the number of cores that the mean shift algorithm is allowed to use for parallel computation.

Value

The function invisibly returns a list with names

components

a matrix containing the modes/cluster representatives by column.

labels

an integer vector of cluster labels.

Author(s)

Mattia Ciollaro and Daren Wang

References

Carreira-Perpinan, M. A. (2015) A review of mean-shift algorithms for clustering. arXiv http://arxiv.org/abs/1503.00687

See Also

bmsClustering

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
## an example using the iris dataset
## help( iris )

## prepare data matrix (a subset of the iris dataset)
set.seed( 2 )
indices <- sample( 1:nrow( iris ), 80 )
iris.data <- t( iris[indices,c( "Sepal.Length", "Sepal.Width" )] )

## run mean shift algorithm
clustering <- msClustering( iris.data, h=0.8 )
print( clustering )

## plot the clusters
## Not run: 
plot( iris.data[1,], iris.data[2,], col=clustering$labels+2, cex=0.8,
pch=16, xlab="Sepal.Length", ylab="Sepal.Width" )
points( clustering$components[1,], clustering$components[2,],
col=2+( 1:ncol( clustering$components ) ), cex=1.8, pch=16 )
## End(Not run)

## using multiple cores (2)
## Not run: 
options( mc.cores=2 )
clustering.mc <- msClustering( iris.data, multi.core=TRUE )
## End(Not run)

Example output

Loading required package: parallel
Loading required package: wavethresh
Loading required package: MASS
WaveThresh: R wavelet software, release 4.6.8, installed

Copyright Guy Nason and others 1993-2016

Note: nlevels has been renamed to nlevelsWT


Running mean-shift algorithm...


  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |=                                                                     |   1%
  |                                                                            
  |==                                                                    |   2%
  |                                                                            
  |===                                                                   |   4%
  |                                                                            
  |====                                                                  |   5%
  |                                                                            
  |====                                                                  |   6%
  |                                                                            
  |=====                                                                 |   8%
  |                                                                            
  |======                                                                |   9%
  |                                                                            
  |=======                                                               |  10%
  |                                                                            
  |========                                                              |  11%
  |                                                                            
  |=========                                                             |  12%
  |                                                                            
  |==========                                                            |  14%
  |                                                                            
  |==========                                                            |  15%
  |                                                                            
  |===========                                                           |  16%
  |                                                                            
  |============                                                          |  18%
  |                                                                            
  |=============                                                         |  19%
  |                                                                            
  |==============                                                        |  20%
  |                                                                            
  |===============                                                       |  21%
  |                                                                            
  |================                                                      |  22%
  |                                                                            
  |=================                                                     |  24%
  |                                                                            
  |==================                                                    |  25%
  |                                                                            
  |==================                                                    |  26%
  |                                                                            
  |===================                                                   |  28%
  |                                                                            
  |====================                                                  |  29%
  |                                                                            
  |=====================                                                 |  30%
  |                                                                            
  |======================                                                |  31%
  |                                                                            
  |=======================                                               |  32%
  |                                                                            
  |========================                                              |  34%
  |                                                                            
  |========================                                              |  35%
  |                                                                            
  |=========================                                             |  36%
  |                                                                            
  |==========================                                            |  38%
  |                                                                            
  |===========================                                           |  39%
  |                                                                            
  |============================                                          |  40%
  |                                                                            
  |=============================                                         |  41%
  |                                                                            
  |==============================                                        |  42%
  |                                                                            
  |===============================                                       |  44%
  |                                                                            
  |================================                                      |  45%
  |                                                                            
  |================================                                      |  46%
  |                                                                            
  |=================================                                     |  48%
  |                                                                            
  |==================================                                    |  49%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |====================================                                  |  51%
  |                                                                            
  |=====================================                                 |  52%
  |                                                                            
  |======================================                                |  54%
  |                                                                            
  |======================================                                |  55%
  |                                                                            
  |=======================================                               |  56%
  |                                                                            
  |========================================                              |  58%
  |                                                                            
  |=========================================                             |  59%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |===========================================                           |  61%
  |                                                                            
  |============================================                          |  62%
  |                                                                            
  |=============================================                         |  64%
  |                                                                            
  |==============================================                        |  65%
  |                                                                            
  |==============================================                        |  66%
  |                                                                            
  |===============================================                       |  68%
  |                                                                            
  |================================================                      |  69%
  |                                                                            
  |=================================================                     |  70%
  |                                                                            
  |==================================================                    |  71%
  |                                                                            
  |===================================================                   |  72%
  |                                                                            
  |====================================================                  |  74%
  |                                                                            
  |====================================================                  |  75%
  |                                                                            
  |=====================================================                 |  76%
  |                                                                            
  |======================================================                |  78%
  |                                                                            
  |=======================================================               |  79%
  |                                                                            
  |========================================================              |  80%
  |                                                                            
  |=========================================================             |  81%
  |                                                                            
  |==========================================================            |  82%
  |                                                                            
  |===========================================================           |  84%
  |                                                                            
  |============================================================          |  85%
  |                                                                            
  |============================================================          |  86%
  |                                                                            
  |=============================================================         |  88%
  |                                                                            
  |==============================================================        |  89%
  |                                                                            
  |===============================================================       |  90%
  |                                                                            
  |================================================================      |  91%
  |                                                                            
  |=================================================================     |  92%
  |                                                                            
  |==================================================================    |  94%
  |                                                                            
  |==================================================================    |  95%
  |                                                                            
  |===================================================================   |  96%
  |                                                                            
  |====================================================================  |  98%
  |                                                                            
  |===================================================================== |  99%
  |                                                                            
  |======================================================================| 100%

Mean-shift algorithm ran successfully.

Finding clusters...

The algorithm found 2 clusters.

$components
                mode1    mode2
Sepal.Length 5.003879 6.327882
Sepal.Width  3.376633 2.999923

$labels
 [1] 1 2 2 1 2 2 1 2 2 2 2 1 2 2 2 2 2 1 2 1 2 1 2 1 1 2 2 2 2 1 1 2 2 2 2 2 2 1
[39] 2 2 2 2 1 1 2 2 2 1 2 2 1 2 2 2 1 2 2 2 2 2 2 2 2 1 2 1 2 1 1 1 2 2 1 1 1 2
[77] 1 2 1 1


Running mean-shift algorithm...

Using 2 cores...

Mean-shift algorithm ran successfully.

Finding clusters...

The algorithm found 3 clusters.

MeanShift documentation built on May 29, 2017, 2:27 p.m.