View source: R/blurredmeanshift.R
This function implements the blurring mean shift algorithm, which approximates the standard mean shift algorithm. Because it recursively updates the entire sample at each iteration, the blurring version of the mean shift algorithm is often faster than the standard version (especially if the standard mean shift algorithm is run using a single core).
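The blurring update described above can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the Epanechnikov profile, the stopping rule, and the function names `blur_step`/`blur_ms` below are simplifying assumptions made for exposition.

```r
## One blurring iteration: the WHOLE sample is replaced by its
## kernel-weighted local means (unlike standard mean shift, which
## shifts query points against a fixed sample).
blur_step <- function( X, h ) {
  ## X is a p x n matrix with data points stored as columns,
  ## matching the convention used by bmsClustering
  n <- ncol( X )
  X.new <- X
  for( i in 1:n ) {
    ## squared distances from point i to all points, scaled by h
    d2 <- colSums( ( X - X[ , i ] )^2 ) / h^2
    ## Epanechnikov-type weights: zero outside the bandwidth
    w <- pmax( 1 - d2, 0 )
    X.new[ , i ] <- ( X %*% w ) / sum( w )
  }
  X.new
}

## Iterate until the largest coordinate move falls below tol.stop
blur_ms <- function( X, h, tol.stop = 1e-6, max.iter = 100 ) {
  for( it in 1:max.iter ) {
    X.new <- blur_step( X, h )
    if( max( abs( X.new - X ) ) < tol.stop ) break
    X <- X.new
  }
  X
}
```

With a compactly supported weight like the one above, points farther than `h` apart never influence each other, so well-separated groups collapse onto separate modes.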
Usage

bmsClustering(X, h = NULL, kernel = "epanechnikovKernel",
              tol.stop = 1e-06, max.iter = 100, tol.epsilon = 0.001)
Arguments

X: a p × n matrix containing n ≥ 1 p-dimensional numeric vectors stored as columns. Each column of X corresponds to one data point.

h: a strictly positive bandwidth parameter.

kernel: a kernel function (as a character string). The supported kernels include "epanechnikovKernel", "gaussianKernel", and "exponentialKernel" (see Details). The Epanechnikov kernel is recommended when using the blurring version of the mean shift algorithm.

tol.stop: a strictly positive tolerance parameter. The mean shift algorithm stops when its update generates a step of length smaller than tol.stop.

max.iter: a strictly positive integer specifying the maximum number of iterations before the algorithm is forced to stop.

tol.epsilon: a strictly positive tolerance parameter. Points that are less than tol.epsilon apart are grouped into the same cluster.
Details

It is generally recommended to standardize X so that each variable has unit variance prior to running the algorithm on the data.

Roughly speaking, larger values of h produce a coarser clustering (i.e. few and large clusters); for sufficiently large values of h, the algorithm produces a unique cluster containing all the data points. Smaller values of h produce a finer clustering (i.e. many small clusters); for sufficiently small values of h, each cluster identified by the algorithm contains exactly one data point.

If h is not specified in the function call, then h is by default set to the 30th percentile of the empirical distribution of distances between the columns of X, i.e. h = quantile( dist( t( X ) ), 0.3 ).
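The default bandwidth rule above can be reproduced directly in a couple of lines (a toy illustration; the simulated matrix X below is made up for the example):

```r
## Default bandwidth: the 30th percentile of the pairwise distances
## between the columns (data points) of X.
set.seed( 1 )
X <- matrix( rnorm( 2 * 50 ), nrow = 2 )   # 50 points in R^2, by column
h <- quantile( dist( t( X ) ), 0.3 )       # the documented default for h
```

Note that dist() operates on rows, which is why X is transposed before computing the pairwise distances.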
In their implementation, gaussianKernel and exponentialKernel are rescaled to assign a probability mass of at least 0.99 to the unit interval [0,1]. This ensures that all the kernels are roughly on the same scale.
When using the blurring version of the mean shift algorithm, it is generally recommended to use a compactly supported kernel. In particular, the algorithm is guaranteed to converge in finitely many iterations with the Epanechnikov kernel.
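The difference in support can be seen from the kernel profiles themselves. The functional forms below are standard textbook profiles, assumed here for illustration; they are not necessarily scaled exactly as in the package:

```r
## Epanechnikov profile: exactly zero outside the unit interval, so
## each blurring update only averages over points within the bandwidth.
epanechnikov <- function( u ) pmax( 1 - u^2, 0 )

## Gaussian profile: strictly positive everywhere, so every point
## always contributes some weight and exact convergence in finitely
## many iterations is not guaranteed.
gaussian <- function( u ) exp( -u^2 / 2 )
```

For example, epanechnikov( 1.5 ) is exactly 0, whereas gaussian( 1.5 ) is small but strictly positive.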
Value

The function invisibly returns a list with the following components:

components: a matrix containing the modes/cluster representatives, stored as columns.

labels: an integer vector of cluster labels.
Author(s)

Mattia Ciollaro and Daren Wang.
References

Carreira-Perpinan, M. A. (2015) A review of mean-shift algorithms for clustering. arXiv preprint, http://arxiv.org/abs/1503.00687
Examples

## an example using the iris dataset
## help( iris )

## prepare data matrix
iris.data <- t( iris[ , c( "Sepal.Length", "Sepal.Width" ) ] )

## run blurring mean shift algorithm
clustering <- bmsClustering( iris.data )
print( clustering )

## plot the clusters
## Not run:
plot( iris.data[1,], iris.data[2,], col=clustering$labels+2, cex=0.8,
      pch=16, xlab="Sepal.Length", ylab="Sepal.Width" )
points( clustering$components[1,], clustering$components[2,],
        col=2+( 1:ncol( clustering$components ) ), cex=1.8, pch=16 )
## End(Not run)
Example output:

Loading required package: parallel
Loading required package: wavethresh
Loading required package: MASS
WaveThresh: R wavelet software, release 4.6.8, installed
Copyright Guy Nason and others 1993-2016
Note: nlevels has been renamed to nlevelsWT
Running blurring mean-shift algorithm...
Blurring mean-shift algorithm ran successfully.
Finding clusters...
The algorithm found 3 clusters.
$components
mode1 mode2 mode3
Sepal.Length 5.022179 6.201654 7.743231
Sepal.Width 3.387924 2.889108 3.780952
$labels
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[38] 1 1 1 1 2 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2
[112] 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[149] 2 2