View source: R/clustering_nmshift.R
| riem.nmshift | R Documentation |
Given N observations X_1, X_2, …, X_N \in \mathcal{M}, perform clustering of the data based on the nonlinear mean shift algorithm. A Gaussian kernel with bandwidth h is used,
G(x_i, x_j) \propto \exp \left( - \frac{\rho^2 (x_i,x_j)}{h^2} \right)
where \rho(x,y) is the geodesic distance between two points x,y\in\mathcal{M}.
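As an illustration of the kernel above, the following is a minimal base-R sketch for the unit sphere, where the geodesic distance between unit vectors is the angle between them. The helpers `sphere_dist` and `gauss_kernel` are hypothetical names introduced here; they are not functions of the package.

```r
# Geodesic (great-circle) distance between two unit vectors.
# Hypothetical helper for illustration only.
sphere_dist <- function(x, y) {
  # clamp the inner product to [-1, 1] to guard against rounding error
  acos(max(-1, min(1, sum(x * y))))
}

# Unnormalized Gaussian kernel with bandwidth h, following the formula above.
gauss_kernel <- function(x, y, h = 1) {
  exp(-sphere_dist(x, y)^2 / h^2)
}

x <- c(1, 0, 0)
y <- c(0, 1, 0)
d_xy <- sphere_dist(x, y)            # angle between orthogonal vectors
k_xy <- gauss_kernel(x, y, h = 1)    # weight at that distance
k_xx <- gauss_kernel(x, x, h = 1)    # weight at zero distance
```

A larger h makes `gauss_kernel` decay more slowly with distance, so far-away points receive more weight.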
Numerically, limiting points that should collapse into the same mode do not
coincide exactly. For this reason, the maxk parameter is required: the
number of clusters is chosen among candidates up to maxk by the k-medoids
clustering algorithm in conjunction with the silhouette criterion.
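The selection step described above can be sketched as follows. This is an assumption-laden illustration, not the package's implementation: it uses `pam()` from the recommended `cluster` package for k-medoids, scores each candidate k by its average silhouette width, and starts the search at k = 2 because the silhouette is undefined for a single cluster. The helper `select_k` is a hypothetical name.

```r
# Hypothetical helper: pick the number of clusters by average silhouette
# width, given a symmetric matrix D of pairwise distances between modes.
select_k <- function(D, maxk = 5) {
  dobj <- stats::as.dist(D)
  n    <- nrow(D)
  ks   <- 2:min(maxk, n - 1)            # silhouette needs 1 < k < n
  widths <- sapply(ks, function(k) {
    cluster::pam(dobj, k = k, diss = TRUE)$silinfo$avg.width
  })
  best <- ks[which.max(widths)]
  fit  <- cluster::pam(dobj, k = best, diss = TRUE)
  list(k = best, cluster = fit$clustering)
}

# Toy input: three well-separated groups on the real line
pts <- c(0, 0.1, 0.2, 5, 5.1, 5.2, 10, 10.1, 10.2)
D   <- as.matrix(stats::dist(pts))
sel <- select_k(D, maxk = 5)
sel$k
sel$cluster
```

Working on a distance matrix rather than on coordinates keeps the sketch independent of the manifold: only pairwise distances between the limiting points are needed.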
riem.nmshift(riemobj, h = 1, maxk = 5, maxiter = 50, eps = 1e-05)
riemobj |
a S3 |
h |
bandwidth parameter. The larger h is, the more blurring is applied. |
maxk |
maximum number of clusters to consider when determining the optimal number of clusters. |
maxiter |
maximum number of iterations to be run. |
eps |
tolerance level for stopping criterion. |
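To make the roles of h, maxiter and eps concrete, here is a minimal base-R sketch of shifting a single point on the unit sphere. It is a sketch under stated assumptions, not the package's code: the helpers `sphere_log`, `sphere_exp` and `shift_point` are hypothetical, the stopping rule (length of the tangent step below eps) is an assumption, and antipodal points are not handled.

```r
# Hypothetical helpers for the unit sphere; not part of the package.
sphere_log <- function(y, x) {          # log map of x at base point y
  ct    <- max(-1, min(1, sum(x * y)))
  theta <- acos(ct)
  if (theta < 1e-12) return(rep(0, length(y)))
  (theta / sin(theta)) * (x - ct * y)
}
sphere_exp <- function(y, v) {          # exp map of tangent vector v at y
  nv <- sqrt(sum(v^2))
  if (nv < 1e-12) return(y)
  cos(nv) * y + sin(nv) * v / nv
}

# Move y towards a mode: kernel-weighted mean of tangent vectors, then exp map.
shift_point <- function(y, data, h = 1, maxiter = 50, eps = 1e-5) {
  for (it in seq_len(maxiter)) {
    logs <- t(sapply(data, function(x) sphere_log(y, x)))  # N x p tangent vectors
    w    <- exp(-rowSums(logs^2) / h^2)   # |log_y(x)| equals the geodesic distance
    step <- colSums(w * logs) / sum(w)    # weighted mean in the tangent space
    y    <- sphere_exp(y, step)
    y    <- y / sqrt(sum(y^2))            # renormalize against rounding drift
    if (sqrt(sum(step^2)) < eps) break    # assumed stopping rule
  }
  y
}

# Four points placed symmetrically around the north pole
raw  <- list(c(0.1, 0, 1), c(-0.1, 0, 1), c(0, 0.1, 1), c(0, -0.1, 1))
data <- lapply(raw, function(v) v / sqrt(sum(v^2)))
mode_est <- shift_point(data[[1]], data, h = 1)
```

In this toy configuration the symmetry of the data places the mode at the north pole, so `mode_est` should end up close to (0, 0, 1).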
a named list containing
distance |
an (N\times N) matrix of distances between the modes corresponding to each data point. |
cluster |
a length-N vector of class labels. |
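Subbarao R, Meer P (2009). "Nonlinear Mean Shift over Riemannian Manifolds." International Journal of Computer Vision, 84(1), 1-20.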
#-------------------------------------------------------------------
# Example on Sphere : a dataset with three types
#
# class 1 : 10 perturbed data points near (1,0,0) on S^2 in R^3
# class 2 : 10 perturbed data points near (0,1,0) on S^2 in R^3
# class 3 : 10 perturbed data points near (0,0,1) on S^2 in R^3
#-------------------------------------------------------------------
## GENERATE DATA
set.seed(496)
ndata = 10
mydata = list()
for (i in 1:ndata){
  tgt = c(1, stats::rnorm(2, sd=0.1))
  mydata[[i]] = tgt/sqrt(sum(tgt^2))
}
for (i in (ndata+1):(2*ndata)){
  tgt = c(rnorm(1,sd=0.1),1,rnorm(1,sd=0.1))
  mydata[[i]] = tgt/sqrt(sum(tgt^2))
}
for (i in ((2*ndata)+1):(3*ndata)){
  tgt = c(stats::rnorm(2, sd=0.1), 1)
  mydata[[i]] = tgt/sqrt(sum(tgt^2))
}
myriem = wrap.sphere(mydata)
mylabs = rep(c(1,2,3), each=ndata)
## RUN NONLINEAR MEANSHIFT FOR DIFFERENT 'h' VALUES
run1 = riem.nmshift(myriem, maxk=10, h=0.1)
run2 = riem.nmshift(myriem, maxk=10, h=1)
run3 = riem.nmshift(myriem, maxk=10, h=10)
## MDS FOR VISUALIZATION
mds2d = riem.mds(myriem, ndim=2)$embed
## VISUALIZE
opar <- par(no.readonly=TRUE)
par(mfrow=c(2,3), pty="s")
plot(mds2d, pch=19, main="label : h=0.1", col=run1$cluster)
plot(mds2d, pch=19, main="label : h=1", col=run2$cluster)
plot(mds2d, pch=19, main="label : h=10", col=run3$cluster)
image(run1$distance[,30:1], axes=FALSE, main="distance : h=0.1")
image(run2$distance[,30:1], axes=FALSE, main="distance : h=1")
image(run3$distance[,30:1], axes=FALSE, main="distance : h=10")
par(opar)