supc1: Self-Updating Process Clustering
In supc: The Self-Updating Process Clustering Algorithms


The self-updating process (SUP) is a distance-based clustering method. The idea is similar to gravitational attraction: every sample is pulled toward the others. The algorithm mimics this attraction iteratively, so that the samples gradually merge into clusters in the sample space. All samples keep moving from one iteration to the next until the system becomes stable.
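The iteration described above can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: `sup_sketch` is a hypothetical helper, and it assumes a truncated exponential influence function (weight `exp(-d / temp)` within radius `r`, zero beyond it) with a fixed temperature `temp`. See the reference below for the exact definition used by SUP.

```r
# Hypothetical helper for illustration; not part of the supc package.
# Assumes a truncated exponential influence function and a fixed temperature.
sup_sketch <- function(x, r, temp, tolerance = 1e-4, max_iter = 1000) {
  x <- as.matrix(x)
  for (iter in seq_len(max_iter)) {
    d <- as.matrix(dist(x))          # pairwise Euclidean distances
    w <- exp(-d / temp) * (d <= r)   # influence is zero beyond radius r
    x_new <- (w %*% x) / rowSums(w)  # move each point to the weighted mean of its neighbours
    moved <- max(abs(x_new - x))     # largest coordinate change in this step
    x <- x_new
    if (moved < tolerance) break     # stop once the system is stable
  }
  list(result = x, iteration = iter)
}

# Two well-separated groups of 10 points each
set.seed(42)
pts <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.1), ncol = 2),
  matrix(rnorm(20, mean = 5, sd = 0.1), ncol = 2)
)
fit <- sup_sketch(pts, r = 1, temp = 0.2)
```

After convergence, the points within each group sit at essentially the same position, while the two groups stay apart because neither lies within radius `r` of the other.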

supc1(
  x,
  r = NULL,
  rp = NULL,
  t = c("static", "dynamic"),
  tolerance = 1e-04,
  cluster.tolerance = 10 * tolerance,
  drop = TRUE,
  implementation = c("cpp", "R", "cpp2"),
  sort = TRUE,
  verbose = (nrow(x) > 10000)
)

`x`	data matrix. Each row is an instance of the data.
`r`	numeric vector or `NULL`. The parameter r of the self-updating process.
`rp`	numeric vector or `NULL`. If `r` is `NULL`, then `rp` will be used. The corresponding `r` is the `rp`-percentile of the pairwise distances of the data. If both `r` and `rp` are `NULL`, then the default value is `rp = c(0.0005, 0.001, 0.01, 0.1, 0.3)`.
`t`	either a numeric vector, a list of functions, or one of `"static"` or `"dynamic"`. The parameter T(t) of the self-updating process.
`tolerance`	numeric value. The threshold of convergence.
`cluster.tolerance`	numeric value. After the iterations, two points whose distance is smaller than `cluster.tolerance` are assigned to the same cluster.
`drop`	logical value. Whether to drop the list structure if its length is 1.
`implementation`	either `"R"`, `"cpp"`, or `"cpp2"`. The engine used to compute the result. `"cpp2"` computes the distances in parallel in C++ with OpenMP (not supported on OS X) and uses early stopping to speed up the calculation.
`sort`	logical value. Whether to sort the cluster id by size.
`verbose`	logical value. Whether to show the iteration history.
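To make the relation between `rp` and `r` concrete, the conversion can be approximated with `quantile()` on the pairwise distances. This is only a sketch: the data `x` here is made up, and the exact quantile definition used inside the package is an assumption.

```r
# Illustrative data; any numeric matrix with one instance per row would do.
set.seed(1)
x <- matrix(rnorm(200), ncol = 2)

# Default rp values listed above
rp <- c(0.0005, 0.001, 0.01, 0.1, 0.3)

# Each r is taken as the rp-quantile of all pairwise Euclidean distances.
# (Assumption: the package may use a different quantile type.)
r <- quantile(dist(x), probs = rp, names = FALSE)
```

Larger values of `rp` give larger radii, so each point is influenced by more of its neighbours.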

Please check the vignettes via `vignette("supc", package = "supc")` for details.

supc1 returns a list of objects of class "supc".

Each "supc" object contains the following elements:

`x`	The input matrix.
`d0`	The pairwise distance matrix of `x` or `NULL`.
`r`	The value of r of the clustering.
`t`	The function T(t) of the clustering.
`cluster`	The cluster id of each instance.
`centers`	The center of each cluster.
`size`	The size of each cluster.
`iteration`	The number of iterations before convergence.
`result`	The position of data after iterations.
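The elements `cluster`, `centers`, and `size` can be thought of as derived from `result`. The sketch below shows one plausible way to do this in base R; `label_clusters` is a hypothetical helper, and the use of a single-linkage cut at `cluster.tolerance` is an assumption rather than the package's documented procedure.

```r
# Hypothetical helper for illustration; not part of the supc package.
label_clusters <- function(result, cluster.tolerance = 1e-3) {
  # Single-linkage cut: points linked by gaps of at most cluster.tolerance share an id
  id <- cutree(hclust(dist(result), method = "single"), h = cluster.tolerance)
  # Relabel so that cluster 1 is the largest (compare the sort argument)
  tab <- sort(table(id), decreasing = TRUE)
  cluster <- match(id, as.integer(names(tab)))
  size <- as.vector(table(cluster))
  # Center of each cluster: per-cluster column sums divided by cluster size
  centers <- rowsum(result, cluster) / size
  list(cluster = cluster, centers = centers, size = size)
}

# Converged positions: three points at (0, 0), five at (5, 5), one at (10, 10)
result <- rbind(matrix(0, 3, 2), matrix(5, 5, 2), matrix(10, 1, 2))
out <- label_clusters(result)
```

Here the five points at (5, 5) receive cluster id 1 because they form the largest cluster.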

Shiu, Shang-Ying, and Ting-Li Chen. 2016. "On the Strengths of the Self-Updating Process Clustering Algorithm." Journal of Statistical Computation and Simulation 86 (5): 1010–1031. doi: 10.1080/00949655.2015.1049605.

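set.seed(1)
# Simulate 9 Gaussian clusters (5 points each) plus 20 scattered noise points
X <- local({
  # Cluster centers
  mu <- list(
    x = c(0, 2, 1, 6, 8, 7, 3, 5, 4),
    y = c(0, 0, 1, 0, 0, 1, 3, 3, 4)
  )
  # Five draws around each of the nine centers (sd = 1/5)
  X <- lapply(1:5, function(i) {
    cbind(rnorm(9, mu$x, 1/5), rnorm(9, mu$y, 1/5))
  })
  X <- do.call(rbind, X)
  n <- nrow(X)
  # Reserve 20 rows for the noise points
  X <- rbind(X, matrix(0, 20, 2))
  k <- 1
  while (k <= 20) {
    # Candidate noise point, drawn uniformly
    tmp <- c(13 * runif(1) - 2.5, 8 * runif(1) - 2.5)
    y1 <- mu$x - tmp[1]
    y2 <- mu$y - tmp[2]
    y <- sqrt(y1^2 + y2^2)
    # Keep the candidate only if it is more than 2 units from every center
    if (min(y) > 2) {
      X[k + n, ] <- tmp
      k <- k + 1
    }
  }
  X
})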
# Cluster with three values of r and the built-in dynamic temperature
X.supcs <- supc1(X, r = c(0.9, 1.7, 2.5), t = "dynamic", implementation = "R")
X.supcs$cluster
plot(X.supcs[[1]], type = "heatmap", major.size = 2)
plot(X.supcs[[2]], type = "heatmap", col = cm.colors(24), major.size = 5)

# Cluster again with custom temperature functions T(t), passed as a list
X.supcs <- supc1(X, r = c(1.7, 2.5), t = list(
  function(t) {1.7 / 20 + exp(t) * (1.7 / 50)},
  function(t) {exp(t)}
), implementation = "R")
plot(X.supcs[[1]], type = "heatmap", major.size = 2)
plot(X.supcs[[2]], type = "heatmap", col = cm.colors(24), major.size = 5)