apcluster | R Documentation |
Runs affinity propagation clustering
## S4 method for signature 'matrix,missing'
apcluster(s, x, p=NA, q=NA, maxits=1000,
convits=100, lam=0.9, includeSim=FALSE, details=FALSE,
nonoise=FALSE, seed=NA)
## S4 method for signature 'dgTMatrix,missing'
apcluster(s, x, p=NA, q=NA, maxits=1000,
convits=100, lam=0.9, includeSim=FALSE, details=FALSE,
nonoise=FALSE, seed=NA)
## S4 method for signature 'sparseMatrix,missing'
apcluster(s, x, ...)
## S4 method for signature 'Matrix,missing'
apcluster(s, x, ...)
## S4 method for signature 'character,ANY'
apcluster(s, x, p=NA, q=NA, maxits=1000,
convits=100, lam=0.9, includeSim=TRUE, details=FALSE,
nonoise=FALSE, seed=NA, ...)
## S4 method for signature 'function,ANY'
apcluster(s, x, p=NA, q=NA, maxits=1000,
convits=100, lam=0.9, includeSim=TRUE, details=FALSE,
nonoise=FALSE, seed=NA, ...)
s |
an |
x |
input data to be clustered; if |
p |
input preference; can be a vector that specifies
individual preferences for each data point. If scalar,
the same value is used for all data points. If |
q |
if |
maxits |
maximal number of iterations that should be executed |
convits |
the algorithm terminates if the exemplars have not
changed for |
lam |
damping factor; should be a value in the range [0.5, 1); higher values correspond to heavy damping which may be needed if oscillations occur |
includeSim |
if |
details |
if |
nonoise |
|
seed |
for reproducibility, the seed of the random number
generator can be set to a fixed value before
adding noise (see above), if |
... |
for the methods with signatures |
Affinity Propagation clusters data using a set of real-valued pairwise data point similarities as input. Each cluster is represented by a cluster center data point (the so-called exemplar). The method is iterative and searches for clusters maximizing an objective function called net similarity.
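To make the iteration and the role of the damping factor concrete, the following is a minimal base R sketch of the message-passing updates described by Frey and Dueck (2007). It is not the package's implementation: the name ap_sketch, its arguments, the fixed iteration count, the absence of noise and of a convergence test, and the default preference (median of the off-diagonal similarities) are all assumptions made for illustration. Net similarity is computed here as the sum of each point's similarity to its exemplar plus the preferences of the exemplars.

```r
## Minimal affinity propagation sketch in base R (illustrative only).
## Assumptions: fixed iteration count, no noise, no convergence test;
## ap_sketch and its arguments are hypothetical names.
ap_sketch <- function(s, p = NA, lam = 0.9, iters = 200) {
  n <- nrow(s)
  if (is.na(p)) p <- median(s[row(s) != col(s)])  # assumed default
  diag(s) <- p                        # preferences sit on the diagonal
  r <- matrix(0, n, n)                # responsibilities
  a <- matrix(0, n, n)                # availabilities
  rows <- seq_len(n)
  for (it in seq_len(iters)) {
    ## responsibilities: s(i,k) minus the best competing a(i,k') + s(i,k')
    av <- a + s
    top <- cbind(rows, max.col(av, ties.method = "first"))
    m1 <- av[top]                     # row maxima
    av[top] <- -Inf
    m2 <- apply(av, 1, max)           # second-best value per row
    rnew <- s - m1
    rnew[top] <- s[top] - m2
    r <- lam * r + (1 - lam) * rnew   # damped update
    ## availabilities: positive responsibilities collected per candidate
    rp <- pmax(r, 0)
    diag(rp) <- diag(r)
    anew <- matrix(colSums(rp), n, n, byrow = TRUE) - rp
    da <- diag(anew)                  # self-availabilities are not clipped
    anew <- pmin(anew, 0)
    diag(anew) <- da
    a <- lam * a + (1 - lam) * anew   # damped update
  }
  ex <- which(diag(r) + diag(a) > 0)  # points that chose themselves
  if (length(ex) == 0)
    return(list(exemplars = integer(0), idx = rep(NA_integer_, n),
                netsim = NA_real_))
  idx <- ex[max.col(s[, ex, drop = FALSE], ties.method = "first")]
  idx[ex] <- ex                       # exemplars represent themselves
  list(exemplars = ex, idx = idx, netsim = sum(s[cbind(rows, idx)]))
}

## usage on two synthetic Gaussian clouds
set.seed(1)
x <- rbind(cbind(rnorm(20, 0.2, 0.05), rnorm(20, 0.8, 0.05)),
           cbind(rnorm(20, 0.7, 0.05), rnorm(20, 0.3, 0.05)))
s <- -as.matrix(dist(x))^2            # negative squared Euclidean distance
res <- ap_sketch(s)
res$exemplars
res$netsim
```

Values of lam closer to 1 make each update move more slowly toward the newly computed messages, which is why heavier damping can suppress oscillations at the cost of more iterations.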
When called with a similarity matrix as input (which may also be a
sparse matrix according to the Matrix package), the function performs
AP clustering. When called with the name of a package-provided
similarity function or a user-provided similarity function object and
input data, the function first computes the similarity matrix before
performing AP clustering. The similarity matrix is returned for later
use as part of the APResult object depending on whether includeSim was
set to TRUE (see argument description above).
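The two calling modes can be compared side by side. The sketch below assumes the apcluster package is installed; the data and the object names res_fun and res_mat are made up for illustration. It relies on the defaults shown in the usage section: includeSim=TRUE when a similarity function is passed, includeSim=FALSE when a matrix is passed.

```r
library(apcluster)

## synthetic data: two Gaussian clouds (illustrative only)
set.seed(42)
x <- rbind(cbind(rnorm(30, 0.2, 0.05), rnorm(30, 0.8, 0.06)),
           cbind(rnorm(30, 0.7, 0.08), rnorm(30, 0.3, 0.05)))

## mode 1: similarity function plus data; the similarity matrix is
## computed internally and, by default, stored in the result
res_fun <- apcluster(negDistMat(r=2), x)
dim(res_fun@sim)

## mode 2: precomputed similarity matrix; request the matrix explicitly
## if it should be kept in the result
s <- negDistMat(x, r=2)
res_mat <- apcluster(s, includeSim=TRUE)
dim(res_mat@sim)
```

Keeping the similarity matrix in the result is convenient for later calls such as heatmap(), but it increases the size of the returned object.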
Apart from minor adaptations and optimizations, the AP clustering
functionality of the function apcluster is largely analogous to Frey
and Dueck's Matlab code
(see https://psi.toronto.edu/research/affinity-propagation-clustering-by-message-passing/).
The new argument q allows better control over the number of clusters
without knowing the distribution of similarity values. A meaningful
range for the parameter p can be determined using the function
preferenceRange. Alternatively, a certain fixed number of clusters may
be desirable. For this purpose, the function apclusterK is available.
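The three ways of steering the number of clusters mentioned above can be sketched as follows. This assumes the apcluster package is installed; the data, the choice of the midpoint of the preference range, and the target of two clusters are arbitrary illustrations, and apclusterK is not guaranteed to reach the requested number in every case.

```r
library(apcluster)

## synthetic data: two Gaussian clouds (illustrative only)
set.seed(7)
x <- rbind(cbind(rnorm(30, 0.2, 0.05), rnorm(30, 0.8, 0.06)),
           cbind(rnorm(30, 0.7, 0.08), rnorm(30, 0.3, 0.05)))
s <- negDistMat(x, r=2)

## option 1: set the preference via a quantile of the similarities
res_q <- apcluster(s, q=0.1)

## option 2: determine a meaningful range for p and pick a value from it
pr <- preferenceRange(s)          # lower and upper bound for p
res_p <- apcluster(s, p=mean(pr))

## option 3: search for a preference that yields a fixed number of clusters
res_k <- apclusterK(s, K=2, verbose=FALSE)

## compare the number of clusters obtained
c(q=length(res_q@exemplars), p=length(res_p@exemplars),
  K=length(res_k@exemplars))
```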
Upon successful completion, the function returns an APResult object.
Ulrich Bodenhofer, Andreas Kothmeier, Johannes Palme, and Chrats Melkonian
https://github.com/UBod/apcluster
Frey, B. J. and Dueck, D. (2007) Clustering by passing messages between data points. Science 315, 972-976. DOI: 10.1126/science.1136800.
Bodenhofer, U., Kothmeier, A., and Hochreiter, S. (2011) APCluster: an R package for affinity propagation clustering. Bioinformatics 27, 2463-2464. DOI: 10.1093/bioinformatics/btr406.
APResult, show-methods, plot-methods, labels-methods, preferenceRange, apclusterL-methods, apclusterK
## create two Gaussian clouds
cl1 <- cbind(rnorm(100, 0.2, 0.05), rnorm(100, 0.8, 0.06))
cl2 <- cbind(rnorm(50, 0.7, 0.08), rnorm(50, 0.3, 0.05))
x <- rbind(cl1, cl2)
## compute similarity matrix and run affinity propagation
## (p defaults to median of similarity)
apres <- apcluster(negDistMat(r=2), x, details=TRUE)
## show details of clustering results
show(apres)
## plot clustering result
plot(apres, x)
## plot heatmap
heatmap(apres)
## run affinity propagation with the preference set to the 10% quantile
## of similarities; this should lead to a smaller number of clusters;
## reuse the similarity matrix from the previous run
apres <- apcluster(s=apres@sim, q=0.1)
show(apres)
plot(apres, x)
## now try the same with RBF kernel
sim <- expSimMat(x, r=2)
apres <- apcluster(s=sim, q=0.2)
show(apres)
plot(apres, x)
## create sparse similarity matrix
cl1 <- cbind(rnorm(20, 0.2, 0.05), rnorm(20, 0.8, 0.06))
cl2 <- cbind(rnorm(20, 0.7, 0.08), rnorm(20, 0.3, 0.05))
x <- rbind(cl1, cl2)
sim <- negDistMat(x, r=2)
ssim <- as.SparseSimilarityMatrix(sim, lower=-0.2)
## run apcluster() on the sparse similarity matrix
apres <- apcluster(ssim, q=0)
apres