jarvisPatrick | R Documentation |
Function to perform Jarvis-Patrick clustering. The algorithm requires a
nearest neighbor table, which consists of neighbors for each item
in the dataset. This information is then used to join items into clusters
with the following requirements:
(a) they are contained in each other's neighbor list
(b) they share at least 'k' nearest neighbors
The nearest neighbor table can be computed with nearestNeighbors
.
For standard Jarvis-Patrick clustering, this function takes the number of neighbors to keep for each item.
It also has the option of passing a cutoff similarity value instead of the number of neighbors. In this mode, all
neighbors which meet the cutoff criteria will be included in the table.
This is a setting that is not part of the original Jarvis-Patrick algorithm. It
allows to generate tighter clusters and to minimize some limitations of this method, such as joining completely
unrelated items when clustering small data sets. Other extensions, such as the linkage
parameter, can also
help improve the clustering quality.
jarvisPatrick(nnm, k, mode="a1a2b", linkage="single")
nnm |
A nearest neighbor table, as produced by |
k |
Minimum number of nearest neighbors two rows (items) in the nearest neighbor table need to have in common to join them into the same cluster. |
mode |
If |
linkage |
Can be one of "single", "average", or "complete", for single linkage, average linkage and complete linkage merge requirements, respectively. In the context of Jarvis-Patrick, average linkage means that at least half of the pairs between the clusters under consideration must pass the merge requirement. Similarly, for complete linkage, all pairs must pass the merge requirement. Single linkage is the normal case for Jarvis-Patrick and just means that at least one pair must meet the requirement. |
...
Depending on the setting under the type
argument, the function returns the clustering result in a
named vector
or a nearest neighbor table as matrix
.
...
Thomas Girke
Jarvis RA, Patrick EA (1973) Clustering Using a Similarity Measure Based on Shared Near Neighbors. IEEE Transactions on Computers, C22, 1025-1034. URLs: http://davide.eynard.it/teaching/2012_PAMI/JP.pdf, http://www.btluke.com/jpclust.html, http://www.daylight.com/dayhtml/doc/cluster/index.pdf
Functions: cmp.cluster
trimNeighbors
nearestNeighbors
## Load/create sample APset and FPset
data(apset)
fpset <- desc2fp(apset)
## Standard Jarvis-Patrick clustering on APset/FPset objects
jarvisPatrick(nearestNeighbors(apset,numNbrs=6), k=5, mode="a1a2b")
jarvisPatrick(nearestNeighbors(fpset,numNbrs=6), k=5, mode="a1a2b")
## Jarvis-Patrick clustering only with requirement (b)
jarvisPatrick(nearestNeighbors(fpset,numNbrs=6), k=5, mode="b")
## Modified Jarvis-Patrick clustering with minimum similarity 'cutoff'
## value (here Tanimoto coefficient)
jarvisPatrick(nearestNeighbors(fpset,cutoff=0.6, method="Tanimoto"), k=2 )
## Output nearest neighbor table (matrix)
nnm <- nearestNeighbors(fpset,numNbrs=6)
## Perform clustering on precomputed nearest neighbor table
jarvisPatrick(nnm, k=5)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.