View source: R/rf.unsupervised.R
rf.unsupervised | R Documentation |
Performs unsupervised Random Forests clustering based on the dissimilarity derived from the proximity matrix, and optionally returns the neighbor distances.
rf.unsupervised(
  x,
  n = 2,
  proximity = FALSE,
  silhouettes = FALSE,
  clara = FALSE,
  ...
)
x | A matrix or data.frame object to cluster |
n | Number of clusters |
proximity | (FALSE/TRUE) Return matrix of neighbor distances based on proximity |
silhouettes | (FALSE/TRUE) Return adjusted silhouette values |
clara | (FALSE/TRUE) Use clara partitioning, intended for large data |
... | Additional arguments passed to randomForest |
Clusters (k) are derived from the random forests proximity matrix, which is treated as a matrix of dissimilarity (neighbor) distances. The clusters are identified using Partitioning Around Medoids (PAM); observations with negative silhouette values are reassigned to their nearest neighboring cluster.
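The procedure described above can be sketched with randomForest and cluster directly. This is a minimal illustration under stated assumptions, not the function's actual source: it assumes the dissimilarity is taken as 1 - proximity (the function's own scaling may differ), and the object names (rf, d, fit, sil, k.adj) are hypothetical.

```r
library(randomForest)  # randomForest() without a response runs in unsupervised mode
library(cluster)       # pam() and silhouette()

set.seed(42)
x <- iris[, 1:4]
n <- 3                 # number of clusters (assumed for this sketch)

# Unsupervised forest; proximity = TRUE requests the proximity matrix
rf <- randomForest(x, proximity = TRUE)

# Assumption: treat 1 - proximity as the dissimilarity between observations
d <- as.dist(1 - rf$proximity)

# Partitioning Around Medoids on the dissimilarity
fit <- pam(d, k = n, diss = TRUE)

# Silhouette information in the original observation order
sil <- silhouette(fit$clustering, dist = d)

# Reassign observations with a negative silhouette width
# to their nearest neighboring cluster
k.adj <- sil[, "cluster"]
neg <- sil[, "sil_width"] < 0
k.adj[neg] <- sil[neg, "neighbor"]

table(k.adj)
```

Calling silhouette() on the cluster vector and the dissimilarity, rather than on the pam object itself, keeps the rows in the same order as x, so the adjusted labels line up with the input observations.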
A vector of cluster labels, or a list object of class "unsupervised" containing the following components:
distances = [Scaled proximity matrix representing dissimilarity neighbor distances]
k = [Vector of cluster labels using adjusted silhouettes]
silhouette.values = [Adjusted silhouette cluster labels and silhouette values]
Jeffrey S. Evans <jeffrey_evans<at>tnc.org>
Rand, W.M. (1971) Objective Criteria for the Evaluation of Clustering Methods. Journal of the American Statistical Association, 66:846-850.
Shi, T., Seligson, D., Belldegrun, A.S., Palotie, A., and Horvath, S. (2005) Tumor Classification by Tissue Microarray Profiling: Random Forest Clustering Applied to Renal Cell Carcinoma. Modern Pathology, 18:547-557.
randomForest for ... options
pam for details on Partitioning Around Medoids (PAM)
clara for details on Clustering Large Applications (clara)
library(rfUtilities)   # provides rf.unsupervised
library(randomForest)
data(iris)
n <- 4
clust.iris <- rf.unsupervised(iris[,1:4], n = n, proximity = TRUE,
                              silhouettes = TRUE)
clust.iris$k
# Classical multidimensional scaling of the dissimilarity matrix
mds <- stats::cmdscale(clust.iris$distances, eig = TRUE, k = n)
colnames(mds$points) <- paste("Dim", 1:n)
# One color per cluster label
mds.col <- rainbow(n)[clust.iris$k]
plot(mds$points[,1:2], col = mds.col, pch = 20)
pairs(mds$points, col = mds.col, pch = 20)
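Rand (1971), listed in the references, gives a criterion for comparing two partitions of the same observations. A minimal base R sketch of that index follows; the helper name rand.index is hypothetical and is not part of this package.

```r
# Rand index: the proportion of observation pairs on which two
# partitions agree (both place the pair together, or both apart).
# rand.index is a hypothetical helper written for illustration.
rand.index <- function(a, b) {
  a <- as.vector(a)
  b <- as.vector(b)
  stopifnot(length(a) == length(b), length(a) > 1)
  same.a <- outer(a, a, "==")   # TRUE where a pair shares a label in a
  same.b <- outer(b, b, "==")   # TRUE where a pair shares a label in b
  up <- upper.tri(same.a)       # count each unordered pair once
  mean(same.a[up] == same.b[up])
}

# Identical partitions with swapped labels agree on every pair
rand.index(c(1, 1, 2, 2), c(2, 2, 1, 1))
```

Applied to cluster labels and a known grouping of the same length, for instance rand.index(clust.iris$k, iris$Species), values near 1 indicate close agreement. The index is not corrected for chance.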