In this vignette we walk through how to use the TREC package. Clustering is increasingly important in data science: applications such as pattern discovery, exploratory data analysis, and grouping similar observations all rely on clustering algorithms. Many clustering techniques have been developed, including k-means, hierarchical clustering, and DBSCAN. However, different algorithms produce different clustering results, and it is hard to tell which result is better. Once you have results from several algorithms, there is also the problem of combining them. TREC stands for tree reduced ensemble clustering, an algorithm that combines your clustering results into one final clustering.
Here is sample code showing how to use TREC:
    library(TREC)

    # randomly generate our data
    data <- matrix(rnorm(100), nrow = 50)

    # produce clustering results cl1, cl2, cl3
    # you may use as many clustering algorithms as you want
    cl1 <- hclust(dist(data), method = 'single')
    cl2 <- kmeans(data, centers = 3)
    library(dbscan)
    cl3 <- dbscan(data, eps = .2)

    # call the combineClusterings function
    final_cluster_result <- combineClusterings(cl1, cl2, cl3)
    plot(final_cluster_result)
Using TREC involves four steps:
1. Load or generate your data.
2. Use clustering algorithms to produce clustering results.
3. Use the function asBranchComponent to transform each clustering result into a matrix.
4. Combine the resulting matrices column-wise with cbind and apply the function TRECgetComponentsfromClustering.
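To make steps 3 and 4 more concrete, here is a minimal base-R sketch of what turning clustering results into label matrices and binding them column-wise can look like. It deliberately does not call asBranchComponent or TRECgetComponentsfromClustering, since their exact signatures are not shown in this vignette; cutree and the cluster element of a kmeans result are stand-ins for the conversion step, and the names m1, m2, and labels are hypothetical.

```r
# Illustration only: base R (stats) stand-ins, not the TREC functions.
set.seed(1)
data <- matrix(rnorm(100), nrow = 50)   # 50 points, 2 columns

# Step 2: run two clustering algorithms
cl1 <- hclust(dist(data), method = "single")
cl2 <- kmeans(data, centers = 3)

# Step 3 (stand-in): one integer cluster id per data point, stored as columns
m1 <- cutree(cl1, k = 2:4)              # one column per cut of the tree
m2 <- matrix(cl2$cluster, ncol = 1,
             dimnames = list(NULL, "kmeans"))

# Step 4 (stand-in): bind the label matrices column-wise
labels <- cbind(m1, m2)
dim(labels)     # rows = data points, columns = clusterings
head(labels)
```

Each row of labels describes one data point and each column holds the cluster ids from one clustering, which is the general shape an ensemble method needs as input.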
All four steps above are required to use TREC. The final cluster result is a matrix that represents a hierarchical clustering. Each row represents a data point, and each column represents one layer of the clustering. Each entry is the id of the cluster that the data point is assigned to in that layer.
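The matrix layout described above can be explored on a stand-in before running TREC itself. The sketch below builds a layered label matrix with base R's cutree (an assumption made for illustration, not the output of combineClusterings) and shows how to read rows, columns, and cluster ids. The names hc, layers, and nested are hypothetical.

```r
# Illustration only: a layered label matrix built with base R's cutree,
# used as a stand-in for the kind of matrix described above.
set.seed(1)
data <- matrix(rnorm(100), nrow = 50)
hc <- hclust(dist(data), method = "complete")
layers <- cutree(hc, k = c(2, 4, 8))   # rows: data points, columns: layers

layers[1:5, ]                 # cluster ids of the first five points
table(layers[, "4"])          # cluster sizes in the 4-cluster layer
which(layers[, "2"] == 1)     # points in cluster 1 of the coarsest layer

# In a hierarchy, every fine cluster lies inside exactly one coarser cluster.
nested <- all(tapply(layers[, "2"], layers[, "8"],
                     function(ids) length(unique(ids)) == 1))
nested
```

The nesting check is what distinguishes a hierarchical label matrix from an arbitrary set of label columns: moving from a coarse layer to a finer one only splits clusters and never mixes them.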
    library(TREC)

    # randomly generate our data (50 points, 3 columns)
    data <- matrix(rnorm(150), nrow = 50)

    # produce clustering results cl1, cl2, cl3
    # you may use as many clustering algorithms as you want
    cl1 <- hclust(dist(data), method = 'complete')
    cl2 <- kmeans(data, centers = 5)
    library(mclust)
    cl3 <- Mclust(data)

    # call the combineClusterings function
    final_cluster_result <- combineClusterings(cl1, cl2, cl3)
    plot(final_cluster_result)