knitr::opts_chunk$set(echo = TRUE)
Thanks for checking out my opensource package weightedClustSuite, developed by myself.
This package was created to easily implement weighted point density clustering and validation. In my package, one can easily give their datapoints weights and then see the results shown and validated with multiple unsupervised clustering algorithms. This is all done through a single clustering object making the data, ex. different clustering assignments, very easy to access.
The package operates through my implementation of manipulating the gaussian density estimates and distance matrices with the weights, prior to the bulk of each clustering algorithm. This performs more EFFICIENTLY and can allow for much LARGER weighted datasets than the brute force alternative of manually adding each points weight to the dataset.
#Run this command to install the package on your machine #devtools::install_github("dhanujg/weightedClustSuite")
We will use the package on a dataset currently under research. This dataset represents a 2 species trait values and the species abundance for each species. The data is presented in 3 columns, 2 of which are the species trait values and the third are the WEIGHTS (species abundance)
# Read in the Data Colnames = c('X','Y','Weight') Species_Data <- read.csv(file.choose(), header = FALSE) #Create Input Matrices for the package Trait_Species <- as.vector(Species_Data[,1:2]) Weight_Species <- as.vector(Species_Data[,3]) Dist_Species <- dist(Species_Data[,1:2]) Trait_Species[1:5,] Weight_Species[1:5]
library(weightedClustSuite) #feed in the input data to create the base clustering object with ClustObj function SpeciesClust <- ClustObj(Trait_Species, Weight_Species, Dist_Species, gaussian=TRUE, verbose = TRUE)
We can now update the clustering object (SpeciesClust) with data from a clustering algorithm, here we will run K mediods and Spectral to get a quick visualization of the data.
#SpeciesClust <- findDensityPeakClusters(SpeciesClust, rho=4.01, delta=0.21) SpeciesClust <- findKMediodsClusters(SpeciesClust, neighbors = 5 ) SpeciesClust <- findSpectralClusters(SpeciesClust, neighbors = 5 ) SpeciesClust <- findDBSCANClusters(SpeciesClust, eps = 0.05)
We can also create plots of each of the clustering assignments WITH the datapoint sizes changing corresponding to the weights
KMediods_2D <- plot2DCluster(SpeciesClust, SpeciesClust$clustersKmediods, type = 'Kmediods') Spectral_2D <- plot2DCluster(SpeciesClust, SpeciesClust$clustersSpectral, type = 'Spectral')
clustering assignments are also stored in the class object and can be viewed in examples like
SpeciesClust$clustersKmediods[1:10]
Create a Validation chart to test various rho/delta combinations and obtain ideal classification score
SpeciesChart <- findDensityPeakCluster_validationChart(SpeciesClust, DBCV = FALSE, status = FALSE) View(SpeciesChart[1:5,])
We can also visualize the density peaks by rho/delta values in this plot
plotDensityPeak(SpeciesClust)
from the validation chart, we have chosen this specific rho/delta combination to give us 5 clusters with no unclassified points, we can then run the Density Peak clustering
SpeciesClust <- findDensityPeakClusters(SpeciesClust, rho=4.01, delta=0.21, verbose = FALSE)
Now we can plot our clustering assignments through a 2D Graph, a MDS graph (for higher dimensions), and a TSNE Graph. The cross marks indicate the cluster centers and the size of the points correspond to the weights.
DensityPeak_2D = plotDensityPeak2D(SpeciesClust) DensityPeak_MDS = plotMDS(SpeciesClust) DensityPeak_TSNE = plotTSNE(SpeciesClust)
We can compare the results of our Density Peak Clustering, with DBSCAN clustering utilizing the same weighted distance cutoff for epsilon.
DensityPeak_2D = plotDensityPeak2D(SpeciesClust) DBSCAN_2D = plot2DCluster(SpeciesClust, SpeciesClust$clustersDBSCAN, type = 'DBSCAN')
There is a lot more to be added to the package and still currently being edited for implementation. Please email me at dhanujg@umich.edu if you have any questions.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.