HierarchicalDBSCAN: Hierarchical DBSCAN

HierarchicalDBSCANR Documentation

Hierarchical DBSCAN

Description

Hierarchical DBSCAN clustering [Campello et al., 2015].

Usage

HierarchicalDBSCAN(DataOrDistances,minPts=4,

PlotTree=FALSE,PlotIt=FALSE,...)

Arguments

DataOrDistances

Either a [1:n,1:d] matrix of dataset to be clustered. It consists of n cases of d-dimensional data points. Every case has d attributes, variables or features.

or a [1:n,1:n] symmetric distance matrix.

minPts

Classic smoothing factor in density estimates [Campello et al., 2015, p.9]

PlotIt

Default: FALSE, If TRUE plots the first three dimensions of the dataset with colored three-dimensional data points defined by the clustering stored in Cls

PlotTree

Default: FALSE, If TRUE plots the dendrogram. If minPts is missing, PlotTree is set to TRUE.

...

Further arguments to be set for the clustering algorithm, if not set, default arguments are used.

Details

"Computes the hierarchical cluster tree representing density estimates along with the stability-based flat cluster extraction proposed by Campello et al. (2013). HDBSCAN essentially computes the hierarchy of all DBSCAN* clusterings, and then uses a stability-based extraction method to find optimal cuts in the hierarchy, thus producing a flat solution."[Hahsler et al., 2019]

It is claimed by the inventors that the minPts parameter is noncritical [Campello et al., 2015, p.35]. minPts is reported to be set to 4 on all experiments [Campello et al., 2015, p.35].

Value

List of

Cls

[1:n] numerical vector defining the clustering; this classification is the main output of the algorithm. Points which cannot be assigned to a cluster will be reported as members of the noise cluster with 0.

Dendrogram

Dendrogram of hierarchical clustering algorithm

Tree

Ultrametric tree of hierarchical clustering algorithm

Object

Object defined by clustering algorithm as the other output of this algorithm

Author(s)

Michael Thrun

References

[Campello et al., 2015] Campello, R. J., Moulavi, D., Zimek, A., & Sander, J.: Hierarchical density estimates for data clustering, visualization, and outlier detection, ACM Transactions on Knowledge Discovery from Data (TKDD), Vol. 10(1), pp. 1-51. 2015.

[Hahsler et al., 2019] Hahsler M, Piekenbrock M, Doran D: dbscan: Fast Density-Based Clustering with R. Journal of Statistical Software, 91(1), pp. 1-30. doi: 10.18637/jss.v091.i01, 2019

Examples

data('Hepta')

out=HierarchicalDBSCAN(Hepta$Data,PlotIt=FALSE)


data('Leukemia')
set.seed(1234)
CA=HierarchicalDBSCAN(Leukemia$DistanceMatrix)
#ClusterCount(CA$Cls)
#ClusterDendrogram(CA$Dendrogram,5,main='H-DBscan')



FCPS documentation built on Oct. 19, 2023, 5:06 p.m.