WH_hclust: Hierarchical clustering of histogram data
In HistDAWass: Histogram-Valued Data Analysis

WH_hclust

R Documentation

Description

This function performs hierarchical clustering of a set of histogram-valued data, based on the L2 Wasserstein distance. It extends the hclust function of the stats package.
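For reference, the L2 Wasserstein distance between two one-dimensional distributions is usually written in terms of their quantile functions. The symbols F and G (distribution functions) and their inverses are notation introduced here for illustration; they do not appear in the package documentation, and how distances are combined across several variables is not stated on this page.

```latex
d_W^2(F, G) = \int_0^1 \left( F^{-1}(t) - G^{-1}(t) \right)^2 \, dt
```

Under this definition, two distributions that differ only by a location shift of size c are at distance |c|.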

Usage

WH_hclust(
  x,
  simplify = FALSE,
  qua = 10,
  standardize = FALSE,
  distance = "WDIST",
  method = "complete"
)

Arguments

`x`	A MatH object (a matrix of distributionH).
`simplify`	A logical value (default is FALSE). If TRUE, histograms are recomputed in order to speed up the algorithm.
`qua`	An integer (default is 10). If `simplify` = TRUE, it is the number of quantiles used to recode the histograms.
`standardize`	A logical value (default is FALSE). If TRUE, histogram-valued data are standardized, variable by variable, using the Wasserstein-based standard deviation. Use this when each variable should have a standard deviation equal to one.
`distance`	A string, default "WDIST", selecting the L2 Wasserstein distance (other distances will be implemented).
`method`	A string, default "complete", giving the agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "`ward.D`", "`ward.D2`", "`single`", "`complete`", "`average`" (= UPGMA), "`mcquitty`" (= WPGMA), "`median`" (= WPGMC) or "`centroid`" (= UPGMC).
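To give an intuition for what `simplify` and `qua` control, the following base R sketch summarises a numeric sample by a fixed number of equiprobable bins whose limits are empirical quantiles. The helper `recode_by_quantiles` is hypothetical and is not part of HistDAWass; the package's actual recoding may differ in detail (for example, in how many breakpoints `qua` produces).

```r
# Hypothetical helper (not part of HistDAWass): summarise a numeric sample by
# `qua` equiprobable bins whose limits are empirical quantiles.
recode_by_quantiles <- function(values, qua = 10) {
  probs <- seq(0, 1, length.out = qua + 1)
  breaks <- quantile(values, probs = probs, names = FALSE)
  data.frame(
    lower = breaks[-length(breaks)],  # left limit of each bin
    upper = breaks[-1],               # right limit of each bin
    prob  = diff(probs)               # probability mass of each bin
  )
}

set.seed(123)
h <- recode_by_quantiles(rexp(1000), qua = 10)
nrow(h)       # one row per bin
sum(h$prob)   # bin masses sum to one, up to floating-point rounding
```

A coarser recoding (smaller `qua`) means fewer bins per histogram, which is presumably where the speed-up comes from, at the cost of a rougher description of each distribution.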

Value

An object of class hclust which describes the tree produced by the clustering process.
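Because the result is a standard hclust object, the usual base R tools should apply to it. The sketch below builds a stand-in tree from toy numeric samples rather than from a MatH object, so it runs without HistDAWass. The distance helper `w2` and the toy data are assumptions made for illustration and are not the package's implementation.

```r
# Toy data: six samples of equal size, three centred near 0 and three near 5.
set.seed(42)
samples <- c(
  lapply(1:3, function(i) rnorm(200, mean = 0)),
  lapply(1:3, function(i) rnorm(200, mean = 5))
)
names(samples) <- paste0("obs", 1:6)

# For two samples of equal size, the L2 Wasserstein distance between their
# empirical distributions is the root mean squared difference of sorted values.
w2 <- function(a, b) sqrt(mean((sort(a) - sort(b))^2))

# Fill a symmetric matrix of pairwise distances.
n <- length(samples)
d <- matrix(0, n, n, dimnames = list(names(samples), names(samples)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- w2(samples[[i]], samples[[j]])
  }
}

# Build the tree with the same agglomeration method as the WH_hclust default.
tree <- hclust(as.dist(d), method = "complete")

class(tree)    # the class of the returned object
tree$merge     # (n - 1) x 2 matrix describing each merge step
tree$height    # dissimilarity at which each merge happened
tree$labels    # labels of the clustered objects
groups <- cutree(tree, k = 2)  # named integer vector of cluster labels
```

With these well-separated toy samples, cutting the tree at two clusters separates the samples centred near 0 from those centred near 5.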

References

Irpino, A., Verde, R. (2006). A new Wasserstein based distance for the hierarchical clustering of histogram symbolic data. In: Batagelj et al. (eds.), Data Science and Classification, IFCS 2006, pp. 185-192. Berlin: Springer. ISBN: 3-540-34415-2

Examples

library(HistDAWass) # provides WH_hclust and the BLOOD dataset
results <- WH_hclust(x = BLOOD, simplify = TRUE, method = "complete")
plot(results) # plot the dendrogram
cutree(results, k = 5) # return the cluster labels for a 5-cluster partition

HistDAWass documentation built on May 29, 2024, 1:27 a.m.