WH_hclust: Hierarchical clustering of histogram data


View source: R/unsuperv_classification.R

Description

The function implements hierarchical clustering for a set of histogram-valued data, based on the L2 Wasserstein distance. It extends the hclust function of the stats package.
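To make the distance concrete: for one-dimensional distributions, the squared L2 Wasserstein distance is the integral over (0, 1) of the squared difference between the two quantile functions. The following is a minimal base-R sketch that approximates it on a regular grid of probability levels. The helper name wass2_sq and its n_levels argument are hypothetical and are not part of HistDAWass; the package works on distributionH objects and may compute the distance differently.

```r
# Hypothetical helper (not part of HistDAWass): approximate the squared
# L2 Wasserstein distance between two numeric samples by comparing their
# empirical quantile functions at equally spaced probability levels.
wass2_sq <- function(a, b, n_levels = 100) {
  # midpoints of n_levels equal-probability intervals in (0, 1)
  p <- (seq_len(n_levels) - 0.5) / n_levels
  qa <- quantile(a, probs = p, names = FALSE)
  qb <- quantile(b, probs = p, names = FALSE)
  # average squared gap between the two quantile functions
  mean((qa - qb)^2)
}

set.seed(1)
a <- rnorm(1000)
b <- a + 3                 # same shape, shifted by 3

d_same  <- wass2_sq(a, a)  # identical samples
d_shift <- wass2_sq(a, b)  # a pure shift of 3 gives a squared distance of about 9
```

A pure location shift moves every quantile by the same amount, so the squared distance equals the squared shift, which is a convenient sanity check for a sketch like this.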

Usage

WH_hclust(x, simplify = FALSE, qua = 10, standardize = FALSE,
  distance = "WDIST", method = "complete")

Arguments

x

A MatH object (a matrix of distributionH).

simplify

A logical value (default is FALSE). If TRUE, histograms are recomputed in order to speed up the algorithm.

qua

An integer (default is 10). If simplify=TRUE, it is the number of quantiles used to recode the histograms.

standardize

A logical value (default is FALSE). If TRUE, histogram-valued data are standardized, variable by variable, using the Wasserstein-based standard deviation. Use it when each variable should have a standard deviation equal to one.

distance

A string (default is "WDIST"), the L2 Wasserstein distance. Other distances will be implemented.

method

A string (default is "complete"), the agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).

Value

An object of class hclust which describes the tree produced by the clustering process.
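As an illustration of what an hclust object contains, here is a short sketch using stats::hclust on an ordinary numeric dataset (USArrests, shipped with base R) as a stand-in. It assumes that the tree returned by WH_hclust exposes the same components, as the class name suggests; the Euclidean distance used here is not the Wasserstein distance.

```r
# Stand-in example: an ordinary hclust tree built from Euclidean distances.
d    <- dist(scale(USArrests))           # 50 states, 4 standardized variables
tree <- hclust(d, method = "complete")

class(tree)          # the class of the returned object
dim(tree$merge)      # one row per merge step, i.e. n - 1 rows and 2 columns
head(tree$height)    # dissimilarity at which the first merges happen
head(tree$order)     # leaf ordering used when plotting the dendrogram

labels <- cutree(tree, k = 5)   # cut the tree into 5 clusters
table(labels)                   # cluster sizes
```

The same accessors (merge, height, order) and the same cutree and plot calls should apply to the result of WH_hclust, since it is described as an object of class hclust.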

References

Irpino A., Verde R. (2006). A new Wasserstein based distance for the hierarchical clustering of histogram symbolic data. In: Batagelj et al. (eds.), Data Science and Classification, IFCS 2006, pp. 185-192. Berlin: Springer. ISBN: 3-540-34415-2

See Also

hclust in the stats package for further details.

Examples

library(HistDAWass) # provides WH_hclust and the BLOOD dataset
results <- WH_hclust(x = BLOOD, simplify = TRUE, method = "complete")
plot(results) # plots the dendrogram
cutree(results, k = 5) # returns the cluster labels for 5 clusters

HistDAWass documentation built on Dec. 7, 2017, 5:03 p.m.