clust: Hierarchical Clustering

View source: R/clust.R

clust    R Documentation

Hierarchical Clustering

Description

Hierarchical cluster analysis of objects.

Usage

clust(object, distMethod = "Euclidean", clustMethod = "UPGMA", binaryChs = NULL,
              nominalChs = NULL, ordinalChs = NULL)

Arguments

object

an object of class morphodata.

distMethod

the distance measure to be used. This must be one of "Euclidean" (default), "Manhattan", "Minkowski", "Jaccard", "simpleMatching", or "Gower". See Details.

clustMethod

the agglomeration method to be used: "average" (= "UPGMA"; default), "complete", "ward.D" (= "Ward"), "ward.D2", "single", "Mcquitty" (= "WPGMA"), "median" (= "WPGMC") or "centroid" (= "UPGMC"). See hclust for details.

binaryChs, nominalChs, ordinalChs

names of binary, categorical nominal (multistate), and categorical ordinal characters, respectively. Needed only for Gower's dissimilarity coefficient; see Details.

Details

This function performs agglomerative hierarchical clustering. Typically, populations are used as OTUs (operational taxonomic units). Characters are standardised to zero mean and unit standard deviation.
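The standardisation and clustering steps described above can be sketched with base R alone. This is only an illustration under stated assumptions: the toy matrix and its values are made up, and the sketch does not claim to reproduce how clust is implemented internally.

```r
# Toy data: 5 OTUs (rows) x 2 quantitative characters (columns); values are made up.
x <- matrix(c(10, 11, 14, 22, 23,
              8.0, 7.1, 9.4, 1.2, 2.3),
            ncol = 2,
            dimnames = list(paste0("pop", 1:5), c("stemHeight", "leafLength")))

# Standardise each character to zero mean and unit standard deviation.
z <- scale(x, center = TRUE, scale = TRUE)

# Euclidean distances between OTUs, then average-linkage (UPGMA) clustering.
d <- dist(z, method = "euclidean")
tree <- hclust(d, method = "average")

print(round(colMeans(z), 10))   # column means are 0 (up to rounding)
print(apply(z, 2, sd))          # column standard deviations are 1
```

Standardising first keeps a character measured on a large scale (here stemHeight) from dominating the Euclidean distance.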

Several measures of distance between the observations (rows) are available: (1) distance coefficients for quantitative and binary characters: "Euclidean", "Manhattan", "Minkowski"; (2) similarity coefficients for binary characters: "Jaccard" and simple matching ("simpleMatching"); (3) a coefficient for mixed data: "Gower". Note that clustering and distance methods other than the defaults are rarely used in morphometric analyses.
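To illustrate how the two binary coefficients differ, the following base R sketch computes both by hand for a pair of OTUs. The character scores are made up, and how clust turns these similarities into distances internally is not stated here, so treat this as a definition-level example only.

```r
# Two OTUs scored for five binary characters (made-up values).
a <- c(1, 1, 0, 0, 1)
b <- c(1, 0, 0, 1, 1)

both    <- sum(a == 1 & b == 1)   # shared presences
neither <- sum(a == 0 & b == 0)   # shared absences
differ  <- sum(a != b)            # mismatches

# Jaccard similarity ignores shared absences; simple matching counts them.
jaccard        <- both / (both + differ)
simpleMatching <- (both + neither) / length(a)

# Base R's dist(method = "binary") returns the Jaccard dissimilarity, 1 - jaccard.
d_binary <- as.numeric(dist(rbind(a, b), method = "binary"))

print(c(jaccard = jaccard, simpleMatching = simpleMatching, d_binary = d_binary))
```

The choice matters when joint absence of a character is not informative: Jaccard leaves such characters out, whereas simple matching treats them as agreement.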

Gower's dissimilarity coefficient can handle different types of variables. Characters are divided into four categories: (1) quantitative characters, (2) categorical ordinal characters, (3) categorical nominal (multistate) characters, and (4) binary characters. All characters are treated as quantitative unless specified otherwise. To mark characters as ordinal, nominal, or binary, list their names in the ordinalChs, nominalChs, and binaryChs arguments, respectively.
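The idea behind a mixed-data coefficient can be sketched by hand in base R. The helper gowerPair below is hypothetical and not part of MorphoTools2. It assumes an unweighted mean over characters, a 0/1 mismatch score for binary and nominal characters, and range-scaled differences for quantitative characters. Ordinal characters are treated as ranks scaled by their range, which is one common convention; the package may handle them differently.

```r
# Hypothetical helper, not part of MorphoTools2:
# Gower-style dissimilarity between rows i and j of a data frame.
gowerPair <- function(df, i, j, categorical = character(0)) {
  parts <- vapply(names(df), function(ch) {
    x <- df[[ch]]
    if (ch %in% categorical) {
      as.numeric(x[i] != x[j])            # binary/nominal: 0 if equal, 1 otherwise
    } else {
      abs(x[i] - x[j]) / diff(range(x))   # quantitative/ordinal: range-scaled difference
    }
  }, numeric(1))
  mean(parts)                             # unweighted mean over characters
}

# Three made-up OTUs with one character of each type.
chars <- data.frame(
  stemBranching = c(1, 1, 0),     # binary
  petalColour   = c(1, 2, 3),     # nominal
  taste         = c(2, 2, 3),     # ordinal, treated here as ranks
  stemHeight    = c(10, 14, 22)   # quantitative
)

categoricalChs <- c("stemBranching", "petalColour")
d12 <- gowerPair(chars, 1, 2, categorical = categoricalChs)
d13 <- gowerPair(chars, 1, 3, categorical = categoricalChs)
print(c(d12 = d12, d13 = d13))
```

Because every per-character score lies between 0 and 1, characters of different types contribute on a comparable scale, which is why each type has to be declared correctly.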

Value

An object of class 'hclust'. It encodes a stepwise dendrogram.
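As a hedged illustration of what an 'hclust' object contains, the sketch below builds one with base R from made-up data (not with clust itself) and inspects its standard components.

```r
# Five made-up points on a line, clustered with average linkage.
x <- matrix(c(0, 1, 5, 6, 20), ncol = 1,
            dimnames = list(c("a", "b", "c", "d", "e"), "value"))
tree <- hclust(dist(x), method = "average")

print(class(tree))               # "hclust"
print(tree$merge)                # observations/clusters joined at each step
print(tree$height)               # distance at which each merge happened
print(tree$labels[tree$order])   # leaf order used when plotting

# Cut the dendrogram into two groups.
groups <- cutree(tree, k = 2)
print(groups)
```

The same accessors and cutree should apply to the object returned by clust, since it is documented as having class 'hclust'.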

Examples

library(MorphoTools2)
data(centaurea)

clustering.UPGMA = clust(centaurea)

plot(clustering.UPGMA, cex = 0.6, frame.plot = TRUE, hang = -1,
        main = "", sub = "", xlab = "", ylab = "distance")


# using Gower's method
data = list(
    ID = as.factor(c("id1","id2","id3","id4","id5","id6")),
    Population = as.factor(c("Pop1", "Pop1", "Pop2", "Pop2", "Pop3", "Pop3")),
    Taxon = as.factor(c("TaxA", "TaxA", "TaxA", "TaxB", "TaxB", "TaxB")),
    data = data.frame(
        stemBranching = c(1, 1, 1, 0, 0, 0),          # binaryChs
        petalColour = c(1, 1, 2, 3, 3, 3),            # nominalChs; 1=white, 2=red, 3=blue
        leaves = c(1, 1, 1, 2, 2, 3),                 # nominalChs; 1=simple, 2=palmately compound, 3=pinnately compound
        taste = c(2, 2, 2, 3, 1, 1),                  # ordinalChs; 1=hot, 2=hotter, 3=hottest
        stemHeight = c(10, 11, 14, 22, 23, 21),       # quantitative
        leafLength = c(8, 7.1, 9.4, 1.2, 2.3, 2.1))   # quantitative
)
attr(data, "class") = "morphodata"

clustering.GOWER = clust(data, distMethod = "Gower", clustMethod = "UPGMA",
                               binaryChs = c("stemBranching"),
                               nominalChs = c("petalColour", "leaves"),
                               ordinalChs = c("taste"))

plot(clustering.GOWER, cex = 0.6, frame.plot = TRUE, hang = -1,
        main = "", sub = "", xlab = "", ylab = "distance")


MorphoTools2 documentation built on March 7, 2023, 6:18 p.m.