ACLUST: Amalgamation clustering of the parts of a compositional data...
In easyCODA: Compositional Data Analysis in Practice

ACLUST

R Documentation

Amalgamation clustering of the parts of a compositional data matrix

Description

This function clusters the parts of a compositional data matrix, using amalgamation of the parts at each step.

Usage

ACLUST(data, weight = TRUE, close = TRUE)

Arguments

`data`	Compositional data matrix, with the parts as columns
`weight`	`TRUE` (default) for weighting using part averages of closed compositions, `FALSE` for unweighted analysis, or a vector of user-defined column weights
`close`	`TRUE` (default) will close the rows of `data` prior to clustering, `FALSE` leaves `data` as it is
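
To make the `close` and `weight` defaults concrete, the following base-R sketch closes the rows of a small made-up matrix and computes the part averages of the closed compositions. The matrix `toy` and all variable names are illustrative assumptions; this is not the package's internal code.

```r
# Hypothetical 3 x 4 compositional matrix (rows = samples, columns = parts)
toy <- matrix(c(10, 20, 30, 40,
                 5,  5, 10, 30,
                 2,  8, 40, 50),
              nrow = 3, byrow = TRUE,
              dimnames = list(NULL, c("A", "B", "C", "D")))

# Closure: divide each row by its total so that every row sums to 1
toy.closed <- toy / rowSums(toy)

# Part averages of the closed compositions (column means); these sum to 1
part.weights <- colMeans(toy.closed)

# Equal weights, for comparison with an unweighted analysis
equal.weights <- rep(1 / ncol(toy), ncol(toy))
```

Parts that are abundant on average receive larger values in `part.weights`, whereas `equal.weights` treats every part alike.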

Details

The function ACLUST performs amalgamation hierarchical clustering on the parts (columns) of a given compositional data matrix, as proposed by Greenacre (2019). At each step, the two clusters whose amalgamation gives the least loss of explained logratio variance are merged.
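
As a minimal illustration of what amalgamating two parts means, the base-R sketch below sums two columns of a closed toy matrix into a single part. The data and names are made up, and the sketch shows only the amalgamation operation itself, not the variance criterion that ACLUST uses to choose which pair to merge.

```r
# Hypothetical closed compositions: 3 samples, 4 parts
x <- matrix(c(0.1, 0.2, 0.3, 0.4,
              0.1, 0.1, 0.2, 0.6,
              0.2, 0.3, 0.4, 0.1),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("A", "B", "C", "D")))

# Amalgamate parts A and B: their values are summed within each sample
x.amalg <- cbind("A+B" = x[, "A"] + x[, "B"], x[, c("C", "D")])

# The amalgamated matrix has one part fewer and its rows still sum to 1
dim(x.amalg)
rowSums(x.amalg)
```

Because amalgamation only adds parts together, a closed matrix stays closed after every merge.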

Value

An object which describes the tree produced by the clustering process on the n objects (here the parts, i.e. the columns of `data`). The object is a list with components:

`merge`	an n-1 by 2 matrix. Row i of `merge` describes the merging of clusters at step i of the clustering. If an element j in the row is negative, then observation -j was merged at this stage. If j is positive then the merge was with the cluster formed at the (earlier) stage j of the algorithm. Thus negative entries in `merge` indicate agglomerations of singletons, and positive entries indicate agglomerations of non-singletons.
`height`	a set of n-1 real values (non-decreasing for ultrametric trees). The clustering height: that is, the value of the criterion associated with the clustering method for the particular agglomeration.
`order`	a vector giving the permutation of the original observations suitable for plotting, in the sense that a cluster plot using this ordering and the matrix `merge` will not have crossings of the branches.
`labels`	a vector of column labels, the column names of `data`
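
The component descriptions above mirror those of the object returned by `stats::hclust`. Assuming the result of ACLUST follows the same conventions, the sketch below uses `hclust` on four made-up points purely to show how `merge`, `height`, `order` and `labels` are read; it does not perform amalgamation clustering.

```r
# Four made-up one-dimensional points, standing in for four parts
pts <- c(a = 0, b = 1, c = 5, d = 5.4)
hc <- hclust(dist(pts), method = "complete")

hc$merge   # 3 x 2 matrix: negative = original item, positive = earlier step
hc$height  # one height per merge step
hc$order   # left-to-right ordering of the items in the dendrogram
hc$labels  # item labels

# Items joined at the first step (negative entries index the labels)
first <- hc$merge[1, ]
hc$labels[-first[first < 0]]
```

The same indexing idiom can be applied to each row of `merge` to trace which parts enter the tree at which step.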

Author(s)

Michael Greenacre

References

Greenacre, M. (2018), Compositional Data Analysis in Practice, Chapman & Hall / CRC.
Greenacre, M. (2019), Amalgamations are valid in compositional data analysis, can be used in agglomerative clustering, and their logratios have an inverse transformation. Applied Computing and Geosciences, open access.

Examples

library(easyCODA)
data(cups)

# amalgamation clustering (weighted parts)
cups.aclust <- ACLUST(cups)
plot(cups.aclust)

# reproducing Figure 2(b) of Greenacre (2019) (unweighted parts)
# dataset Aar is in the compositions package
# aar is a subset of Aar (columns 3 to 12)
# the code is wrapped in '\dontrun' because it requires the external package 'compositions'
## Not run: 
  library(compositions)
  data(Aar)
  aar <- Aar[, 3:12]
  aar.aclust <- ACLUST(aar, weight=FALSE)
  # the maximum height is the total variance
  # convert heights to percentages of variance NOT explained
  aar.aclust$height <- 100 * aar.aclust$height / max(aar.aclust$height)
  plot(aar.aclust, main="Parts of Unexplained Variance", ylab="Variance (percent)")

## End(Not run)

easyCODA documentation built on Aug. 26, 2024, 3 p.m.