clustCoDa: Cluster analysis for compositional data

clustCoDa R Documentation

Cluster analysis for compositional data

Description

Clustering in orthonormal coordinates or by using the Aitchison distance

Usage

clustCoDa(
  x,
  k = NULL,
  method = "Mclust",
  scale = "robust",
  transformation = "pivotCoord",
  distMethod = NULL,
  iter.max = 100,
  vals = TRUE,
  alt = NULL,
  bic = NULL,
  verbose = TRUE
)

## S3 method for class 'clustCoDa'
plot(
  x,
  y,
  ...,
  normalized = FALSE,
  which.plot = "clusterMeans",
  measure = "silwidths"
)

Arguments

x

compositional data represented as a data.frame

k

number of clusters

method

clustering method. One of Mclust, cmeans, kmeansHartigan, cmeansUfcl, pam, clara, fanny, ward.D2, single, hclustComplete, average, mcquitty, median, centroid

scale

if orthonormal coordinates should be normalized.

transformation

coordinate transformation; the default gives the isometric logratio coordinates. Can only be used when distMethod is not Aitchison.

distMethod

Distance measure to be used. If “Aitchison”, then transformation should be “identity”.

iter.max

used only if a kmeans method is chosen: the maximum number of iterations allowed

vals

if cluster validity measures should be calculated

alt

a known partitioning can be provided (for special cluster validity measures)

bic

if TRUE, the BIC criterion is evaluated for each single cluster as a validity measure

verbose

if TRUE additional print output is provided

y

the y coordinates of points in the plot, optional if x is an appropriate structure.

...

additional parameters passed through to the plot method

normalized

if TRUE, the results are normalized before plotting. Normalization is a z-transformation applied to each variable.

which.plot

type of plot. The default, “clusterMeans”, plots the cluster centers; the Examples also use “partMeans”.

measure

cluster validity measure to be considered for which.plot equals “partMeans”
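
The z-transformation mentioned for normalized can be sketched in base R as follows. This is an illustration only: the matrix centers and its values are made up, and it is an assumption that the plot method normalizes in the same way as scale() with its defaults.

```r
# Hypothetical cluster centers: rows are clusters, columns are variables.
# These numbers are invented for illustration.
centers <- rbind(
  c(1.0, 10, 0.2),
  c(2.0, 30, 0.4),
  c(3.0, 20, 0.9)
)
colnames(centers) <- c("v1", "v2", "v3")

# z-transformation per variable: subtract the column mean,
# then divide by the column standard deviation.
col_means <- colMeans(centers)
col_sds <- apply(centers, 2, sd)
z_manual <- sweep(sweep(centers, 2, col_means, "-"), 2, col_sds, "/")

# Base R's scale() performs the same operation with its defaults
# (center = TRUE, scale = TRUE).
z_scale <- scale(centers)

# After the transformation each variable has mean 0 and standard deviation 1,
# so variables on very different scales become comparable in one plot.
print(round(z_manual, 3))
```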

Details

The compositional data set is either represented internally in orthonormal coordinates before a clustering algorithm is applied or, depending on the choice of parameters, clustered using the Aitchison distance.
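
The link between the two routes can be sketched in base R: the Aitchison distance between two compositions equals the Euclidean distance between their logratio coordinates (shown here with centred logratio coordinates, which preserve the same distances as orthonormal coordinates). The helpers clr() and aitchison_dist() and the toy data are hypothetical names and values for this illustration; they are not functions of the package, and the sketch does not reproduce the internals of clustCoDa.

```r
# Sketch only: base R, no robCompositions functions are called.

# Centred logratio (clr) coordinates for each row of a matrix of compositions.
clr <- function(x) {
  lx <- log(as.matrix(x))
  lx - rowMeans(lx)  # subtract each row's mean log from that row
}

# Aitchison distance between two compositions a and b, written out from
# its definition via all pairwise logratios.
aitchison_dist <- function(a, b) {
  D <- length(a)
  la <- log(a)
  lb <- log(b)
  diffs <- outer(la, la, "-") - outer(lb, lb, "-")
  sqrt(sum(diffs^2) / (2 * D))
}

# Toy compositions (rows sum to 1); invented for illustration.
comp <- rbind(
  c(0.10, 0.30, 0.60),
  c(0.12, 0.28, 0.60),
  c(0.70, 0.20, 0.10),
  c(0.65, 0.25, 0.10)
)

# Route 1: Euclidean distances in logratio coordinates.
d_clr <- dist(clr(comp))

# Route 2: Aitchison distance computed directly on the raw compositions.
d_direct <- aitchison_dist(comp[1, ], comp[3, ])

# Both routes give the same distance between rows 1 and 3.
same <- isTRUE(all.equal(as.matrix(d_clr)[1, 3], d_direct))

# Any distance-based method can then be applied, e.g. single linkage.
groups <- cutree(hclust(d_clr, method = "single"), k = 2)
```

Because both routes yield the same distances, a distance-based method such as hierarchical clustering should give the same partition either way; the choice mainly matters for methods that need coordinates rather than a distance matrix.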

Value

all relevant information such as cluster centers, cluster memberships, and cluster statistics.

Author(s)

Matthias Templ (accessing the basic features of hclust, Mclust, kmeans, etc. that are all written by others)

References

M. Templ, P. Filzmoser, C. Reimann. Cluster analysis applied to regional geochemical data: Problems and possibilities. Applied Geochemistry, 23 (8), 2198–2213, 2008

Examples

library(robCompositions)
data(expenditures)
x <- expenditures
rr <- clustCoDa(x, k=6, scale = "robust", transformation = "pivotCoord")
rr2 <- clustCoDa(x, k=6, distMethod = "Aitchison", scale = "none", 
                 transformation = "identity")
rr3 <- clustCoDa(x, k=6, distMethod = "Aitchison", method = "single",
                 transformation = "identity", scale = "none")
                 
## Not run: 
require(reshape2)
plot(rr)
plot(rr, normalized = TRUE)
plot(rr, normalized = TRUE, which.plot = "partMeans")

## End(Not run)

robCompositions documentation built on Aug. 25, 2023, 5:13 p.m.