Dendrogram representation of acomp or rcomp objects

Description

Function for plotting CoDa-dendrograms of acomp or rcomp objects.

Usage

CoDaDendrogram(X, V = NULL, expr=NULL, mergetree = NULL, signary = NULL, 
    range = c(-4,4), ..., xlim = NULL, ylim = NULL, yaxt = NULL, box.pos = 0,
    box.space = 0.25, col.tree = "black", lty.tree = 1, lwd.tree = 1,
    col.leaf = "black", lty.leaf = 1, lwd.leaf = 1, add = FALSE,border=NULL,
    type = "boxplot")
          

Arguments

X

data set to plot (an rcomp or acomp object)

V

basis to use, described as an ilr matrix

expr

a formula describing the partition basis, as with balanceBase

mergetree

basis to use, described as a merging tree (as in hclust)

signary

basis to use, described as a sign matrix (as in the example below)

range

minimum and maximum value for all coordinates (horizontal axes)

...

further parameters passed on to the functions called internally, be it a plotting function or one related to the "type" parameter below; this is likely to produce many warnings

xlim

minimum and maximum values for the horizontal direction of the plot (related to number of parts)

ylim

minimum and maximum values for the vertical direction of the plot (related to variance of coordinates)

yaxt

axis type for the vertical direction of the plot (see par)

box.pos

if type="boxplot", this is the relative position of the box in the vertical direction: 0 means centered on the axis, -1 aligned below the axis and +1 aligned above the axis

box.space

if type="boxplot", size of the box in the vertical direction as a portion of the minimal variance of the coordinates

col.tree

color for the horizontal axes

lty.tree

line type for the horizontal axes

lwd.tree

line width for the horizontal axes

col.leaf

color for the vertical connections between an axis and a part (leaf)

lty.leaf

line type for the leaves

lwd.leaf

line width for the leaves

add

logical: if FALSE (the default), a new plot is started; if TRUE, the material is added to an existing CoDa-dendrogram

border

the color for drawing the rectangles

type

what to represent: one of "boxplot", "density", "histogram", "lines", "nothing" or "points", or an unambiguous abbreviation

Details

The object and an isometric basis are represented in a CoDa-dendrogram, as defined by Egozcue and Pawlowsky-Glahn (2005). This is a representation of the following elements:

  • a hierarchical partition, which can be specified through an ilrBase matrix (see ilrBase), a merging tree structure (see hclust) or a signary matrix (see gsi.merge2signary)

  • the sample mean of each coordinate of the ilr basis associated with that partition

  • the sample variance of each coordinate of the ilr basis associated with that partition

  • optionally, any graphical representation of each coordinate, as long as it is suitable for a univariate data set (box-plot, histogram, dispersion and kernel density are programmed or planned, but any other may be added with little work).

Each coordinate is represented on a horizontal axis, whose limits correspond to the values given in the parameter range. The vertical bar going up from each of these coordinate axes represents the variance of that coordinate, and the contact point represents the coordinate mean. Note that, to be able to draw an initial dendrogram, the first call to this function must be given a full data set, because means and variances must be computed. This information is afterwards stored in a global list, so that new material can be added to all coordinates.

The default option is type="boxplot", which produces a box-plot for each coordinate, customizable using box.pos and box.space as well as the usual par parameters (col, border, lty, lwd, etc.). To obtain only the first three elements, call the function with type="lines". As extensions, one may represent a single datum or a few data (e.g., a mean or a random subsample of the data set) by calling the function with add=TRUE and type="points". Other options (calling the functions histogram or density, and accepting their parameters) will also be available soon.

Note that the original CoDa-dendrogram as defined by Egozcue and Pawlowsky-Glahn (2005) works with acomp objects and ilr bases. Functionality is extended to rcomp objects using calls to idt.
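
As a rough illustration of the quantities the dendrogram displays, the following base-R sketch builds an orthonormal balance basis from a sign matrix and computes the sample mean and variance of each coordinate. It does not use the compositions package: sign_to_basis is a hypothetical helper, the toy data are made up, and treating +1 as the numerator group is an assumption about the sign convention.

```r
# Hypothetical helper (not part of the compositions package): build an
# orthonormal balance basis from a sign matrix with parts in rows and
# balances in columns. Assumption: +1 marks the numerator group.
sign_to_basis <- function(signary) {
  apply(signary, 2, function(s) {
    r <- sum(s > 0)                      # parts in the numerator group
    q <- sum(s < 0)                      # parts in the denominator group
    v <- numeric(length(s))
    v[s > 0] <-  sqrt(q / (r * (r + q)))
    v[s < 0] <- -sqrt(r / (q * (r + q)))
    v
  })
}

# Toy 3-part data set (made-up numbers, one observation per row)
X <- matrix(c(0.2, 0.3, 0.5,
              0.1, 0.6, 0.3,
              0.4, 0.4, 0.2,
              0.3, 0.2, 0.5), ncol = 3, byrow = TRUE)

# Two balances: (part 1, part 2 | part 3) and (part 1 | part 2)
signary <- cbind(c(1, 1, -1), c(1, -1, 0))

V <- sign_to_basis(signary)
# Each column of V sums to zero, so log(X) and clr(X) give the same coordinates
coords <- log(X) %*% V

coord_mean <- colMeans(coords)           # where each vertical bar meets its axis
coord_var  <- apply(coords, 2, var)      # length of each vertical bar
```

In this reading, coord_mean gives the contact points on the horizontal axes and coord_var the lengths of the vertical bars; how the package scales them in the plot is not shown here.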

Author(s)

Raimon Tolosana-Delgado, K. Gerald v.d. Boogaart http://www.stat.boogaart.de

References

Egozcue, J.J., V. Pawlowsky-Glahn, G. Mateu-Figueras and C. Barceló-Vidal (2003). Isometric logratio transformations for compositional data analysis. Mathematical Geology, 35(3), 279-300

Egozcue, J.J. and V. Pawlowsky-Glahn (2005). CoDa-Dendrogram: a new exploratory tool. In: Mateu-Figueras, G. and Barceló-Vidal, C. (Eds.) Proceedings of the 2nd International Workshop on Compositional Data Analysis, Universitat de Girona, ISBN 84-8458-222-1, http://ima.udg.es/Activitats/CoDaWork05

See Also

ilrBase, balanceBase, rcomp, acomp

Examples

# first example: take the data set from the example, select only
# compositional parts
library(compositions)
data(Hydrochem)
x = acomp(Hydrochem[,-c(1:5)])
gr = Hydrochem[,4] # river groups (useful afterwards)
# use an ilr basis coming from a clustering of parts
dd = dist(t(clr(x)))
hc1 = hclust(dd,method="ward.D") # called "ward" in older R versions
plot(hc1)
mergetree=hc1$merge
CoDaDendrogram(X=acomp(x),mergetree=mergetree,col="red",range=c(-8,8),box.space=1)
# add the mean of each river
color=c("green3","red","blue","darkviolet")
aux = clrInv(t(sapply(split(clr(x),gr),mean)))
CoDaDendrogram(X=aux,add=TRUE,col=color,type="points",pch=4)

# second example: box-plots by rivers (filled)
CoDaDendrogram(X=acomp(x),mergetree=mergetree,col="black",range=c(-8,8),type="l")
xsplit = split(x,gr)
for(i in 1:4){
 CoDaDendrogram(X=acomp(xsplit[[i]]),col=color[i],type="box",box.pos=i-2.5,box.space=0.5,add=TRUE)
}

# third example: fewer parts, partition defined by a signary, and empty box-plots
x = acomp(Hydrochem[,c("Na","K","Mg","Ca","Sr","Ba","NH4")])
signary = t(matrix(  c(1,   1,   1,  1,   1,   1,  -1,
                       1,   1,  -1, -1,  -1,  -1,   0,
                       1,  -1,   0,  0,   0,   0,   0,
                       0,   0,  -1,  1,  -1,  -1,   0,
                       0,   0,   1,  0,  -1,   1,   0,
                       0,   0,   1,  0,   0,  -1,   0),ncol=7,nrow=6,byrow=TRUE))

CoDaDendrogram(X=acomp(x),signary=signary,col="black",range=c(-8,8),type="l")
xsplit = split(x,gr)
for(i in 1:4){
  CoDaDendrogram(X=acomp(xsplit[[i]]),border=color[i],
       type="box",box.pos=i-2.5,box.space=1.5,add=TRUE)
  CoDaDendrogram(X=acomp(xsplit[[i]]),col=color[i],
       type="line",add=TRUE)
}
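
The first example passes hc1$merge as mergetree. As a sketch of how such a merging tree encodes a hierarchical partition, the base-R function below converts an hclust-style merge matrix to a sign matrix with parts in rows and balances in columns. merge_to_signs is a hypothetical helper written for illustration, not the package's gsi.merge2signary; which of the two merged groups gets +1 is an arbitrary choice here and may differ from the package's convention.

```r
# Hypothetical helper: turn an hclust-style merge matrix into a sign matrix.
# In a merge matrix, a negative entry -j is the single part j, and a positive
# entry k is the cluster formed at merge step k.
merge_to_signs <- function(merge) {
  n_steps <- nrow(merge)
  n_parts <- n_steps + 1
  members <- vector("list", n_steps)     # parts contained in each merged cluster
  signs <- matrix(0, nrow = n_parts, ncol = n_steps)
  leaves <- function(k) if (k < 0) -k else members[[k]]
  for (i in seq_len(n_steps)) {
    left  <- leaves(merge[i, 1])
    right <- leaves(merge[i, 2])
    signs[left, i]  <- 1                 # first merged group
    signs[right, i] <- -1                # second merged group
    members[[i]] <- c(left, right)
  }
  signs
}

# Hand-written merge tree for 4 parts: (1,2), then (3,4), then both clusters
merge <- rbind(c(-1, -2),
               c(-3, -4),
               c( 1,  2))
signs <- merge_to_signs(merge)
signs
```

Each column of the result describes one split of the partition: +1 and -1 mark the two groups being contrasted, and 0 marks parts not involved in that split.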
