A class providing the means to analyse compositions in the philosophical framework of the Aitchison Simplex.
composition or dataset of compositions
vector containing either the indices or the names of the columns to be used
the total amount to be used, typically 1 or 100
should the user be warned in case of NA, NaN or 0 coding different types of missing values?
a number, vector or matrix of positive numbers giving the detection limit of all values, all columns or each value, respectively
the code for 'Below Detection Limit' in X
the code for 'Structural Zero' in X
the code for 'Missing At Random' in X
the code for 'Missing Not At Random' in X
Many multivariate datasets essentially describe amounts of D different
parts in a whole. This has important implications that justify
regarding them as a scale of their own, called a
composition. This scale was analysed in depth by Aitchison
(1986), and the functions around the class
"acomp" follow his approach.
Compositions have some important properties: Amounts are always positive. The amount of every part is limited to the whole. The absolute amount of the whole is noninformative, since it is typically an artifact of the measurement procedure; thus only relative changes are relevant. If the relative amount of one part increases, the amounts of other parts must decrease, which introduces spurious anticorrelation (Chayes 1960) when the data are analysed directly. Often parts (e.g. H2O, Si) are missing from the dataset, leaving the total amount unreported and calling for analysis procedures that avoid spurious effects when applied to such subcompositions. Furthermore, the result of an analysis should be independent of the units (ppm, g/l, vol.%, mass.%, molar fraction) of the dataset.
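Two of the properties above can be checked numerically. The following is a minimal Python sketch (not part of the package, which is written in R); the sample values and the helper name `log_ratio` are illustrative assumptions. It shows that a log-ratio between two parts does not change when the total is rescaled or when a part is dropped and the remaining subcomposition is re-closed.

```python
import math

def log_ratio(composition, i, j):
    """Log of the ratio between parts i and j of a composition (hypothetical helper)."""
    return math.log(composition[i] / composition[j])

# The same hypothetical sample reported on two scales: fractions and percent.
fractions = [0.2, 0.3, 0.5]
percent = [100.0 * x for x in fractions]

lr_fractions = log_ratio(fractions, 0, 1)
lr_percent = log_ratio(percent, 0, 1)

# Drop the third part and re-close the remaining two parts to sum to 1.
sub_total = fractions[0] + fractions[1]
subcomposition = [fractions[0] / sub_total, fractions[1] / sub_total]
lr_sub = log_ratio(subcomposition, 0, 1)

print(lr_fractions, lr_percent, lr_sub)  # all three values agree
```

The individual amounts differ in all three representations, while the log-ratio of the first two parts stays the same, which is the motivation for working with ratios only.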
From these properties Aitchison showed that the analysis should be based on ratios or log-ratios only. He introduced several transformations (e.g.
clr, alr) and a distance (
dist) which are compatible
with these properties. Later it was found that the set of compositions equipped with
perturbation as addition, power-transform as scalar multiplication
and dist as distance forms a D-1 dimensional
Euclidean vector space (Billheimer, Fagan and Guttorp, 2001), which
can be mapped isometrically to a usual real vector space by ilr
(Pawlowsky-Glahn and Egozcue, 2001).
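The vector space structure described above can be illustrated with a short Python sketch (an assumption-laden illustration, not the package's R implementation). The function names `closure`, `perturb`, `power`, `clr` and `aitchison_dist` are hypothetical helpers defined here. The sketch uses the centered log-ratio transform, whose coordinates sum to zero and therefore lie in a (D-1)-dimensional subspace, and checks that perturbation acts as addition, powering acts as scalar multiplication, and the distance is unchanged when both compositions are perturbed by the same third composition.

```python
import math

def closure(x, total=1.0):
    """Rescale positive amounts so that they sum to `total`."""
    s = sum(x)
    return [total * v / s for v in x]

def perturb(x, y):
    """Perturbation: component-wise product followed by closure."""
    return closure([a * b for a, b in zip(x, y)])

def power(x, alpha):
    """Power transform: component-wise power followed by closure."""
    return closure([a ** alpha for a in x])

def clr(x):
    """Centered log-ratio transform: logs minus their mean."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def aitchison_dist(x, y):
    """Euclidean distance between clr coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))

# Three hypothetical 3-part compositions.
x = closure([1.0, 2.0, 7.0])
y = closure([3.0, 3.0, 4.0])
z = closure([5.0, 1.0, 4.0])

# Perturbation corresponds to addition of clr coordinates.
clr_of_sum = clr(perturb(x, y))
sum_of_clr = [a + b for a, b in zip(clr(x), clr(y))]

# Powering corresponds to scalar multiplication of clr coordinates.
clr_of_power = clr(power(x, 2.5))
scaled_clr = [2.5 * a for a in clr(x)]

# The distance is invariant under a common perturbation.
d_before = aitchison_dist(x, y)
d_after = aitchison_dist(perturb(x, z), perturb(y, z))
```

Because the clr coordinates always sum to zero, a further change of basis is needed to obtain D-1 free coordinates; that step is not shown here.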
The general approach in analysing acomp objects is thus to perform classical multivariate analysis on clr/alr/ilr-transformed coordinates and to backtransform or display the results in such a way that they can be interpreted in terms of the original compositional parts.
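As a minimal sketch of this transform, analyse, back-transform pattern, the Python example below (an illustration under stated assumptions, not the package's R code) computes a compositional center: the rows are clr-transformed, averaged with the ordinary arithmetic mean, and mapped back to a closed composition. The dataset and the helper names `closure`, `clr` and `clr_inv` are hypothetical.

```python
import math

def closure(x, total=1.0):
    """Rescale positive amounts so that they sum to `total`."""
    s = sum(x)
    return [total * v / s for v in x]

def clr(x):
    """Centered log-ratio transform: logs minus their mean."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def clr_inv(z):
    """Back-transform clr coordinates to a closed composition."""
    return closure([math.exp(v) for v in z])

# Hypothetical dataset: one composition per row.
data = [
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
    [0.4, 0.1, 0.5],
]

# Classical analysis (here: the arithmetic mean) on transformed coordinates.
coords = [clr(row) for row in data]
n = len(coords)
mean_coords = [sum(col) / n for col in zip(*coords)]

# Back-transform so the result reads as a composition again.
center = clr_inv(mean_coords)

# Cross-check: the same result as the closed component-wise geometric mean.
geometric = closure([math.prod(col) ** (1.0 / n) for col in zip(*data)])
```

The back-transformed mean is again a composition that sums to 1, so it can be interpreted in terms of the original parts.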
A side effect of the procedure is that the compositions are forced to sum to a fixed total, which is done by the closure operation.
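The closure step itself is simple. The following Python sketch (a hypothetical helper named `closure`, not the package's R function) rescales each row of raw amounts so that it sums to the chosen total, typically 1 or 100.

```python
def closure(x, total=1.0):
    """Rescale positive amounts so that they sum to `total`."""
    s = sum(x)
    if s <= 0:
        raise ValueError("amounts must have a positive sum")
    return [total * v / s for v in x]

# Hypothetical raw amounts for one sample (sum is 60).
raw = [12.0, 30.0, 18.0]
closed_unit = closure(raw)                   # sums to 1
closed_percent = closure(raw, total=100.0)   # sums to 100

# A dataset is closed row by row, one composition per row.
rows = [[12.0, 30.0, 18.0], [1.0, 1.0, 2.0]]
closed_rows = [closure(row) for row in rows]
```

Closure changes only the total, not the ratios between parts, so it does not affect any log-ratio based analysis.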
a vector of class
"acomp" representing one closed composition,
or a matrix of class
"acomp" representing multiple closed compositions, each in one row.
The policy for treating zeros, missing values and values below the detection limit is explained in depth in compositions.missing.
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
Aitchison, J. (1986) The Statistical Analysis of Compositional
Data. Monographs on Statistics and Applied Probability. Chapman &
Hall Ltd., London (UK). 416p.
Aitchison, J., C. Barceló-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn
(2002) A concise guide to the algebraic geometric structure of the
simplex, the sample space for compositional data analysis, Terra
Nostra, Schriften der Alfred Wegener-Stiftung, 03/2003
Billheimer, D., P. Guttorp, and W.F. Fagan (2001) Statistical interpretation of species composition,
Journal of the American Statistical Association, 96 (456), 1205-1214
Chayes, F. (1960). On correlation between variables of constant sum. Journal of Geophysical Research 65(12), 4185–4193.
Pawlowsky-Glahn, V. and J.J. Egozcue (2001) Geometric approach to
statistical analysis on the simplex. SERRA 15(5), 384-398
Pawlowsky-Glahn, V. (2003) Statistical modelling on coordinates. In:
Thió-Henestrosa, S. and Martín-Fernández, J.A. (Eds.)
Proceedings of the 1st International Workshop on Compositional Data Analysis,
Universitat de Girona, ISBN 84-8458-111-X, http://ima.udg.es/Activitats/CoDaWork03
Mateu-Figueras, G. and Barceló-Vidal, C. (Eds.)
Proceedings of the 2nd International Workshop on Compositional Data Analysis,
Universitat de Girona, ISBN 84-8458-222-1, http://ima.udg.es/Activitats/CoDaWork05
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to analyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.