knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
In this vignette we show how to define log-ratio coordinates using the coda.base package and its function coordinates(), which takes two parameters: X, a composition, and basis, which defines the independent log-contrasts used to build the coordinates.
Throughout, we work with a subcomposition of the results obtained in different regions of Catalonia in the 2017 parliamentary elections:
library(coda.base)
data('parliament2017')
X = parliament2017[,c('erc','jxcat','psc','cs')]
In coda.base, the alr coordinates are accessible by setting the parameter basis='alr' or by using the basis-building function alr_basis().
If you are happy to have the last part in the denominator, the easiest way to define alr coordinates is to set basis='alr':
H1.alr = coordinates(X, basis = 'alr')
head(H1.alr)
This defines alr coordinates where the last part is used in the denominator. We can obtain the basis used to build the coordinates with the function basis():
basis(H1.alr)
The same basis can be reproduced using the function alr_basis():
alr_basis(dim = 4)
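To make the link between the basis and the coordinates concrete, here is a minimal base R sketch that builds an alr log-contrast matrix by hand and applies it to a small made-up composition. The names X_toy, B_alr, H_by_matrix and H_by_ratio are illustrative only, and the layout of B_alr (identity block on top, a row of -1 for the last part) is assumed to match what alr_basis(dim = 4) prints.

```r
# Illustrative 3 x 4 composition (made-up numbers, not the parliament2017 data)
X_toy <- matrix(c(0.10, 0.20, 0.30, 0.40,
                  0.40, 0.30, 0.20, 0.10,
                  0.25, 0.25, 0.25, 0.25),
                ncol = 4, byrow = TRUE,
                dimnames = list(NULL, c("a", "b", "c", "d")))

# Assumed alr log-contrast matrix: +1 for each numerator part, -1 for the last part
D <- ncol(X_toy)
B_alr <- rbind(diag(D - 1), rep(-1, D - 1))

# Coordinates as the log-composition times the log-contrast matrix
H_by_matrix <- log(X_toy) %*% B_alr

# The same coordinates written directly as log-ratios against the last part
H_by_ratio <- log(X_toy[, -D] / X_toy[, D])

# Both constructions should agree up to rounding
all.equal(H_by_matrix, H_by_ratio, check.attributes = FALSE)
```

Each column of B_alr sums to zero, which is what makes it a log-contrast.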
In fact, the function alr_basis() lets us define any alr-like set of coordinates by specifying the numerator parts and the denominator part:
B.alr = alr_basis(dim = 4, numerator = c(4,2,3), denominator = 1)
B.alr
The resulting log-contrast matrix can be passed as the basis parameter of the coordinates() function:
H2.alr = coordinates(X, basis = B.alr)
basis(H2.alr)
Centered log-ratio (clr) coordinates can be built by setting the parameter basis='clr':
H.clr = coordinates(X, basis = 'clr')
head(H.clr)
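As a rough illustration of what clr coordinates are, the sketch below computes them by hand in base R: each log-part minus the row mean of the logs. X_toy, log_X and H_clr_manual are hypothetical names and the data are made up; this is not meant to reproduce coda.base's internals.

```r
# Illustrative 3 x 4 composition (made-up numbers)
X_toy <- matrix(c(0.10, 0.20, 0.30, 0.40,
                  0.40, 0.30, 0.20, 0.10,
                  0.25, 0.25, 0.25, 0.25),
                ncol = 4, byrow = TRUE)

# clr: log of each part minus the mean log of its row
# (equivalently, log of each part over the row's geometric mean)
log_X <- log(X_toy)
H_clr_manual <- log_X - rowMeans(log_X)

# Each row of clr coordinates sums to zero (up to rounding)
rowSums(H_clr_manual)
```

Because the rows sum to zero, the D clr coordinates carry only D - 1 dimensions of information, which is why alr and ilr coordinates have one column fewer.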
coda.base allows us to define a wide variety of ilr coordinates: principal component (PC) coordinates, user-specified balance coordinates, principal balance (PB) coordinates, and balanced coordinates (CoDaPack's default coordinates).
The default ilr coordinates used by coda.base are obtained by simply calling the function coordinates() without specifying a basis:
H1.ilr = coordinates(X)
head(H1.ilr)
The parameter basis is set to 'ilr' by default:
all.equal( coordinates(X, basis = 'ilr'), H1.ilr )
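What makes a set of coordinates "ilr" is that the log-contrast matrix is orthonormal. The sketch below builds one common orthonormal basis (a Helmert-type construction) in base R and checks that property. Whether coda.base's default ilr basis has exactly this form, possibly up to sign, is an assumption here; the function name helmert_like_basis is hypothetical.

```r
# One common orthonormal log-contrast basis for D parts (Helmert-type):
# column i contrasts the first i parts against part i + 1
helmert_like_basis <- function(D) {
  B <- matrix(0, nrow = D, ncol = D - 1)
  for (i in seq_len(D - 1)) {
    B[seq_len(i), i] <- sqrt(1 / (i * (i + 1)))
    B[i + 1, i]      <- -sqrt(i / (i + 1))
  }
  B
}

B_ilr <- helmert_like_basis(4)

# Columns sum to zero (log-contrasts) and are orthonormal
colSums(B_ilr)
crossprod(B_ilr)   # should be the 3 x 3 identity up to rounding
```

Any other orthonormal log-contrast basis, such as the PC or balance bases discussed next, is a rotation of this one.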
Other easily accessible coordinates are the principal component (PC) coordinates. The first PC coordinate is the log-contrast with the highest variance, the second is the log-contrast with the highest variance among those uncorrelated with the first, and so on:
H2.ilr = coordinates(X, basis = 'pc')
head(H2.ilr)
barplot(apply(H2.ilr, 2, var))
Note that the PC coordinates are uncorrelated, so the off-diagonal entries of their covariance matrix are zero:
cov(H2.ilr)
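The idea behind PC coordinates can be sketched in base R with a singular value decomposition of the centered clr data. This is an illustration on simulated data under the usual definition of compositional principal components, not coda.base's implementation; X_sim, B_pc and H_pc are hypothetical names.

```r
set.seed(1)
# Simulated positive data with 4 parts (illustrative only)
X_sim <- exp(matrix(rnorm(50 * 4), ncol = 4))

# clr transform, then column-centering
clr <- log(X_sim) - rowMeans(log(X_sim))
clr_centered <- scale(clr, center = TRUE, scale = FALSE)

# The first D - 1 right singular vectors act as the log-contrast basis
B_pc <- svd(clr_centered)$v[, 1:3]

# PC coordinates and their covariance matrix
H_pc <- log(X_sim) %*% B_pc
round(cov(H_pc), 10)   # off-diagonal entries should be zero
```

The diagonal of the covariance matrix holds the variances in non-increasing order, which is the pattern the barplot above displays.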
The principal balance (PB) coordinates are similar to PC coordinates, but with the restriction that the log-contrasts must be balances:
H3.ilr = coordinates(X, basis = 'pb')
head(H3.ilr)
barplot(apply(H3.ilr, 2, var))
Unlike PC coordinates, principal balances are in general correlated:
cor(H3.ilr)
Exact principal balances are hard to compute when the number of components is very high. coda.base can build PB approximations using different algorithms:
X100 = exp(matrix(rnorm(1000*100), ncol = 100))
PB1.ward = pb_basis(X100, method = 'cluster')
PB1.constrained = pb_basis(X100, method = 'constrained')
We can compare their performance (the variance explained by the first balance) with that of the first principal component:
PC_approx = coordinates(X100, cbind(pc_basis(X100)[,1], PB1.ward[,1], PB1.constrained[,1]))
# colnames() labels the columns whether the result is a matrix or a data frame
colnames(PC_approx) = c('PC', 'Ward', 'Constrained')
apply(PC_approx, 2, var)
Finally, coda.base can build the default CoDaPack basis, which is made of well-balanced balances, i.e. balances with (as nearly as possible) the same number of parts on each branch:
H4.ilr = coordinates(X, basis = 'cdp')
head(H4.ilr)
We can define the coordinates directly by providing the log-contrast matrix.
B = matrix(c(-1,  -1,    2,    0,
              1,   0, -0.5, -0.5,
             -0.5, 0.5,  0,    0), ncol = 3)
H1.man = coordinates(X, basis = B)
head(H1.man)
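For a matrix to define log-contrasts, each of its columns must sum to zero; this is what makes the resulting coordinates invariant to rescaling the composition. The base R sketch below checks both facts for the matrix used above, on made-up data (X_toy, H and H_scaled are illustrative names).

```r
# The log-contrast matrix from the text
B <- matrix(c(-1,  -1,    2,    0,
               1,   0, -0.5, -0.5,
              -0.5, 0.5,  0,    0), ncol = 3)

# Every column sums to zero, so each column is a log-contrast
colSums(B)

# Illustrative composition (made-up numbers)
X_toy <- matrix(c(0.10, 0.20, 0.30, 0.40,
                  0.40, 0.30, 0.20, 0.10,
                  0.25, 0.25, 0.25, 0.25),
                ncol = 4, byrow = TRUE)

# Coordinates computed by hand, before and after rescaling the data
H        <- log(X_toy) %*% B
H_scaled <- log(1000 * X_toy) %*% B

# Rescaling does not change the coordinates (up to rounding)
max(abs(H - H_scaled))
```

Note that this B is neither orthogonal nor normalised; the sketch only relies on its columns summing to zero.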
We can also define balances using formulas of the form numerator~denominator:
B.man = sbp_basis(list(b1 = erc~jxcat,
                       b2 = psc~cs,
                       b3 = erc+jxcat~psc+cs),
                  data=X)
H2.man = coordinates(X, basis = B.man)
head(H2.man)
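For reference, a balance between r numerator parts and s denominator parts is usually defined as sqrt(r*s/(r+s)) times the log of the ratio of their geometric means. The base R sketch below evaluates a balance of the form erc+jxcat~psc+cs on a single made-up observation, both from that formula and as a log-contrast. It assumes sbp_basis() uses this standard normalisation; gm, x, r, s and the result names are hypothetical.

```r
# Geometric mean helper
gm <- function(v) exp(mean(log(v)))

# One made-up observation with the four parts used in the text
x <- c(erc = 0.1, jxcat = 0.2, psc = 0.3, cs = 0.4)

# Balance of (erc, jxcat) against (psc, cs): r = 2 numerator, s = 2 denominator parts
r <- 2
s <- 2
by_formula <- sqrt(r * s / (r + s)) *
  log(gm(x[c("erc", "jxcat")]) / gm(x[c("psc", "cs")]))

# The same balance as a log-contrast with unit-norm coefficients
b <- c(rep( sqrt(s / (r * (r + s))), r),
       rep(-sqrt(r / (s * (r + s))), s))
by_vector <- sum(b * log(x))

all.equal(unname(by_formula), by_vector)
```

The coefficient vector b sums to zero and has unit length, which is what allows a full set of balances from a sequential binary partition to form an orthonormal basis.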
With sbp_basis(), the balances we define need not form a basis, nor even a generating system. We can give fewer balances than a basis requires:
B = sbp_basis(list(b1 = erc+jxcat~psc+cs), data=X)
H3.man = coordinates(X, basis = B)
head(H3.man)
or
B = sbp_basis(list(b1 = erc~jxcat+psc+cs,
                   b2 = jxcat~erc+psc+cs,
                   b3 = psc~erc+jxcat+cs,
                   b4 = cs~erc+jxcat+psc),
              data=X)
H4.man = coordinates(X, basis = B)
head(H4.man)
If we specify only some of the partitions, sbp_basis() can complete the sequential binary partition for us when fill = TRUE:
B = sbp_basis(list(b1 = erc+jxcat~psc), data=X, fill = TRUE)
sign(B)
We can also define a sequential binary partition using a sign matrix. With a matrix we do not need to supply a dataset: the number of components is taken from the number of rows, and the component names from the row names (if available).
P = matrix(c(1,  1, -1, -1,
             1, -1,  0,  0,
             0,  0,  1, -1), ncol = 3)
B = sbp_basis(P)
H5.man = coordinates(X, basis = B)
head(H5.man)
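A sign matrix such as P only records which parts go in the numerator (+1), which in the denominator (-1), and which are left out (0). The sketch below shows, in base R, one way such a matrix can be turned into an orthonormal balance basis using the standard balance coefficients. It is an assumption that sbp_basis() produces the same values; sign_to_balance and B_manual are hypothetical names.

```r
# Turn one column of signs into unit-norm balance coefficients
sign_to_balance <- function(sign_vec) {
  r <- sum(sign_vec > 0)   # number of numerator parts
  s <- sum(sign_vec < 0)   # number of denominator parts
  b <- numeric(length(sign_vec))
  b[sign_vec > 0] <-  sqrt(s / (r * (r + s)))
  b[sign_vec < 0] <- -sqrt(r / (s * (r + s)))
  b
}

# The sign matrix from the text
P <- matrix(c(1,  1, -1, -1,
              1, -1,  0,  0,
              0,  0,  1, -1), ncol = 3)

# Apply the conversion column by column
B_manual <- apply(P, 2, sign_to_balance)

# A complete sequential binary partition gives an orthonormal basis
crossprod(B_manual)   # should be the 3 x 3 identity up to rounding
```

The signs of B_manual match P by construction, mirroring the sign(B) check shown earlier.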