View source: R/kernel_functions.R
cLinear | R Documentation |
'cLinear()' is the compositional-linear kernel, intended for compositional data (relative frequencies or proportions). 'Aitchison()' is the counterpart of the RBF kernel for this type of data. Both kernels expect a matrix or data.frame of non-negative or, preferably, strictly positive numbers. The input has dimension NxD, with N>1 samples and D>1 compositional features.
cLinear(X, cos.norm = FALSE, feat_space = FALSE, zeros = "none")
Aitchison(X, g = NULL, zeros = "none")
X | Matrix or data.frame that contains the compositional data. |
cos.norm | Should the resulting kernel matrix be cosine normalized? (Default: FALSE). |
feat_space | If FALSE, only the kernel matrix is returned. Otherwise, the feature space is also returned. (Default: FALSE). |
zeros | "none" to indicate that X contains no zeroes; "pseudo" to replace zeroes with a pseudocount. (Default: "none"). |
g | Gamma hyperparameter. If g=0 or NULL, the matrix of squared Aitchison distances is returned instead of the Aitchison kernel matrix. (Default: NULL). |
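Cosine normalization, as selected by cos.norm, usually means dividing each kernel entry by the geometric mean of the two corresponding diagonal entries, so that the diagonal becomes 1. The following is a minimal base R sketch under that assumption; the helper 'cos_normalize' and the toy matrices are hypothetical and not part of the package.

```r
# Hypothetical helper (not a package function):
# K_cos[i, j] = K[i, j] / sqrt(K[i, i] * K[j, j])
cos_normalize <- function(K) {
  d <- sqrt(diag(K))
  K / outer(d, d)
}

# Toy kernel matrix built as a cross-product of illustrative data (2 samples x 3 features).
X <- matrix(c(1,   2, 0.5,
              0.3, 1, 4),
            nrow = 2, byrow = TRUE)
K <- tcrossprod(X)

K_cos <- cos_normalize(K)
diag(K_cos)  # the diagonal is 1 (up to floating point) after normalization
```

After this rescaling every off-diagonal entry lies in [-1, 1], which makes kernel matrices computed from differently scaled samples easier to compare.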
In compositional data, samples (rows) sum to an arbitrary or irrelevant number. This is clearest when working with relative frequencies, as all samples add up to 1 (or 100, or another uninformative value). Zeroes are a typical challenge in compositional approaches. They introduce ambiguity because they can have multiple causes: a zero may signal a true absence, or a value so small that it falls below the detection threshold of an instrument. A simple way to deal with zeroes is to replace them with a pseudocount. More sophisticated approaches are reviewed elsewhere; see for instance the R package 'zCompositions'.
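The ideas above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: it assumes that the pseudocount is added to every entry, that the compositional-linear kernel is the inner product of centered log-ratio (clr) transformed samples, and that the Aitchison kernel is exp(-g * d^2), with d the Aitchison distance. The names 'toy' and 'clr_sketch', the pseudocount of 1, and the value of g are hypothetical.

```r
# Toy count table (hypothetical data): 3 samples x 4 features, with zeroes.
toy <- matrix(c(10,  0,  5, 85,
                 3,  7,  0, 90,
                20, 20, 20, 40),
              nrow = 3, byrow = TRUE)

# Assumed pseudocount strategy: add 1 to every entry so that no zero remains.
toy_pseudo <- toy + 1

# Centered log-ratio transform: log of each entry minus the row mean of the logs.
clr_sketch <- function(X) {
  logX <- log(X)
  logX - rowMeans(logX)
}
Z <- clr_sketch(toy_pseudo)

# Linear kernel on clr-transformed rows (assumed form of the compositional-linear kernel).
K_lin <- tcrossprod(Z)

# Squared Aitchison distances: squared Euclidean distances between clr rows.
D2 <- as.matrix(dist(Z))^2

# RBF-type kernel on those distances (assumed form of the Aitchison kernel).
g <- 0.01
K_ait <- exp(-g * D2)
```

Because the clr transform only uses ratios between features, rescaling a sample (for example from counts to proportions) leaves both matrices unchanged once the zeroes have been handled.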
Kernel matrix (dimension: NxN).
Ramon, E., Belanche-Muñoz, L., et al. (2021). kernInt: A kernel framework for integrating supervised and unsupervised analyses in spatio-temporal metagenomic datasets. Frontiers in Microbiology, 12, 609048. doi: 10.3389/fmicb.2021.609048
data <- soil$abund
## This data is sparse and contains many zeroes. We can replace them with pseudocounts:
Kclin <- cLinear(data, zeros = "pseudo")
Kclin[1:5, 1:5]
## With the feature space:
Kclin <- cLinear(data, zeros = "pseudo", feat_space = TRUE)
## With cosine normalization:
Kcos <- cLinear(data, zeros = "pseudo", cos.norm = TRUE)
Kcos[1:5, 1:5]
## Aitchison kernel:
Kait <- Aitchison(data, g = 0.0001, zeros = "pseudo")
Kait[1:5, 1:5]