MetICA: MetICA simulations on metabolomics data


Description

The main function for running MetICA simulations on a samples × variables (n × p) metabolomics data matrix.

Usage

MetICA(X, pcs = 15, max_iter = 400, boot.prop = 0.3, max.cluster = 20,
  trends = T, verbose = T)

Arguments

X

A numeric matrix obtained from metabolomics experiments. Its dimensions should be n (samples) × p (metabolic features); it may be centered or not. Missing values are not allowed. Normalized data is not recommended for MetICA!
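The requirements on X can be checked before launching a run. The following is a minimal sketch in base R, not part of the package: the helper name check_metica_input and the toy matrix X_demo are hypothetical, and the optional centering step assumes that column centering without scaling is what "centered" means here.

```r
# Hypothetical helper (not part of MetICA): verify that X is a numeric
# matrix without missing values, as required by the X argument.
check_metica_input <- function(X) {
  if (!is.matrix(X) || !is.numeric(X)) stop("X must be a numeric matrix")
  if (anyNA(X)) stop("X contains missing values")
  invisible(TRUE)
}

# Toy n x p matrix used only for illustration
X_demo <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)
check_metica_input(X_demo)

# Optional column centering; scale = FALSE leaves the variances untouched
X_centered <- scale(X_demo, center = TRUE, scale = FALSE)
```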

pcs

Number of principal components used to whiten the data before ICA, and also the number of components estimated in each IPCA run. It should be at least 3. Its value can be modified after the function is launched and the percentage of variance explained has been calculated. We recommend that PCA whitening keep at least 80 percent of the total variance.
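To get a rough idea of a suitable pcs value beforehand, the cumulative percentage of variance can be computed with base R's prcomp. This is a minimal sketch and not MetICA's internal code: the helper name min_pcs_for_variance and the simulated matrix X_demo are hypothetical, and it assumes whitening is based on PCA of the centered, unscaled data.

```r
# Hypothetical helper: smallest number of principal components whose
# cumulative explained variance reaches `target` (a fraction in (0, 1)).
min_pcs_for_variance <- function(X, target = 0.8) {
  pca <- prcomp(X, center = TRUE, scale. = FALSE)
  cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
  which(cum_var >= target)[1]
}

set.seed(1)
# Simulated n x p data (40 samples, 200 features), for illustration only
X_demo <- matrix(rnorm(40 * 200), nrow = 40, ncol = 200)
k <- min_pcs_for_variance(X_demo, target = 0.8)

# This help page asks for pcs >= 3, so do not go below that
pcs_suggested <- max(3, k)
print(pcs_suggested)
```

The suggested value is only a starting point; the function itself reports the variance explained and allows pcs to be changed afterwards.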

max_iter

Number of IPCA iterations. It should be at least 50 to provide reliable results. More than 500 runs can lead to long computation times. To avoid memory issues, the total number of estimates (pcs × max_iter) must be under 25 000.
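The limits on pcs and max_iter stated above can be checked together before a long run. The sketch below is illustrative only: check_metica_budget is a hypothetical helper, not a function of the package, and it simply encodes the three constraints from this help page.

```r
# Hypothetical pre-flight check: returns a character vector of violated
# constraints (empty if the settings are within the documented limits).
check_metica_budget <- function(pcs, max_iter) {
  problems <- character(0)
  if (pcs < 3) {
    problems <- c(problems, "pcs should be at least 3")
  }
  if (max_iter < 50) {
    problems <- c(problems, "max_iter should be at least 50")
  }
  if (pcs * max_iter >= 25000) {
    problems <- c(problems, "pcs x max_iter must be under 25000")
  }
  problems
}

# Default settings: 15 x 400 = 6000 estimates
default_check <- check_metica_budget(pcs = 15, max_iter = 400)

# 60 x 500 = 30000 estimates exceeds the documented limit
large_check <- check_metica_budget(pcs = 60, max_iter = 500)
```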

boot.prop

Proportion of samples replaced in bootstrap iterations (when X is resampled). It should not exceed 0.4.

max.cluster

The number of clusters in HCA of estimated components is evaluated from 2 to max.cluster. Its value can be modified in the function if one cluster contains fewer than 30 estimates.

trends

Boolean variable. TRUE if your observations are time-dependent (e.g. blood samples taken over a period of time from a patient).

verbose

Boolean variable. If TRUE, a message is printed when each stage of the algorithm completes.

Details

MetICA is a three-stage algorithm:

Value

A model (a list object) that contains the results from each stage of the simulation.

Author(s)

Youzhong Liu, Youzhong.Liu@uantwerpen.be

References

A. Hyvarinen and E. Oja, Independent Component Analysis: Algorithms and Applications, Neural Networks (2000) vol. 13 no. 4-5

Fangzhou Yao, Jeff Coquery and Kim-Anh Le Cao, Independent Principal Component Analysis for biologically meaningful dimension reduction of large biological data sets, BMC Bioinformatics (2012) vol. 13 no. 24

Youzhong Liu, Kirill Smirnov, Marianna Lucio, Regis D. Gougeon, Herve Alexandre and Philippe Schmitt-Kopplin, MetICA: independent component analysis for high-resolution mass-spectrometry based non-targeted metabolomics, BMC Bioinformatics (2016) vol. 17 no. 114

Examples

data(bacteria_peptides)
# Perform 100 IPCA simulations on centered metabolomics data:
M1 <- MetICA(bacteria_peptides$X, pcs = 20, max_iter = 100, boot.prop = 0.3,
             max.cluster = 40, trends = TRUE)
# Generate validation plots, along with the geometric index calculation,
# to help decide on the number of clusters:
validationPlot(M1)
# Based on the validation, we now choose 10 components:
M2 <- MetICA_extract_model(M1, 10, tops = 7)

daniellyz/MetICA2 documentation built on May 16, 2019, 11:11 p.m.