ipca: Independent Principal Component Analysis


Description

Performs Independent Principal Component Analysis (IPCA) on the given data matrix. IPCA is a combination of Principal Component Analysis (PCA) and Independent Component Analysis (ICA).

In PCA, the loading vectors indicate the importance of the variables in the principal components. In large biological data sets, the loading vectors should only assign large weights to important variables (genes, metabolites ...). That means the distribution of any loading vector should be super-Gaussian: most of the weights are very close to zero while only a few have large (absolute) values.

However, noise distorts the distribution of any loading vector and pushes it toward a Gaussian distribution, in line with the Central Limit Theorem. By maximizing the non-Gaussianity of the loading vectors using FastICA, we obtain loading vectors that are less affected by noise. We then project the original data matrix onto these denoised loading vectors to obtain independent principal components, which should in turn be less noisy and better able to cluster the samples according to the biological treatment (note that IPCA is an unsupervised approach).
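The contrast between a super-Gaussian loading vector and a noise-distorted one can be illustrated with excess kurtosis, which is 0 for a Gaussian and positive for peaked, heavy-tailed distributions. The following is a minimal base R sketch on simulated vectors; `excess_kurtosis` and all simulated data are hypothetical helpers introduced for illustration and are not part of the package.

```r
# Excess kurtosis: about 0 for Gaussian data, positive for super-Gaussian data.
# (Hypothetical helper, not a mixOmics function.)
excess_kurtosis <- function(v) {
  z <- v - mean(v)
  mean(z^4) / mean(z^2)^2 - 3
}

set.seed(1)
p <- 2000

# Simulated sparse loading vector: 20 influential variables, the rest exactly zero.
sparse_loading <- numeric(p)
sparse_loading[sample(p, 20)] <- rnorm(20)

# The same vector after Gaussian noise has been added to every weight.
noisy_loading <- sparse_loading + rnorm(p, sd = 0.5)

# A purely Gaussian vector for reference.
gaussian_loading <- rnorm(p)

k_sparse   <- excess_kurtosis(sparse_loading)
k_noisy    <- excess_kurtosis(noisy_loading)
k_gaussian <- excess_kurtosis(gaussian_loading)

print(c(sparse = k_sparse, noisy = k_noisy, gaussian = k_gaussian))
```

The sparse vector should show a very large excess kurtosis, while the noisy version should fall much closer to the Gaussian reference, which is the distortion that IPCA tries to undo.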

Algorithm

1. The original data matrix is centered.

2. PCA is used to reduce dimension and generate the loading vectors.

3. ICA (FastICA) is implemented on the loading vectors to generate independent loading vectors.

4. The centered data matrix is projected on the independent loading vectors to obtain the independent principal components.
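The four steps above can be sketched in base R as follows. This is an illustrative sketch under stated assumptions, not the package's implementation: `ipca_sketch` is a hypothetical function, the ICA step is a hand-written deflation FastICA with the logcosh contrast (derivative `tanh`), the loading vectors are standardized and whitened before ICA, and the initial weights are random rather than fixed.

```r
# Hypothetical sketch of the IPCA steps; not the mixOmics implementation.
ipca_sketch <- function(X, ncomp = 2, max.iter = 200, tol = 1e-4) {
  # Step 1: center each column (variable) of the data matrix.
  Xc <- scale(as.matrix(X), center = TRUE, scale = FALSE)

  # Step 2: PCA via SVD; the columns of V are the loading vectors (p x ncomp).
  V <- svd(Xc, nu = 0, nv = ncomp)$v

  # Step 3: FastICA (deflation, logcosh) on the loading vectors.
  # Each loading vector is treated as a signal observed over the p variables.
  S <- t(scale(V))                                        # ncomp x p, standardized
  p <- ncol(S)
  e <- eigen(tcrossprod(S) / p, symmetric = TRUE)
  K <- diag(1 / sqrt(e$values), ncomp) %*% t(e$vectors)   # whitening matrix
  Z <- K %*% S                                            # whitened signals
  W <- matrix(0, ncomp, ncomp)                            # unmixing matrix
  for (i in seq_len(ncomp)) {
    w <- rnorm(ncomp)
    w <- w / sqrt(sum(w^2))
    for (iter in seq_len(max.iter)) {
      u <- as.vector(w %*% Z)
      # Fixed-point update: E[z g(w'z)] - E[g'(w'z)] w, with g = tanh.
      w_new <- as.vector(Z %*% tanh(u)) / p - mean(1 - tanh(u)^2) * w
      if (i > 1) {
        # Gram-Schmidt against the rows already estimated.
        prev <- W[seq_len(i - 1), , drop = FALSE]
        w_new <- w_new - as.vector(t(prev) %*% (prev %*% w_new))
      }
      w_new <- w_new / sqrt(sum(w_new^2))
      converged <- abs(abs(sum(w_new * w)) - 1) < tol
      w <- w_new
      if (converged) break
    }
    W[i, ] <- w
  }
  loadings <- t(W %*% Z)                                  # p x ncomp
  loadings <- sweep(loadings, 2, sqrt(colSums(loadings^2)), "/")

  # Step 4: project the centered data onto the independent loading vectors.
  list(x = Xc %*% loadings, loadings = loadings, unmixing = W)
}

# Usage on simulated data: 30 samples, 100 variables.
set.seed(42)
X_sim <- matrix(rnorm(30 * 100), nrow = 30, ncol = 100)
res <- ipca_sketch(X_sim, ncomp = 3)
dim(res$x)
```

In this sketch the independent loading vectors come out with unit norm and are mutually orthogonal, because the whitened signals are rotated by an orthogonal unmixing matrix. The package function may normalize or scale differently.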

Usage

## S4 method for signature 'ANY'
ipca(X, ncomp = 2, mode = c("deflation", "parallel"),
fun = c("logcosh", "exp", "kur"), scale = FALSE, w.init = NULL,
max.iter = 200, tol = 1e-04)

## S4 method for signature 'MultiAssayExperiment'
ipca(X, ncomp = 2, ..., assay = NULL)

Arguments

X

A numeric matrix (or data frame) which provides the data for the principal components analysis. It can contain missing values. Alternatively, a MultiAssayExperiment object.

ncomp

Integer. If the data are complete, ncomp sets the number of components and associated eigenvalues to display from the pcasvd algorithm; if the data have missing values, ncomp gives the number of components to keep when reconstituting the data with the NIPALS algorithm. If NULL, the function sets ncomp = min(nrow(X), ncol(X)).

...

Arguments passed to the generic.

assay

Name or index of an assay from X.

mode

Character string. The type of algorithm to use when estimating the unmixing matrix; one of "deflation" or "parallel".

fun

The function used to approximate negentropy in the FastICA algorithm; one of "logcosh", "exp" or "kur". Defaults to "logcosh"; see the FastICA documentation for details.

w.init

initial un-mixing matrix (unlike FastICA, this matrix is fixed here).
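To make the mode and fun arguments more concrete, the sketch below shows the standard FastICA building blocks they usually refer to: the derivatives of the three contrast functions, the Gram-Schmidt step of the deflation scheme, and the symmetric decorrelation of the parallel scheme. The forms follow the general FastICA literature (Hyvarinen and Oja, 2000). Whether ipca uses exactly these forms, for example an extra scaling constant inside tanh, is an assumption, and every name below is a hypothetical helper.

```r
# Derivatives g(u) of the usual FastICA contrast functions (assumed forms).
contrast_derivative <- list(
  logcosh = function(u) tanh(u),             # G(u) = log(cosh(u))
  exp     = function(u) u * exp(-u^2 / 2),   # G(u) = -exp(-u^2 / 2)
  kur     = function(u) u^3                  # G(u) = u^4 / 4 (kurtosis)
)

# "deflation": estimate one row at a time and remove its projection
# onto the rows already found (Gram-Schmidt), then renormalize.
deflate <- function(w, W_prev) {
  w <- w - as.vector(t(W_prev) %*% (W_prev %*% w))
  w / sqrt(sum(w^2))
}

# "parallel": update all rows together, then decorrelate them symmetrically,
# W <- (W W^T)^(-1/2) W.
symmetric_decorrelate <- function(W) {
  e <- eigen(tcrossprod(W), symmetric = TRUE)
  e$vectors %*% diag(1 / sqrt(e$values), nrow(W)) %*% t(e$vectors) %*% W
}

set.seed(7)
W0 <- matrix(rnorm(9), 3, 3)

# Parallel scheme: all rows become orthonormal at once.
W_par <- symmetric_decorrelate(W0)

# Deflation scheme: the second row is made orthogonal to the first.
w1 <- W0[1, ] / sqrt(sum(W0[1, ]^2))
w2 <- deflate(W0[2, ], matrix(w1, nrow = 1))

round(tcrossprod(W_par), 6)
```

Deflation fixes earlier components before estimating later ones, so estimation errors accumulate in the later rows; symmetric decorrelation treats all rows equally.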

Value

ipca returns a list with class "ipca" containing the following components:

ncomp

the number of independent principal components used.

unmixing

the unmixing matrix of size (ncomp x ncomp)

mixing

the mixing matrix of size (ncomp x ncomp)

X

the centered data matrix

x

the independent principal components

loadings

the independent loading vectors

kurtosis

the kurtosis measure of the independent loading vectors

Author(s)

Fangzhou Yao and Jeff Coquery.

References

Yao, F., Coquery, J. and Lê Cao, K.-A. (2011) Principal component analysis with independent loadings: a combination of PCA and ICA. (in preparation)

A. Hyvarinen and E. Oja (2000) Independent Component Analysis: Algorithms and Applications, Neural Networks, 13(4-5):411-430

J L Marchini, C Heaton and B D Ripley (2010). fastICA: FastICA Algorithms to perform ICA and Projection Pursuit. R package version 1.1-13.

See Also

sipca, pca, plotIndiv, plotVar, and http://www.mixOmics.org for more details.

Examples

#' \dontrun{
## successful: TRUE

library(mixOmics.data)

# implement IPCA on a microarray dataset
ipca.res <- ipca(liver.toxicity$gene, ncomp = 3, mode="deflation")
ipca.res

# samples representation
plotIndiv(ipca.res, ind.names = as.character(liver.toxicity$treatment[, 4]),
          group = as.numeric(as.factor(liver.toxicity$treatment[, 4])))

# example with MultiAssayExperiment class
# --------------------------------

ipca.res <- ipca(liver.toxicity.mae, assay='gene', ncomp = 3, mode="deflation")
ipca.res



plotIndiv(ipca.res, cex = 1,
          col = as.numeric(as.factor(liver.toxicity$treatment[, 4])), style = "3d")

# variables representation with cutoff
plotVar(ipca.res, cex = 1, cutoff = 0.5)

## 3d
plotVar(ipca.res, rad.in = 0.5, cex = 0.5, style = "3d", cutoff = 0.8)

  #' }

ajabadi/mixOmics2 documentation built on Aug. 9, 2019, 1:08 a.m.