Principal component analysis for the ‘YuGene’ class.

Description

Performs a principal components analysis using the pca function of the mixOmics package. If the argument study is given, the data are centered by study before the analysis is performed.

Usage

## S3 method for class 'YuGene'
pca(X, study, ncomp = 2, center = TRUE, scale = FALSE,
    max.iter = 500, tol = 1e-09, ...)

Arguments

X

a numeric matrix (or data frame) which provides the data for the principal components analysis. It can contain missing values.

study

Factor of the study effect.

ncomp

integer. If the data are complete, ncomp is the number of components and associated eigenvalues to display from the pcasvd algorithm. If the data contain missing values, ncomp is the number of components kept to reconstitute the data using the NIPALS algorithm. If NULL, the function sets ncomp = min(nrow(X), ncol(X)).

center

a logical value indicating whether the variables should be shifted to be zero centered. Alternatively, a vector of length equal to the number of columns of X can be supplied. The value is passed to scale.

scale

a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place. The default is FALSE for consistency with the prcomp function, but in general scaling is advisable. Alternatively, a vector of length equal to the number of columns of X can be supplied. The value is passed to scale.

max.iter

integer, the maximum number of iterations in the NIPALS algorithm.

tol

a positive real, the tolerance used in the NIPALS algorithm.

...

not used.
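
As a rough illustration of the center and scale arguments described above, the following sketch calls base R's scale function directly on a small made-up matrix. The toy data and the object names (X.toy, X.std, X.shift) are assumptions for illustration only; how pca forwards these values internally is not shown here.

```r
# Toy matrix (hypothetical data): 4 samples in rows, 2 variables in columns
X.toy <- matrix(c(1, 2, 3, 4,
                  10, 20, 30, 40), ncol = 2)

# Logical values: center each column on its mean and scale it to unit variance
X.std <- scale(X.toy, center = TRUE, scale = TRUE)
colMeans(X.std)       # column means are (numerically) zero
apply(X.std, 2, sd)   # column standard deviations are one

# Vector value: subtract a user-supplied value from each column, no scaling
X.shift <- scale(X.toy, center = c(1, 10), scale = FALSE)
X.shift[1, ]          # first row becomes zero for both columns
```

Supplying a vector is useful when the centering or scaling values come from another data set, for example when new samples must be placed on the same scale as previously analysed ones.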

Details

If the argument study is given, the data are centered per study prior to performing the PCA with the pca function of the mixOmics package. Otherwise, the PCA is performed on the input data X.
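
The per-study centering described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's own implementation: the toy data, the study labels and the object names (X.toy, study.toy, X.centered) are made up, and prcomp from the stats package stands in for the mixOmics pca function.

```r
# Toy data (hypothetical): 6 samples in rows, 3 genes in columns, two studies
set.seed(1)
X.toy <- matrix(rnorm(18), nrow = 6,
                dimnames = list(paste0("s", 1:6), paste0("g", 1:3)))
study.toy <- factor(c("A", "A", "A", "B", "B", "B"))

# Within each study, subtract the column means computed on that study only
X.centered <- X.toy
for (s in levels(study.toy)) {
  idx <- which(study.toy == s)
  X.centered[idx, ] <- scale(X.toy[idx, , drop = FALSE],
                             center = TRUE, scale = FALSE)
}

# Each gene now sums (and averages) to zero within each study
round(rowsum(X.centered, study.toy), 10)

# PCA on the study-centered data; prcomp is used here as a stand-in
res.toy <- prcomp(X.centered, center = TRUE, scale. = FALSE)
```

Centering within each study removes study-specific shifts in the mean of each variable, so the leading components are less likely to separate samples by study alone.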

Value

Same outputs as the pca function from the mixOmics package.
pca returns a list with class "pca" and "prcomp" containing the following components:

ncomp

the number of principal components used.

sdev

the eigenvalues of the covariance/correlation matrix, though the calculation is actually done with the singular values of the data matrix or by using NIPALS.

rotation

the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors).

x

if retx is true the value of the rotated data (the centred (and scaled if requested) data multiplied by the rotation matrix) is returned.

center, scale

the centering and scaling used, or FALSE.

Author(s)

KA Le Cao, Translational Research Institute, The University of Queensland Diamantina Institute, Australia
Florian Rohart, Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St Lucia, Australia
Leo McHugh, Queensland Facility for Advanced Bioinformatics
Othmar Korn, Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St Lucia, Australia
Christine A. Wells, Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St Lucia, Australia

References

Kim-Anh Lê Cao, Florian Rohart, Leo McHugh, Othmar Korn, Christine A. Wells. YuGene: A simple approach to scale gene expression data derived from different platforms for integrated analyses. Genomics. http://dx.doi.org/10.1016/j.ygeno.2014.03.001.

Examples

#load data
data(array)

# YuGene() is applied with the samples in columns; the result is transposed
# back so that the samples are in rows for pca()
YuGene.data = t(YuGene(t(array$data.all)))

#PCA on YuGene data, centered by study
res.pca.yugene.center = pca(YuGene.data, ncomp = 3, scale = TRUE,
            center = TRUE, study = array$experiment.all)
expl.var = round(res.pca.yugene.center$sdev/sum(res.pca.yugene.center$sdev),4)*100

#plot of the results, one color per cell-type, one shape per study
plot(res.pca.yugene.center$x[,1],res.pca.yugene.center$x[,2],
            pch = as.numeric(array$experiment.all),
            col = as.numeric(array$type.all)+1, lwd = 2,
            cex = 1.5, cex.lab = 1.5,xlab=paste("PC1:",expl.var[1],"%"),
            ylab=paste("PC2:",expl.var[2],"%"))
title(paste('YuGene multi group data'), cex.main = 1.5)

#PCA on YuGene data, not centered by study
res.pca.yugene = pca(YuGene.data, ncomp = 3, scale = TRUE, center = TRUE)
expl.var = round(res.pca.yugene$sdev/sum(res.pca.yugene$sdev),4)*100

#plot of the results, one color per cell-type, one shape per study
plot(res.pca.yugene$x[,1],res.pca.yugene$x[,2],
            pch = as.numeric(array$experiment.all),
            col = as.numeric(array$type.all)+1, lwd = 2,
            cex = 1.5, cex.lab = 1.5,xlab=paste("PC1:",expl.var[1],"%"),
            ylab=paste("PC2:",expl.var[2],"%"))
title(paste('YuGene data'), cex.main = 1.5)