tune.pca: Tune the number of principal components in PCA


View source: R/tune.pca.R

Description

tune.pca can be used to quickly visualise the proportion of explained variance for a large number of principal components in PCA.

Usage

tune.pca(
  X,
  ncomp = NULL,
  center = TRUE,
  scale = FALSE,
  max.iter = 500,
  tol = 1e-09,
  logratio = c("none", "CLR", "ILR"),
  V = NULL,
  multilevel = NULL
)

Arguments

X

a numeric matrix (or data frame) which provides the data for the principal components analysis. It can contain missing values.

ncomp

integer, the number of components to initially analyse in tune.pca in order to choose a final ncomp for pca. If NULL, the function sets ncomp = min(nrow(X), ncol(X)).

center

a logical value indicating whether the variables should be shifted to be zero centered. Alternatively, a vector of length equal to the number of columns of X can be supplied. The value is passed to scale.

scale

a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place. The default is FALSE for consistency with the prcomp function, but in general scaling is advisable. Alternatively, a vector of length equal to the number of columns of X can be supplied. The value is passed to scale.

max.iter

integer, the maximum number of iterations for the NIPALS algorithm.

tol

a positive real, the tolerance used for the NIPALS algorithm.

logratio

one of 'none', 'CLR' or 'ILR'. Defaults to 'none'.

V

Matrix used in the logratio transformation, if provided.

multilevel

Design matrix for multilevel analysis (for repeated measurements).

Details

The calculation is done either by a singular value decomposition of the (possibly centered and scaled) data matrix if the data are complete, or by the NIPALS algorithm if there are missing values. Unlike princomp, the print method for these objects prints the results in a nice format, and the plot method produces a bar plot of the percentage of variance explained by the principal components (PCs).
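The link between the singular value decomposition and the reported standard deviations can be illustrated with base R alone. The sketch below is not the tune.pca source: it assumes complete data, uses the built-in USArrests data set purely for illustration, and compares the result with prcomp.

```r
# Illustrative data only: USArrests ships with base R and has no missing values.
X <- as.matrix(USArrests)

# Centre and scale the columns, as center = TRUE and scale = TRUE would.
Xs <- scale(X, center = TRUE, scale = TRUE)

# Singular value decomposition of the pre-processed data matrix.
sv <- svd(Xs)

# Singular values divided by sqrt(n - 1) give the component standard deviations.
sdev <- sv$d / sqrt(nrow(Xs) - 1)

# prcomp performs the same computation, so the two results should agree.
ref <- prcomp(X, center = TRUE, scale. = TRUE)
all.equal(sdev, unname(ref$sdev))
```

Squaring sdev gives the eigenvalues of the correlation matrix here, because the columns were scaled to unit variance.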

When using NIPALS (missing values), we assume that the first min(ncol(X), nrow(X)) principal components account for 100% of the explained variance.
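For readers unfamiliar with NIPALS, the following base R sketch shows the general idea for a single component and what max.iter and tol control. It is a simplified illustration, not the mixOmics implementation: the helper nipals_first, the toy data and the convergence criterion (squared change in the scores) are all assumptions made for this example.

```r
# Hypothetical helper: first principal component by NIPALS, skipping NAs.
nipals_first <- function(X, max.iter = 500, tol = 1e-09) {
  obs <- ifelse(is.na(X), 0, 1)            # 1 where a value is observed
  X0 <- X
  X0[is.na(X0)] <- 0                       # zeros drop out of the sums below
  t_old <- X0[, which.max(colSums(X0^2))]  # starting scores
  for (iter in seq_len(max.iter)) {
    # Loadings: regress each column on the scores over observed rows only.
    p <- drop(crossprod(X0, t_old) / crossprod(obs, t_old^2))
    p <- p / sqrt(sum(p^2))
    # Scores: regress each row on the loadings over observed columns only.
    t_new <- drop((X0 %*% p) / (obs %*% p^2))
    converged <- sum((t_new - t_old)^2) < tol
    t_old <- t_new
    if (converged) break
  }
  list(scores = t_old, loadings = p, iterations = iter)
}

# Toy data (made up): one dominant direction plus a little noise.
set.seed(1)
signal <- outer(rnorm(20), c(2, -1, 1, 0.5))
noise <- matrix(rnorm(80, sd = 0.1), 20, 4)
X <- scale(signal + noise, center = TRUE, scale = FALSE)
X[3, 2] <- NA                              # introduce one missing value

fit <- nipals_first(X)
fit$loadings
fit$iterations
```

Because each regression uses only the observed entries, no imputation is needed. On complete data the loadings coincide, up to sign, with the first right singular vector of X.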

Note that scale = TRUE cannot be used if there are zero or constant (for center = TRUE) variables.
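One way to guard against this is to look for zero-variance columns before requesting scale = TRUE. The base R sketch below uses a made-up toy matrix; it is only one possible check and is not part of tune.pca.

```r
# Toy matrix (made up for illustration): the third column is constant.
X <- cbind(a = c(1, 2, 3, 4), b = c(2, 1, 4, 3), c = c(5, 5, 5, 5))

# Column standard deviations; a constant column has standard deviation 0.
col_sd <- apply(X, 2, sd, na.rm = TRUE)
constant <- names(col_sd)[col_sd == 0]
constant

# Drop the offending columns before requesting scale = TRUE.
X_ok <- X[, col_sd > 0, drop = FALSE]
colnames(X_ok)
```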

Components are omitted if their standard deviations are less than or equal to comp.tol times the standard deviation of the first component. With the default null setting, no components are omitted. Other settings for comp.tol could be comp.tol = sqrt(.Machine$double.eps), which would omit essentially constant components, or comp.tol = 0.

The logratio transformation and the multilevel analysis are performed sequentially as internal pre-processing steps, through logratio.transfo and withinVariation respectively.
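As an illustration of what a centered logratio (CLR) transformation does, here is a base R sketch on made-up compositional data. It shows the textbook definition only and does not call logratio.transfo; how that function handles zeros or offsets is not covered here.

```r
# Toy compositional data (made up): each row is a sample of positive parts.
X <- rbind(s1 = c(10, 20, 70), s2 = c(30, 30, 40))

# CLR: log of each part minus the mean log of its row.
clr <- log(X) - rowMeans(log(X))

# Each transformed row sums to zero, up to floating point error.
rowSums(clr)
```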

Value

tune.pca returns a list with class "tune.pca" containing the following components:

sdev

the square roots of the eigenvalues of the covariance/correlation matrix (though the calculation is actually done with the singular values of the data matrix).

explained_variance

the proportion of explained variance accounted for by each principal component, calculated from the eigenvalues.

cum.var

the cumulative proportion of explained variance accounted for by the successive principal components, calculated as the running sum of the proportions of explained variance.
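The three components are related in a simple way, which the base R sketch below reproduces with prcomp on the built-in USArrests data. The data set is chosen only for illustration, and dividing by the sum of all eigenvalues is an assumption about the denominator that is consistent with the description above.

```r
# Illustrative data only: USArrests ships with base R.
pc <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Eigenvalues are the squared standard deviations.
eig <- pc$sdev^2

# Proportion of variance per component, relative to the total variance.
explained_variance <- eig / sum(eig)

# Running sum of the proportions.
cum.var <- cumsum(explained_variance)
round(cum.var, 3)
```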

Author(s)

Ignacio González, Leigh Coonan, Kim-Anh Le Cao, Fangzhou Yao, Florian Rohart, AL J Abadi

See Also

nipals, biplot, plotIndiv, plotVar and http://www.mixOmics.org for more details.

Examples
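The example code itself did not survive on this page. Judging from the Call line printed in the example output, it was presumably close to the following sketch. The object name tune is an assumption, and the mixOmics package must be installed for the code to run.

```r
library(mixOmics)

# liver.toxicity is the data set named in the Call line of the example output.
data(liver.toxicity)

# Same arguments as in the printed Call.
tune <- tune.pca(liver.toxicity$gene, center = TRUE, scale = TRUE)

# Printing the object lists the eigenvalues and the proportion and
# cumulative proportion of explained variance.
tune
```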


Example output

Loading required package: MASS
Loading required package: lattice
Loading required package: ggplot2

Loaded mixOmics 6.2.0

Visit http://www.mixOmics.org for more details about our methods.
Any bug reports or comments? Notify us at mixomics at math.univ-toulouse.fr or https://bitbucket.org/klecao/package-mixomics/issues

Thank you for using mixOmics!
Eigenvalues for the first 10 principal components, see object$sdev^2: 
      PC1       PC2       PC3       PC4       PC5       PC6       PC7       PC8 
874.77885 463.10098 248.15205 167.57834 141.95703 122.32390 108.76499  77.74595 
      PC9      PC10 
 74.18373  56.45566 

Proportion of explained variance for the first 10 principal components, see object$explained_variance: 
       PC1        PC2        PC3        PC4        PC5        PC6        PC7 
0.28073776 0.14862034 0.07963801 0.05377996 0.04555745 0.03925671 0.03490533 
       PC8        PC9       PC10 
0.02495056 0.02380736 0.01811799 

Cumulative proportion explained variance for the first 10 principal components, see object$cum.var: 
      PC1       PC2       PC3       PC4       PC5       PC6       PC7       PC8 
0.2807378 0.4293581 0.5089961 0.5627761 0.6083335 0.6475902 0.6824956 0.7074461 
      PC9      PC10 
0.7312535 0.7493715 

 Other available components: 
 -------------------- 
 loading vectors: see object$rotation 

Call:
 tune.pca(X = liver.toxicity$gene, center = TRUE, scale = TRUE) 

 for all principal components, see object$sdev, object$explained_variance and object$cum.var

mixOmics documentation built on Nov. 8, 2020, 11:12 p.m.