pca_tidiers: Tidying methods for a Principal Component Analysis
In jiho/autoplot: Automatic plotting with ggplot2


Extract diagnostics, coordinates on the principal components (i.e. row scores and column loadings), and some fit statistics from a Principal Component Analysis.

tidy(x, ...)

augment(x, data=NULL, dimensions=c(1,2), which="row", scaling=which, ...)

`x`	an object returned by a function performing Principal Component Analysis.
`...`	ignored.
`data`	the original dataset, to be concatenated with the output when extracting row scores. When `NULL` (the default), the data are extracted from the PCA object if it stores them (i.e. for all functions but `prcomp`).
`dimensions`	vector giving the indexes of the principal components to extract. Typically two are extracted to create a plot.
`which`	the type of coordinates in the new space to extract: either "rows", "lines", "observations", "objects", "individuals", "sites" (which are all treated as synonyms) or "columns", "variables", "descriptors", "species" (which are, again, synonyms). All can be abbreviated. By default, coordinates of rows are returned. Row coordinates are commonly called 'scores' and column coordinates are usually called 'loadings'.
`scaling`	scaling for the scores. Can be "none" (or 0) for raw scores, "rows" (or 1, or a synonym of "rows") to scale row scores by the eigenvalues, "columns" (or 2, or a synonym of "columns") to scale column scores by the eigenvalues, "both" (or 3) to scale both row and column scores. By default, scaling is adapted to the type of scores extracted (scaling 1 for row scores, scaling 2 for column scores, and scaling 3 when scores are extracted for a biplot).
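The synonym and abbreviation handling described for `which` can be pictured with base R partial matching. This is only a minimal sketch: the helper name `resolve_which` is hypothetical and is not claimed to be how the package resolves the argument internally.

```r
# Hypothetical helper: map any synonym (possibly abbreviated) to "row" or "col".
resolve_which <- function(which) {
  rows <- c("rows", "lines", "observations", "objects", "individuals", "sites")
  cols <- c("columns", "variables", "descriptors", "species")
  # match.arg() does partial matching and fails on ambiguous abbreviations
  choice <- match.arg(which, c(rows, cols))
  if (choice %in% rows) "row" else "col"
}

resolve_which("obs")          # matches "observations"
resolve_which("var")          # matches "variables"
resolve_which("descriptors")  # full name
```

With this approach an abbreviation such as "s" is rejected, because it could mean either "sites" or "species".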

Scaling of scores follows the conventions of package FactoMineR. In summary, scaling 0 yields unscaled scores. In scaling 1, row scores are multiplied by

sqrt(n * eig)

where n is the number of active rows in the ordination and eig are the eigenvalues. In scaling 2, column scores are multiplied by

sqrt(eig)

In scaling 3, both row and column scores are scaled as described above.
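The two multipliers can be tried by hand on a singular value decomposition. The sketch below rests on assumptions that this page does not state: the unscaled scores are taken to be the unit-norm singular vectors, and the eigenvalues are taken to be the squared singular values divided by n. It illustrates the formulas only and is not the package's implementation.

```r
X <- scale(USArrests)   # centre and scale the variables
n <- nrow(X)            # number of active rows
s <- svd(X)             # X = U D V'

# Assumption: eigenvalues are d^2 / n
eig <- s$d^2 / n

# Scaling 1: row scores (columns of U) multiplied by sqrt(n * eig)
row_scaled <- sweep(s$u, 2, sqrt(n * eig), `*`)

# Scaling 2: column scores (columns of V) multiplied by sqrt(eig)
col_scaled <- sweep(s$v, 2, sqrt(eig), `*`)

head(row_scaled)
col_scaled
```

Under these assumptions, the scaled row scores coincide with the projection of the data on the principal axes (`X %*% s$v`), and the squared norm of each scaled column axis equals the corresponding eigenvalue.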

For tidy, a data.frame containing the variance (i.e. eigenvalue), the proportion of variance, and the cumulative proportion of variance associated with each principal component.
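For a `prcomp` object, the same three quantities can be derived by hand from the standard deviations it stores. This is a minimal sketch; the column names below are illustrative and are not necessarily those returned by `tidy`.

```r
pca <- prcomp(USArrests, scale. = TRUE)

eig  <- pca$sdev^2       # variance carried by each principal component
prop <- eig / sum(eig)   # proportion of the total variance

# Illustrative column names, assumed for this sketch
pc_summary <- data.frame(
  PC                = seq_along(eig),
  variance          = eig,
  prop.variance     = prop,
  cum.prop.variance = cumsum(prop)
)
pc_summary
```

Because the variables are scaled to unit variance here, the variances sum to the number of variables.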

For augment, a data.frame containing the original data (when which="rows" and data is supplied or can be extracted from the object) and the additional columns:

.rownames: the identifier of the row or column, extracted from the row or column names in the original data.
.PC#: the coordinates of data objects on the extracted principal components.
.cos2: the squared cosine, summed over extracted PCs, which quantifies the quality of the representation of each data point in the space of the extracted PCs. NB: cos2 can only be computed when all possible principal components are kept in the PCA object; when this is not the case, cos2 is NA. In several packages, the number of principal components to keep is an argument of the PCA function (and the default is not "all").
.contrib: the contribution of each object to the selected PCs. NB: same comment as for cos2 regarding the number of PCs kept in the PCA object.
.type: the nature of the data extracted: row or col.
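To make the cos2 and contribution diagnostics concrete, here is a sketch using their usual definitions on a `prcomp` object, which keeps all principal components. It is an assumption that these definitions match the package's computation exactly; in particular, only the contribution to a single PC is shown, since how contributions are combined across several selected PCs is not specified here.

```r
pca    <- prcomp(USArrests, scale. = TRUE)
scores <- pca$x       # row scores on all principal components
dims   <- c(1, 2)     # principal components of interest

# cos2: share of each row's squared distance to the origin that is
# captured by the selected PCs (requires all PCs in the denominator)
cos2 <- rowSums(scores[, dims, drop = FALSE]^2) / rowSums(scores^2)

# Contribution of each row to PC1, as a percentage of that PC's inertia
contrib_pc1 <- 100 * scores[, 1]^2 / sum(scores[, 1]^2)

head(data.frame(cos2 = cos2, contrib_pc1 = contrib_pc1))
```

The denominator of cos2 is why all components must be kept: if the PCA object stores only the first few PCs, the full squared distance of each point cannot be recovered from the scores.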

Functions to perform PCA: prcomp in package stats, PCA in package FactoMineR, rda in package vegan, dudi.pca in package ade4, pca in package pcaMethods (on Bioconductor).

Other PCA-related functions: autoplot_pca, ca_tidiers

pca <- prcomp(USArrests, scale = TRUE)

tidy(pca)

head(augment(pca, which="row"))
head(augment(pca, which="col"))
# or use your preferred synonym, possibly abbreviated
head(augment(pca, which="obs"))
head(augment(pca, which="var"))
head(augment(pca, which="descriptors"))

# data is not contained in the `prcomp` object but can be provided
head(augment(pca, data=USArrests, which="row"))
# select different principal components
augment(pca, which="col", dimensions=c(2,3))

if (require("FactoMineR")) {
  pca <- FactoMineR::PCA(USArrests, graph=FALSE, ncp=4)
  head(augment(pca, which="individuals"))
  head(augment(pca, which="variables"))
}

if (require("vegan")) {
  pca <- vegan::rda(USArrests, scale=TRUE)
  # can use vegan's naming convention
  head(augment(pca, which="sites"))
  head(augment(pca, which="species"))
}

if (require("ade4")) {
  pca <- ade4::dudi.pca(USArrests, scannf=FALSE)
  head(augment(pca))
  head(augment(pca, which="variables"))
}

if (require("pcaMethods")) {
  pca <- pcaMethods::pca(USArrests, scale="uv")
  head(augment(pca))
  augment(pca, which="var")
}