calc_pca: Principal Components Analysis Calculation

calc_pca R Documentation

Description

[Experimental]

The calc_pca() function performs principal components analysis of the gene count vectors across all samples.

The corresponding autoplot() method can then be used to visualize the results.

Usage

calc_pca(object, assay_name = "counts", n_top = NULL)

Arguments

object

(AnyHermesData)
input.

assay_name

(string)
name of the assay to use.

n_top

(count or NULL)
filter criterion: the number of genes with the highest variance to keep for the analysis. With the default NULL, no such filtering is applied.

Details

  • PCA should be performed after low-quality genes and samples have been filtered out and the counts have been normalized.

  • In addition, genes with constant counts across all samples are excluded from the analysis internally in calc_pca(). Centering and scaling are also applied internally.

  • The results of a principal components analysis saved in a HermesDataPca object can be plotted with ggplot2::autoplot(), which uses the corresponding method from the ggfortify package. See ggfortify::autoplot.prcomp() for details.
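
The preprocessing steps described above can be illustrated with base R only. This is a minimal sketch on simulated data and not the internal code of calc_pca(): the matrix counts, the value of n_top and all intermediate object names are assumptions made for illustration.

```r
# Simulated matrix standing in for an assay: genes in rows, samples in columns.
set.seed(123)
counts <- matrix(
  rpois(6 * 8, lambda = 50),
  nrow = 6,
  dimnames = list(paste0("gene", 1:6), paste0("sample", 1:8))
)
counts["gene3", ] <- 10 # A gene with constant counts across all samples.

# Drop genes without any variation, since they cannot be scaled.
gene_vars <- apply(counts, 1, var)
counts_var <- counts[gene_vars > 0, , drop = FALSE]

# Optionally keep only the n_top genes with the highest variance.
n_top <- 3
top_genes <- names(sort(gene_vars[gene_vars > 0], decreasing = TRUE))[seq_len(n_top)]
counts_top <- counts_var[top_genes, , drop = FALSE]

# prcomp() expects samples in rows; center and scale each gene.
pca <- prcomp(t(counts_top), center = TRUE, scale. = TRUE)
summary(pca)
```

A constant gene has zero variance, so scaling it would mean dividing by zero; this is the usual reason for excluding such genes before a scaled PCA.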

Value

A HermesDataPca object, which is an extension of the stats::prcomp class.
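
Because the returned class extends stats::prcomp, it may help to recall what a plain prcomp object contains. The sketch below uses simulated data and base R only; it assumes, without confirmation from this page, that a HermesDataPca object keeps the standard prcomp elements.

```r
# Small simulated data set: 10 samples (rows) and 4 genes (columns).
set.seed(42)
x <- matrix(rnorm(40), nrow = 10, dimnames = list(NULL, paste0("gene", 1:4)))
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Standard elements of a prcomp object.
pca$sdev          # Standard deviations of the principal components.
dim(pca$rotation) # Loadings: genes in rows, components in columns.
dim(pca$x)        # Scores: samples in rows, components in columns.

# Percentage of variance explained by each principal component.
var_pct <- 100 * pca$sdev^2 / sum(pca$sdev^2)
round(var_pct, 1)
```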

See Also

Afterwards, correlations between principal components and sample variables can be calculated; see pca_cor_samplevar.
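
One common way to quantify the association between a principal component and a sample variable is the R-squared of a linear model of the component scores on that variable. The sketch below shows this on simulated data with base R; the variable group and the simulated genes are hypothetical, and this is not necessarily how pca_cor_samplevar computes its results.

```r
set.seed(1)
n <- 20
# Hypothetical sample variable: a two-level factor.
group <- factor(rep(c("A", "B"), each = n / 2))

# Simulated expression: gene1 depends on group, the other genes are noise.
x <- cbind(
  gene1 = rnorm(n, mean = ifelse(group == "A", 0, 3)),
  gene2 = rnorm(n),
  gene3 = rnorm(n)
)
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# R-squared from regressing the scores of each component on the sample variable.
r2 <- apply(pca$x, 2, function(pc) summary(lm(pc ~ group))$r.squared)
round(r2, 2)
```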

Examples

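library(hermes)

# Filter out low-quality genes and samples, then normalize the counts.
object <- hermes_data %>%
  add_quality_flags() %>%
  filter() %>%
  normalize()

# PCA on the "tpm" assay using all genes.
result <- calc_pca(object, assay_name = "tpm")
summary(result)

# PCA restricted to the 500 genes with the highest variance.
result1 <- calc_pca(object, assay_name = "tpm", n_top = 500)
summary(result1)

# Plot the results.
autoplot(result)
# Plot the second against the third principal component.
autoplot(result, x = 2, y = 3)
autoplot(result, variance_percentage = FALSE)
# Add non-overlapping sample labels.
autoplot(result, label = TRUE, label.repel = TRUE)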

insightsengineering/hermes documentation built on Sept. 19, 2024, 9:06 p.m.