pathway_pca: Perform Principal Component Analysis (PCA) on functional...

View source: R/pathway_pca.R

pathway_pca    R Documentation

Perform Principal Component Analysis (PCA) on functional pathway abundance data

Description

This function runs PCA on pathway abundance data and returns a composite plot: a scatter plot of the first two principal components (PC1 vs. PC2) with a marginal density plot for each PC. The plot shows how samples cluster and how they are distributed across groups.
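
The ordination step can be approximated in base R. The following is a minimal sketch, assuming that samples are treated as observations (so the pathway-by-sample matrix is transposed) and that stats::prcomp() with centering and scaling is an acceptable stand-in. The preprocessing actually used by pathway_pca is not stated on this page and may differ, and all object names below are illustrative.

```r
# Toy pathway-by-sample matrix (rows = pathways, columns = samples)
set.seed(1)
toy_abundance <- matrix(
  rnorm(30), nrow = 3, ncol = 10,
  dimnames = list(paste0("Pathway", LETTERS[1:3]), paste0("Sample", 1:10))
)

# prcomp() expects observations in rows, so transpose to samples x pathways
pca_fit <- prcomp(t(toy_abundance), center = TRUE, scale. = TRUE)

# Sample coordinates on the first two principal components
pc_scores <- as.data.frame(pca_fit$x[, 1:2])

# Percentage of total variance explained by each component
var_explained <- 100 * pca_fit$sdev^2 / sum(pca_fit$sdev^2)
```

pc_scores has one row per sample, which is the kind of table a PC1 vs. PC2 scatter plot is drawn from.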

Usage

pathway_pca(abundance, metadata, group, colors = NULL)

Arguments

abundance

A numeric matrix or data frame containing pathway abundance data. Rows represent pathways, columns represent samples. Column names must match the sample names in metadata. Values must be numeric and cannot contain missing values (NA).

metadata

A data frame containing sample information. Must include:

  • A column named "sample_name" matching the column names in abundance

  • A column for grouping samples (specified by the 'group' parameter)
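
Before calling the function, it can help to confirm that the two inputs describe the same samples. The sketch below uses only base R and toy objects with illustrative names. Whether pathway_pca reorders metadata internally is not stated on this page, so aligning the rows to the abundance column order is a precaution, not a documented requirement.

```r
# Toy inputs; object names are illustrative
toy_abundance <- matrix(
  1:12, nrow = 3,
  dimnames = list(paste0("Pathway", 1:3), paste0("Sample", 1:4))
)
toy_metadata <- data.frame(
  sample_name = c("Sample3", "Sample1", "Sample4", "Sample2"),
  group = c("B", "A", "B", "A")
)

# Samples present in one input but missing from the other
missing_in_metadata <- setdiff(colnames(toy_abundance), toy_metadata$sample_name)
missing_in_abundance <- setdiff(toy_metadata$sample_name, colnames(toy_abundance))

# Reorder metadata rows to follow the abundance column order
row_order <- match(colnames(toy_abundance), toy_metadata$sample_name)
toy_metadata <- toy_metadata[row_order, ]
```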

group

A character string specifying the column name in metadata that contains group information for samples (e.g., "treatment", "condition", "group").

colors

Optional. A character vector of colors for different groups. Length must match the number of unique groups. If NULL, default colors will be used.

Details

The function performs several validations on input data:

  • Abundance matrix must have at least 2 pathways and 3 samples

  • All values in abundance matrix must be numeric

  • Sample names must match between abundance and metadata

  • Group column must exist in metadata

  • If custom colors are provided, they must be valid color names or codes
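
The checks listed above can be mirrored in a pre-flight helper. This is a sketch only: check_pca_inputs is a hypothetical function, not part of ggpicrust2, and it assumes that "matching" sample names means the two sets are equal and that color validity can be tested with grDevices::col2rgb(). The package's own checks and error messages may differ.

```r
# Hypothetical helper mirroring the documented checks; not part of ggpicrust2
check_pca_inputs <- function(abundance, metadata, group, colors = NULL) {
  abundance <- as.matrix(abundance)
  if (nrow(abundance) < 2 || ncol(abundance) < 3) {
    stop("abundance needs at least 2 pathways and 3 samples")
  }
  if (!is.numeric(abundance) || anyNA(abundance)) {
    stop("abundance must be numeric with no missing values")
  }
  # Assumption: the two sets of sample names must be equal
  if (!"sample_name" %in% names(metadata) ||
      !setequal(colnames(abundance), metadata$sample_name)) {
    stop("sample names differ between abundance and metadata")
  }
  if (!group %in% names(metadata)) {
    stop("group column not found in metadata")
  }
  if (!is.null(colors)) {
    # col2rgb() raises an error for strings that are not valid R colors
    ok <- vapply(colors, function(cl) {
      !inherits(try(grDevices::col2rgb(cl), silent = TRUE), "try-error")
    }, logical(1))
    if (!all(ok)) {
      stop("invalid color: ", paste(colors[!ok], collapse = ", "))
    }
    if (length(colors) != length(unique(metadata[[group]]))) {
      stop("number of colors must equal the number of groups")
    }
  }
  invisible(TRUE)
}
```

Calling check_pca_inputs(abundance_data, metadata, "group") before pathway_pca() would surface a mismatched input early, with a message naming the failed check.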

Value

A ggplot object showing:

  • Center: PCA scatter plot with confidence ellipses (95%)

  • Top: Density plot for PC1

  • Right: Density plot for PC2

Examples

library(ggpicrust2)

# Create example abundance data (seeded so the example is reproducible)
set.seed(123)
abundance_data <- matrix(rnorm(30), nrow = 3, ncol = 10)
colnames(abundance_data) <- paste0("Sample", 1:10)
rownames(abundance_data) <- c("PathwayA", "PathwayB", "PathwayC")

# Create example metadata
metadata <- data.frame(
  sample_name = paste0("Sample", 1:10),
  group = factor(rep(c("Control", "Treatment"), each = 5))
)

# Basic PCA plot with default colors
pca_plot <- pathway_pca(abundance_data, metadata, "group")

# PCA plot with custom colors
pca_plot <- pathway_pca(
  abundance_data,
  metadata,
  "group",
  colors = c("blue", "red")  # One color per group
)


# Example with real data
data("metacyc_abundance")  # Load example pathway abundance data
data("metadata")          # Load example metadata

# Prepare abundance data: use pathway IDs as row names, then drop that column
abundance_data <- as.data.frame(metacyc_abundance)
rownames(abundance_data) <- abundance_data$pathway
abundance_data <- abundance_data[, names(abundance_data) != "pathway", drop = FALSE]

# Create PCA plot
pathway_pca(
  abundance_data,
  metadata,
  "Environment",
  colors = c("green", "purple")
)



ggpicrust2 documentation built on April 13, 2025, 9:08 a.m.