run_GSEA_from_signature: Run GSEA from a DE Results Data Frame and a Gene Signature

View source: R/GSEA.R

run_GSEA_from_signature    R Documentation

Run GSEA from a DE Results Data Frame and a Gene Signature

Description

This function performs Gene Set Enrichment Analysis (GSEA) using a differential expression (DE) data frame and a gene signature data frame. The gene signature is split into up-regulated and down-regulated gene sets based on the signcon column. The function returns both the GSEA results object and a sorted, named vector of log2FoldChange values.

Usage

run_GSEA_from_signature(
  de_df,
  signature_df,
  minGSSize = 5,
  maxGSSize = 500,
  pvalueCutoff = 0.05
)

Arguments

de_df

A data frame containing DE results with at least two columns: gene (gene symbol) and log2FoldChange (log2 fold-change values).

signature_df

A data frame containing the gene signature. It must include at least the columns SYMBOL (gene symbol) and signcon (a numeric value where positive indicates up-regulation and negative indicates down-regulation).

minGSSize

An integer specifying the minimum gene set size to test in GSEA. Default is 5.

maxGSSize

An integer specifying the maximum gene set size to test in GSEA. Default is 500.

pvalueCutoff

A numeric value defining the p-value cutoff for GSEA significance. Default is 0.05.
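The column requirements above can be checked before calling the function. The following is a minimal sketch; check_required_columns is a hypothetical helper introduced here for illustration and is not part of scCustFx, and the gene names and values are made-up toy data.

```r
# Hypothetical helper (not part of scCustFx): stop early when a required
# column is absent from a data frame.
check_required_columns <- function(df, required) {
  missing_cols <- setdiff(required, colnames(df))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy inputs (made-up values) with the columns the documentation asks for.
de_df <- data.frame(
  gene = c("GENE1", "GENE2", "GENE3", "GENE4"),
  log2FoldChange = c(2.1, -1.4, 0.3, -0.8),
  stringsAsFactors = FALSE
)
signature_df <- data.frame(
  SYMBOL = c("GENE1", "GENE3", "GENE2", "GENE4"),
  signcon = c(1, 1, -1, -1),
  stringsAsFactors = FALSE
)

check_required_columns(de_df, c("gene", "log2FoldChange"))
check_required_columns(signature_df, c("SYMBOL", "signcon"))
```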

Details

The function splits signature_df into two groups based on the signcon value: genes with signcon > 0 (Signature_Up) and genes with signcon < 0 (Signature_Down). It builds a TERM2GENE data frame from these two groups, then prepares a ranked gene vector from de_df, ensuring that no gene symbol is NA or duplicated. Finally, it performs GSEA on the ranked vector using the clusterProfiler package.
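The steps described above can be sketched in base R as follows. This is an illustration written from the description, not the package source. The helper names (build_term2gene, build_ranked_genes, run_gsea_sketch) are hypothetical. Keeping the first occurrence of a duplicated gene symbol is an assumption, since the documentation does not say how duplicates are resolved. Under the strict inequalities as described, a gene with signcon equal to 0 falls into neither set.

```r
# Sketch only: hypothetical helpers that mirror the documented steps.

# Step 1: split the signature into up and down sets and build TERM2GENE.
build_term2gene <- function(signature_df) {
  ok <- !is.na(signature_df$SYMBOL) & !is.na(signature_df$signcon)
  up <- unique(signature_df$SYMBOL[ok & signature_df$signcon > 0])
  down <- unique(signature_df$SYMBOL[ok & signature_df$signcon < 0])
  data.frame(
    term = c(rep("Signature_Up", length(up)),
             rep("Signature_Down", length(down))),
    gene = c(up, down),
    stringsAsFactors = FALSE
  )
}

# Step 2: build a named log2FoldChange vector, sorted in decreasing order.
build_ranked_genes <- function(de_df) {
  keep <- !is.na(de_df$gene) & !is.na(de_df$log2FoldChange)
  de_df <- de_df[keep, , drop = FALSE]
  # Assumption: keep the first row for each duplicated gene symbol.
  de_df <- de_df[!duplicated(de_df$gene), , drop = FALSE]
  ranked <- de_df$log2FoldChange
  names(ranked) <- de_df$gene
  sort(ranked, decreasing = TRUE)
}

# Step 3: run GSEA. Defined but not called here; calling it requires
# the clusterProfiler package to be installed.
run_gsea_sketch <- function(de_df, signature_df, minGSSize = 5,
                            maxGSSize = 500, pvalueCutoff = 0.05) {
  term2gene <- build_term2gene(signature_df)
  ranked_genes <- build_ranked_genes(de_df)
  gsea_result <- clusterProfiler::GSEA(
    geneList = ranked_genes,
    TERM2GENE = term2gene,
    minGSSize = minGSSize,
    maxGSSize = maxGSSize,
    pvalueCutoff = pvalueCutoff
  )
  list(gsea_result = gsea_result, ranked_genes = ranked_genes)
}
```

The two preparation helpers depend only on base R, so the TERM2GENE table and the ranked vector can be inspected on their own before any enrichment is run.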

Value

A list with two components:

gsea_result

An object containing the GSEA results from clusterProfiler::GSEA.

ranked_genes

A numeric vector of log2 fold-change values, named by gene symbol and sorted in decreasing order.
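One way to inspect the returned list is to convert gsea_result to a data frame and keep a few summary columns. The sketch below assumes that as.data.frame() works on the clusterProfiler result object and that it exposes columns such as ID, setSize, NES, pvalue and p.adjust; summarise_gsea is a hypothetical helper, not part of scCustFx, and it silently skips any of those columns that are absent.

```r
# Hypothetical helper (not part of scCustFx): tabulate key GSEA statistics
# from the list returned by run_GSEA_from_signature().
summarise_gsea <- function(results) {
  res_df <- as.data.frame(results$gsea_result)
  # Assumed column names; only those actually present are kept.
  wanted <- c("ID", "setSize", "NES", "pvalue", "p.adjust")
  res_df[, intersect(wanted, colnames(res_df)), drop = FALSE]
}

# Usage, given `results` from run_GSEA_from_signature():
#   summarise_gsea(results)
#   head(results$ranked_genes)  # most up-regulated genes first
```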

Examples

## Not run: 
# Read in your DE results and signature file
de_df <- read.csv("path/to/DE_results.csv")
signature_df <- read.csv("path/to/signature.csv")

# Run GSEA
results <- run_GSEA_from_signature(de_df, signature_df)

# Access the GSEA result and ranked genes
gsea_result <- results$gsea_result
ranked_genes <- results$ranked_genes

# Plot the enrichment for one gene set
library(enrichplot)
gseaplot2(gsea_result, geneSetID = "Signature_Down", title = "GSEA Plot: Signature_Down")

## End(Not run)


eisascience/scCustFx documentation built on June 2, 2025, 3:59 a.m.