runPSEA: Run Protein Set Enrichment Analysis (PSEA)

View source: R/runPSEA.R

runPSEA R Documentation

Description

This is the main function for running protein set enrichment analysis on a list of proteins and their scores.

Usage

runPSEA(
  protein,
  os.name,
  blist = NULL,
  pexponent = 1,
  nperm = 1000,
  p.adj.method = "fdr",
  sig.level = 0.05,
  minSize = 1
)

Arguments

protein

A dataframe with two columns. The first column should contain the protein accession codes; the second column should contain the scores.
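
As a minimal sketch of the expected input shape, the data frame below pairs accession codes with scores. The column names (AC, score) and all values are hypothetical placeholders, not data from the package.

```r
# Hypothetical input: accession codes and scores are placeholders,
# not real measurements. Column names are an assumption.
protein_scores <- data.frame(
  AC = c("P00001", "P00002", "P00003", "P00004"),
  score = c(2.31, -0.84, 1.12, -1.97),
  stringsAsFactors = FALSE
)

# Check the shape described for the `protein` argument:
# two columns, accession codes first, numeric scores second.
stopifnot(
  ncol(protein_scores) == 2,
  is.character(protein_scores[[1]]),
  is.numeric(protein_scores[[2]])
)
```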

os.name

A character vector of length one with the exact taxonomy name of the species. If you do not know the exact taxonomy name of the species you are working with, please read getTaxonomyName.

blist

The background list for the enrichment analysis. The default value is NULL, in which case the complete set of UniProt reviewed proteins is used as the background. Alternatively, if a vector of UniProt accession codes is provided, it serves as the background list.

pexponent

Enrichment weighting exponent, p. For values of p < 1, one can detect incoherent patterns in a set of proteins. If one expects only a small number of proteins in a large set to be coherent, then p > 1 is a good choice.
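
To illustrate how a weighting exponent can change an enrichment score, here is a rough sketch of a GSEA-style weighted running-sum statistic. The functional form is an assumption and is not taken from the PEIMAN2 source; the function name running_es and the toy data are hypothetical.

```r
# GSEA-style weighted running-sum statistic (assumed form, not the
# package's confirmed implementation).
# scores: numeric vector of protein scores
# in_set: logical vector, TRUE for proteins annotated with the PTM
# p:      weighting exponent
running_es <- function(scores, in_set, p = 1) {
  ord <- order(scores, decreasing = TRUE)   # rank proteins by score
  scores <- scores[ord]
  in_set <- in_set[ord]
  w <- abs(scores)^p * in_set               # weight of each hit
  hit <- cumsum(w) / sum(w)                 # weighted fraction of hits so far
  miss <- cumsum(!in_set) / sum(!in_set)    # fraction of misses so far
  dev <- hit - miss
  dev[which.max(abs(dev))]                  # maximum deviation from zero
}

# Toy data: three set members, two of them at the top of the ranking.
toy_scores <- c(3.0, 2.5, 2.0, 1.0, 0.5, -0.5, -1.0, -2.0)
toy_in_set <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)

es_by_p <- sapply(c(0, 1, 2), function(p) running_es(toy_scores, toy_in_set, p))
names(es_by_p) <- c("p = 0", "p = 1", "p = 2")
es_by_p
```

In this toy ranking, a larger p gives more weight to the two top-scoring members, so the score grows with p.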

nperm

Number of permutations used to estimate the false discovery rate (FDR). Default value is 1000.
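
The role of permutations can be sketched as follows: set membership is shuffled nperm times to build a null distribution of enrichment scores, against which the observed score is compared. This is a generic illustration under assumed definitions (the statistic es_stat, the normalization, and the simulated data are all hypothetical), not the package's code.

```r
set.seed(1)

# Assumed GSEA-style running-sum statistic with exponent 1.
es_stat <- function(scores, in_set) {
  ord <- order(scores, decreasing = TRUE)
  s <- in_set[ord]
  w <- abs(scores[ord]) * s
  dev <- cumsum(w) / sum(w) - cumsum(!s) / sum(!s)
  dev[which.max(abs(dev))]
}

# Simulated scores; the 15 top-ranked proteins form the set,
# so the set is strongly enriched by construction.
scores <- rnorm(200)
in_set <- rep(FALSE, 200)
in_set[order(scores, decreasing = TRUE)[1:15]] <- TRUE

nperm <- 1000
obs <- es_stat(scores, in_set)

# Null distribution: shuffle set membership, keep the set size fixed.
perm <- replicate(nperm, es_stat(scores, sample(in_set)))

# Permuted scores larger than abs(observed ES).
n_more_extreme <- sum(perm > abs(obs))

# Normalize by the mean permuted score of the same sign (assumed definition).
same_sign <- perm[sign(perm) == sign(obs)]
nes <- obs / mean(abs(same_sign))
```

With few permutations the null distribution is coarse, which is why a larger nperm gives more stable estimates.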

p.adj.method

The adjustment method used to correct p-values for multiple testing in enrichment. See p.adjust.methods (a character vector in the stats package) for the list of possible methods.
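
For reference, the available methods and their effect can be inspected with base R alone. The p-values below are made up for illustration.

```r
# Available correction methods (a character vector in the stats package).
p.adjust.methods

# Made-up raw p-values, for illustration only.
raw_p <- c(0.001, 0.012, 0.030, 0.049, 0.200)

# "fdr" is an alias for the Benjamini-Hochberg correction.
adj_fdr <- p.adjust(raw_p, method = "fdr")

# Bonferroni is more conservative.
adj_bonf <- p.adjust(raw_p, method = "bonferroni")

data.frame(raw_p, adj_fdr, adj_bonf)
```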

sig.level

The significance level used to filter PTMs (applied to the adjusted p-value).

minSize

PTMs with the number of proteins below this threshold are excluded.
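
The combined effect of sig.level and minSize can be pictured as a row filter on a results table. The sketch below uses a made-up table with the column names PTM, pvaladj and size; whether the package compares with <= or < at the significance threshold is an assumption here.

```r
# Made-up results table, for illustration only.
toy_res <- data.frame(
  PTM = c("Acetylation", "Phosphoprotein", "Glycoprotein", "Methylation"),
  pvaladj = c(0.003, 0.040, 0.200, 0.010),
  size = c(12, 5, 30, 1),
  stringsAsFactors = FALSE
)

sig.level <- 0.05
minSize <- 2

# Keep PTMs that pass the significance level (assumed inclusive) and
# are carried by at least minSize proteins.
kept <- toy_res[toy_res$pvaladj <= sig.level & toy_res$size >= minSize, ]
kept
```

Here Glycoprotein is dropped for its adjusted p-value and Methylation for its size.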

Value

Returns a list of six elements. The first element is a dataframe with protein set enrichment analysis (PSEA) results, in which every row corresponds to a post-translational modification (PTM) keyword. Its columns are:

  • PTM: PTM keyword

  • pval: p-value obtained from singular enrichment analysis (SEA).

  • pvaladj: adjusted p-value, obtained by applying the p.adj.method correction to the p-values calculated in SEA.

  • FreqinPopulation: The frequency of PTM in UniProt.

  • FreqinSample: The frequency of PTM in the given list.

  • ES: enrichment score.

  • NES: enrichment score normalized to the mean enrichment of random samples of the same size.

  • nMoreExtreme: number of times a permuted sample resulted in a profile with an ES value larger than abs(ES) of the observed sample.

  • size: Number of proteins in the list having this specific PTM.

  • Enrichment: Indicates whether the proteins with the specific PTM are enriched in the list. A positive NES is considered enriched.

  • AC: UniProt accession code (AC) of proteins with the specific PTM.

  • leadingEdge: the leading edge proteins are the proteins that show up in the ranked list at or before the point where the enrichment score (ES) reaches its maximum deviation from zero.
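
The leading-edge definition above can be sketched on a toy ranking. The running-sum form, the protein identifiers, and the scores are all hypothetical, and the sketch only covers a positive enrichment score (peak reached from the top of the ranking).

```r
# Placeholder identifiers and scores, for illustration only.
acs <- paste0("prot", 1:8)
scores <- c(3.0, 2.5, 2.0, 1.0, 0.5, -0.5, -1.0, -2.0)
in_set <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)

# Rank by score and compute an assumed GSEA-style running sum (exponent 1).
ord <- order(scores, decreasing = TRUE)
s <- in_set[ord]
w <- abs(scores[ord]) * s
dev <- cumsum(w) / sum(w) - cumsum(!s) / sum(!s)

# Position where the running sum deviates most from zero.
peak <- which.max(abs(dev))

# Set members ranked at or before the peak form the leading edge.
upto <- seq_len(peak)
leading_edge <- acs[ord][upto][s[upto]]
leading_edge
```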

Examples

# We recommend at least nperm = 1000.
# The number of permutations was reduced to 10
# to accommodate CRAN policy on examples (run time <= 5 seconds).
psea_res <- runPSEA(protein = exmplData2, os.name = 'Rattus norvegicus (Rat)', nperm = 10)

PEIMAN2 documentation built on June 8, 2025, 1:03 p.m.