plotGeneStats: Plot gene statistics for clusters of gene sets

View source: R/plottingFunctions.R

plotGeneStats    R Documentation

Plot gene statistics for clusters of gene sets

Description

This function plots gene statistics against gene frequencies for any given cluster of gene sets. The plot can be used to identify genes that are over-represented in a cluster of gene sets (identified based on gene-set overlaps) and have a strong statistic (e.g. log fold-change or p-value).
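To make "gene frequency" concrete, the following base R sketch counts how many gene sets in a toy cluster contain each gene and pairs that count with a statistic. The gene sets, gene names, and values are invented for illustration, and this is not necessarily how plotGeneStats computes its axes internally.

```r
# Toy cluster of gene sets (hypothetical names and members)
geneSets <- list(
  setA = c("TP53", "BRCA1", "MYC"),
  setB = c("TP53", "MYC", "EGFR"),
  setC = c("TP53", "KRAS")
)

# Hypothetical gene-level statistic, e.g. log fold-changes
geneStat <- c(TP53 = 2.1, BRCA1 = -0.4, MYC = 1.3, EGFR = 0.2, KRAS = -1.8)

# Count the number of gene sets in the cluster that contain each gene
geneFreq <- table(unlist(lapply(geneSets, unique)))

# Pair each gene's frequency with its statistic
freqStat <- data.frame(
  gene = names(geneFreq),
  count = as.integer(geneFreq),
  stat = unname(geneStat[names(geneFreq)])
)

# Genes appearing in many sets with a large statistic are the ones of interest
freqStat[order(-freqStat$count), ]
```

Genes that sit high on both the count and the statistic are the candidates the plot is meant to highlight.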

Usage

plotGeneStats(
  geneStat,
  msigGsc,
  groups,
  statName = "Gene-level statistic",
  topN = 5
)

Arguments

geneStat

a named numeric vector containing the statistic to be displayed. The vector must be named with either gene symbols or Entrez IDs, matching the annotation used in msigGsc.
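A named numeric vector of this shape can be built from a results table with base R. The sketch below assumes a hypothetical data frame `deResults` with `symbol` and `logFC` columns; the column names and values are invented for illustration.

```r
# Hypothetical differential expression results (values invented)
deResults <- data.frame(
  symbol = c("TP53", "MYC", "KRAS"),
  logFC = c(2.1, 1.3, -1.8)
)

# Named numeric vector: values are the statistic, names are gene identifiers
geneStat <- setNames(deResults$logFC, deResults$symbol)

# Basic sanity checks before plotting
stopifnot(is.numeric(geneStat), !is.null(names(geneStat)))
stopifnot(!anyDuplicated(names(geneStat)))
```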

msigGsc

a GeneSetCollection object containing gene sets from MSigDB. The GSEABase::getBroadSets() function can be used to parse XML files downloaded from MSigDB.

groups

a named list of character vectors or numeric indices specifying node groupings. Each element of the list represents a group and contains the names (or indices) of the nodes in that group.
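If cluster assignments are available as a named vector (one cluster label per gene set), a list of this form can be derived with base R's split(). The `membership` vector and the gene set names below are hypothetical.

```r
# Hypothetical cluster membership: names are gene set names, values are cluster labels
membership <- c(setA = 1, setB = 1, setC = 2, setD = 2, setE = 2)

# One list element per cluster, each holding the gene set names in that cluster
groups <- split(names(membership), membership)

# Give the groups readable names (g1, g2, ...)
names(groups) <- paste0("g", names(groups))
groups
```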

statName

a character string specifying the name of the statistic.

topN

a numeric value specifying the number of genes to label. The top genes are those with the largest count and statistic.
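This page does not state how count and statistic are combined to choose the top genes. As an illustration only, the sketch below ranks genes by the product of their count rank and absolute-statistic rank. That scoring rule is an assumption made for this example, not the documented behaviour of plotGeneStats, and all values are invented.

```r
# Hypothetical per-gene counts and statistics
count <- c(TP53 = 3, MYC = 2, BRCA1 = 1, EGFR = 1, KRAS = 1)
stat <- c(TP53 = 2.1, MYC = 1.3, BRCA1 = -0.4, EGFR = 0.2, KRAS = -1.8)
topN <- 2

# Assumed scoring rule: larger count and larger |statistic| both raise the score
score <- rank(count) * rank(abs(stat))

# Keep the names of the topN highest-scoring genes
topGenes <- names(sort(score, decreasing = TRUE))[seq_len(min(topN, length(score)))]
topGenes
```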

Value

a ggplot object plotting the gene-level statistic against gene frequencies in the cluster of gene sets.

Examples

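library(GSEABase)
library(vissE)

# load the example gene set collection
data(hgsc)
# define two groups of 25 gene sets each
groups <- list('g1' = names(hgsc)[1:25], 'g2' = names(hgsc)[26:50])

# create random statistics for illustration, named by gene
allgenes <- unique(unlist(geneIds(hgsc)))
gstats <- rnorm(length(allgenes))
names(gstats) <- allgenes

# plot
plotGeneStats(gstats, hgsc, groups)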


DavisLaboratory/vissE documentation built on Jan. 31, 2024, 5:02 a.m.