RandomSet R Documentation

Testing GO terms, KEGG pathways, and other categories with Random Set method

Description

This function uses the Random Set method to test for enriched biological categories in gene expression data.

Usage

RandomSet(sigvals, geneids, database = "GO", functionalCategories = NULL,
    species = "Hs", min.g = 10, minGenesInCategory = NULL, max.g = NA,
    maxGenesInCategory = NULL, sig.cutoff = 0.1, sigFDR = NULL,
    averageMultipleProbes = TRUE, allGenesInCategoriesAsBackground = TRUE,
    two.sided = FALSE, na.rm = TRUE, verbose = TRUE)
RS(sigvals, geneids, database = "GO", functionalCategories = NULL,
    species = "Hs", min.g = 10, minGenesInCategory = NULL, max.g = NA,
    maxGenesInCategory = NULL, sig.cutoff = 0.1, sigFDR = NULL,
    averageMultipleProbes = TRUE, allGenesInCategoriesAsBackground = FALSE,
    two.sided = FALSE, na.rm = TRUE, verbose = TRUE)

Arguments

sigvals

A vector of p-values, same length and order as "geneids"

geneids

A vector of Entrez gene IDs, may contain duplicates and missing values

min.g

Deprecated. Please use 'minGenesInCategory' instead.

minGenesInCategory

The minimum number of unique analyzed gene IDs a category must contain to be tested. If NULL, it is set to 0.

max.g

Deprecated. Please use 'maxGenesInCategory' instead.

maxGenesInCategory

The maximum number of unique analyzed gene IDs a category may contain to be tested. If NULL, it is set to Inf.

sig.cutoff

Deprecated. Please use 'sigFDR' instead.

sigFDR

Only categories with FDR <= sigFDR are returned. If NULL, it is set to 1, so all tested categories are returned.

database

Deprecated. Please use 'functionalCategories' instead.

functionalCategories

Functional categories to be tested. Current options include "GO", "KEGG", and various other categories; default = "GO". Can be provided by the function getFunctionalCategories().

species

Species used to further specify the database: human = "Hs", mouse = "Mm", rat = "Rn", etc. Default = "Hs".

averageMultipleProbes

If TRUE and a gene ID is represented by multiple probes, the (geometric) mean of their values is computed.

allGenesInCategoriesAsBackground

If TRUE, all genes in the list of functional categories (e.g. "GO") are used as the background gene list, and computations are limited to the intersection of this background and the genes in the "geneids" parameter.

two.sided

If TRUE, the two-sided p-value is computed.

na.rm

If TRUE, potential NAs and NaNs in sigvals are removed before computing the Random Set statistic.

verbose

If TRUE, prints detailed progress output.
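
The handling of duplicate gene IDs and missing values described for 'geneids', 'averageMultipleProbes' and 'na.rm' can be pictured with a small sketch. The helper collapse_probes() below is hypothetical and not part of the package; it only illustrates one way to reduce probe-level p-values to one value per gene ID with a geometric mean, and the function's internal handling may differ.

```r
# Hypothetical helper, not part of the package: collapse probe-level
# p-values to one value per gene ID using the geometric mean.
collapse_probes <- function(sigvals, geneids, na.rm = TRUE) {
  stopifnot(length(sigvals) == length(geneids))
  keep <- !is.na(geneids)                    # drop probes without a gene ID
  if (na.rm) keep <- keep & !is.na(sigvals)  # is.na() is also TRUE for NaN
  sigvals <- sigvals[keep]
  geneids <- as.character(geneids[keep])
  # geometric mean = exp(mean(log(x))), computed within each gene ID
  gm <- tapply(sigvals, geneids, function(x) exp(mean(log(x))))
  data.frame(geneid = names(gm), sigval = as.numeric(gm),
             stringsAsFactors = FALSE)
}

# Toy input: gene "1" has two probes, gene "2" has one usable probe,
# and the last probe has no gene ID.
p   <- c(0.01, 0.04, 0.20, NA, 0.50)
ids <- c("1", "1", "2", "2", NA)
collapsed <- collapse_probes(p, ids)
collapsed
```

The result has one row per gene ID, which is the shape of input a per-category test needs.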

Details

This function uses the Random Set method (Newton et al., 2007) to test for enriched biological categories in gene expression data.
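
As a rough illustration of the idea behind the method, the sketch below computes a random-set z-score for a single category: the mean gene score in the category is compared with the mean and variance expected for a randomly drawn set of the same size, including a finite-population correction. The helper random_set_z() is hypothetical, and the -log10 transformation of the p-values is an assumption made for the example; the transformation and details used inside RandomSet are not stated here.

```r
# Hypothetical helper, not part of the package.
# scores:      one numeric score per background gene (larger = more significant)
# in_category: logical vector marking the genes that belong to the category
random_set_z <- function(scores, in_category) {
  G <- length(scores)   # number of background genes
  m <- sum(in_category) # number of genes in the category
  stopifnot(length(in_category) == G, m > 0, m < G)
  mu     <- mean(scores)                     # expected mean of a random set
  sigma2 <- mean(scores^2) - mu^2            # population variance of the scores
  v      <- sigma2 / m * (G - m) / (G - 1)   # variance of the mean of m genes
                                             # drawn without replacement
  (mean(scores[in_category]) - mu) / sqrt(v)
}

set.seed(1)
p      <- rbeta(200, 0.5, 2)        # simulated p-values
scores <- -log10(p)                 # assumed transformation (see lead-in)
in_cat <- seq_along(p) <= 20        # toy category: the first 20 genes
z <- random_set_z(scores, in_cat)
z
```

A large positive z indicates that the category's genes score higher on average than a random gene set of the same size would.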

Value

A data frame with the following columns:

category ID
category description
n.genes - number of genes overlapping between gene list and category
zScore - the random set z-score
p-value - the corresponding (one-sided or two-sided) p-value
FDR - False Discovery Rate (Benjamini & Hochberg, 1995)
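
The relation between the z-score, p-value and FDR columns can be sketched as follows. The z-scores are made up, and the use of a standard normal reference distribution is an assumption for illustration; p.adjust() with method "BH" is base R's Benjamini-Hochberg adjustment. The column names follow the list above, but the code is not taken from the package.

```r
# Made-up z-scores for three categories
z <- c(catA = 3.1, catB = 0.4, catC = -2.2)

# Assumed normal reference distribution
p_one <- pnorm(z, lower.tail = FALSE)  # one-sided: enrichment only
p_two <- 2 * pnorm(-abs(z))            # two-sided: either direction

# Benjamini-Hochberg FDR on the one-sided p-values
fdr <- p.adjust(p_one, method = "BH")

res <- data.frame(category = names(z), zScore = z,
                  p.value = p_one, FDR = fdr)

# Keep only categories passing the FDR threshold, as 'sigFDR' does
sigFDR <- 0.1
sig <- res[res$FDR <= sigFDR, ]
sig
```

With sigFDR = 1, every tested category passes the filter and the full table is kept.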

Author(s)

Johannes Freudenberg

References

Newton et al. (2007). Random-set methods identify distinct aspects of the enrichment signal in gene-set analysis. Annals of Applied Statistics.

See Also

LRpath, GO.db, KEGG.db, gimmR

Examples

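# Load the example data
data(gimmOut)
# Simulated p-values, one for each gene ID in the example data
p <- rbeta(94, 0.5, 2)
# Test GO and KEGG categories for rat ("Rn")
res <- RandomSet(sigvals=p, geneids=gimmOut$clustData[,1], functionalCategories=c("GO", "KEGG"), species="Rn")
# Inspect the names of the result and the first rows of its first element
names(res)
head(res[[1]])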

uc-bd2k/CLEAN documentation built on Sept. 22, 2022, 4:12 a.m.