goglm: Implement the GOglm method for GO enrichment analysis

Description

Gene Ontology (GO) Enrichment Analysis Using Logistic Regression

Usage

goglm(gene_data, cat2genes, n = 5)

Arguments

gene_data

Output from the prepare function, with valid gene identifiers as row names. Its two columns are (1) (transformed) DE test p-values (significance statistics) and (2) (transformed) gene lengths.

cat2genes

A named list. Entry names are GO terms, and each element holds the names of the genes annotated to that term. This mapping is obtained with the getgo and revMap functions.

n

Minimum category size. A category with fewer than n annotated genes is excluded from the final GO ranking list.
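To make the shape of cat2genes and the effect of n concrete, here is a minimal base-R sketch. The gene and GO identifiers are made up for illustration, the names gene2cats_toy, cat2genes_toy, n_min and kept are hypothetical, and the size filter only mimics the documented behaviour; it is not taken from the goglm source.

```r
# Toy gene-to-category mapping (hypothetical identifiers, not real annotations)
gene2cats_toy <- list(
  geneA = c("GO:0000001", "GO:0000002"),
  geneB = c("GO:0000001"),
  geneC = c("GO:0000001", "GO:0000002"),
  geneD = c("GO:0000003")
)

# Reverse the mapping in base R: stack() yields one (category, gene) row per annotation,
# and split() groups the gene names by category
long <- stack(gene2cats_toy)
cat2genes_toy <- split(as.character(long$ind), long$values)

# Assumed filter: keep only categories with at least n_min annotated genes
n_min <- 2
kept <- cat2genes_toy[lengths(cat2genes_toy) >= n_min]
names(kept)
```

In this toy input, the category annotated to a single gene is the only one dropped.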

Details

This is the main function that implements the GOglm method for GO enrichment analysis using logistic regression. Users need to specify three arguments, which are described in the Arguments section above and illustrated in the Examples section below. Two inputs are required to run goglm: a DE test output with DE p-values and gene length information, and a category-to-gene mapping list. In general, the DE test output is obtained with the prepare function, and the mapping list can be obtained by reverse-mapping the results of the getgo function in the goseq package.
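As a rough illustration of the idea for a single category, the sketch below regresses category membership on a significance statistic and a length covariate with base R's glm. Everything in it is an assumption for illustration: the data are simulated, the names toy, sig, len, in_cat and over_p are hypothetical, and the one-sided Wald p-value is only one plausible way to obtain an over-representation p-value. It should not be read as the exact model or test used inside goglm.

```r
set.seed(1)
n_genes <- 200

# Simulated stand-ins for the two columns of gene_data
toy <- data.frame(
  sig = rexp(n_genes),                      # stand-in for transformed DE p-values
  len = rnorm(n_genes, mean = 7, sd = 1)    # stand-in for transformed gene lengths
)

# Hypothetical category whose membership is more likely for more significant genes
toy$in_cat <- rbinom(n_genes, size = 1, prob = plogis(-2 + 0.8 * toy$sig))

# Logistic regression of membership on significance, adjusting for length
fit <- glm(in_cat ~ sig + len, family = binomial, data = toy)
coefs <- summary(fit)$coefficients

# Assumed test: one-sided Wald p-value for a positive coefficient on sig
z <- coefs["sig", "z value"]
over_p <- pnorm(z, lower.tail = FALSE)
over_p
```

Including the length covariate in the same model is what lets the significance effect be assessed after accounting for gene length.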

Value

An object of class goglm, to be passed to summary for more readable results. As the Examples below show, its components are named GOID, over.p, anno and rank.

Author(s)

Gu Mi <mig@stat.oregonstate.edu>, Yanming Di <diy@stat.oregonstate.edu>

References

Mi G, Di Y, Emerson S, Cumbie JS and Chang JH (2012) "Length bias correction in Gene Ontology enrichment analysis using logistic regression", PLOS ONE, 7(10): e46128

See Also

summary.goglm, which summarizes GOglm results and produces more readable output.

Examples

## Load the package and the datasets into the R session:
library(GOglm)
data(ProsCan_DE)
DE_data <- ProsCan_DE
data(ProsCan_Length)
Length_data <- ProsCan_Length

## Prepare a data frame to be passed to goglm():
gene_table <- prepare(DE_data, Length_data, trans.p = "d.log", trans.l = TRUE)

## For illustration, only consider a subset of genes:
gene_data <- gene_table[1:100,1:2]

## Prepare the "category-to-genes" list:
library(goseq)
gene2cats <- getgo(rownames(gene_data), "hg18", "ensGene")
cat2genes <- revMap(gene2cats)

## Run goglm():
res <- goglm(gene_data, cat2genes, n=5)
names(res)  # "GOID"   "over.p" "anno"   "rank"

## For more readable outputs:
output <- cbind(res$over.p, res$anno, res$rank)
rownames(output) <- unfactor(res$GOID)
colnames(output) <- c("over.p", "n.anno", "rank")
head(output)

## For a summary of the GOglm results:
summary(res)

gu-mi/GOglm documentation built on May 14, 2019, 7:42 a.m.