LRpath | R Documentation |
This function uses logistic regression to test for enriched biological categories in gene expression data. Our method models the probability of a randomly selected gene belonging to a specific category given the significance level of that gene. For categories significantly affected by the experimental condition, this probability will increase as the significance statistic increases. Categories with significant p-values and positive slope coefficients are enriched with differentially expressed genes.
LRpath(sigvals, geneids, min.g=10, max.g=NA, sig.cutoff=0.05, database="GO", functionalCategories=NULL,
odds.min.max=c(0.001,0.5), species="Hs")
sigvals |
A vector of p-values, same length and order as "geneids" |
geneids |
A vector of Entrez gene IDs; may contain duplicates and missing values |
min.g |
The minimum number of unique analyzed gene IDs a category must contain to be tested, default = 10 |
max.g |
The maximum number of unique analyzed gene IDs a category may contain to be tested, default = NA (treated as 99999) |
sig.cutoff |
Entrez gene IDs in each category with p-values < sig.cutoff are returned in the sig.genes column, default = 0.05 |
database |
Deprecated. Please use 'functionalCategories' instead. |
functionalCategories |
Functional categories to be tested. Current options include "GO", "KEGG" and various other categories, default = "GO". Can be provided by the function getFunctionalCategories(). |
odds.min.max |
Lower and upper p-values to be used for the odds ratio calculation, default = c(0.001, 0.5) |
species |
Species used to further specify the database: human = "Hs", mouse = "Mm", rat = "Rn", etc. Default = "Hs". |
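One plausible reading of odds.min.max is that the odds ratio compares the odds of category membership at the lower and upper p-values, measured on the -log(p) scale. The sketch below works through that arithmetic. It is an assumption about how the two values could be used, not the documented implementation, and the slope value is made up for illustration.

```r
# Hypothetical slope from a logistic regression of category membership
# on -log(p-value); not taken from any real LRpath run.
slope <- 0.2

# Default lower and upper p-values from the argument list above
odds.min.max <- c(0.001, 0.5)

# Distance between the two p-values on the -log scale (assumed statistic)
delta <- -log(odds.min.max[1]) - (-log(odds.min.max[2]))

# Odds ratio implied by the slope over that distance
odds_ratio <- exp(slope * delta)
print(odds_ratio)
```

Under this reading, a wider gap between the two p-values gives a larger odds ratio for the same positive slope.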
LRpath: testing GO terms or KEGG with logistic regression. Written by: Maureen Sartor, University of Cincinnati, 2008.
Please acknowledge your use of LRpath in publications by referencing the Sartor et al. (2009) paper.
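The idea of modelling category membership as a function of gene significance can be illustrated for a single category with base R's glm. This is a minimal sketch on simulated data, assuming -log(p-value) as the significance statistic; all object names are hypothetical and it is not the LRpath source code.

```r
# Simulated illustration; none of these objects come from LRpath itself.
set.seed(1)
n <- 2000

# 1 = gene belongs to the category, 0 = it does not
in_category <- rbinom(n, 1, 0.05)

# Category members get smaller p-values on average (Beta(0.3, 2) vs. uniform)
pvals <- ifelse(in_category == 1, rbeta(n, 0.3, 2), runif(n))

# Assumed significance statistic: larger means more significant
stat <- -log(pvals)

# Logistic regression of membership on the significance statistic
fit <- glm(in_category ~ stat, family = binomial)
slope   <- coef(summary(fit))["stat", "Estimate"]
p_value <- coef(summary(fit))["stat", "Pr(>|z|)"]
```

A positive slope with a small p_value corresponds to the enriched case: the probability of belonging to the category rises as genes become more significant.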
Object is a data frame with the following columns:
GO or KEGG ID - posterior t-value for IBMT
GO or KEGG term - name of category
Ontology (only for GO) - BP, MF, or CC
n.genes - number of unique Entrez Gene IDs in category
coeff - slope coefficient (only categories with positive values are enriched)
odds.ratio - odds ratio, as a measure of strength of enrichment
p.value - p-value for the test that the slope differs from zero (i.e. that the term is enriched)
FDR - False Discovery Rate (Benjamini & Hochberg, 1995)
sig.genes - comma-separated Entrez gene IDs in category with p-value < "sig.cutoff"
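As a rough illustration of how the coeff, p.value and FDR columns relate, the sketch below builds a small made-up table, computes Benjamini-Hochberg adjusted p-values with base R's p.adjust, and keeps the rows that would count as enriched. The values and the 0.05 FDR threshold are assumptions for the example, not LRpath output.

```r
# Hypothetical result table using the column names listed above
res <- data.frame(
  coeff   = c(0.31, -0.12, 0.08, 0.25),
  p.value = c(0.0004, 0.01, 0.40, 0.03)
)

# Benjamini-Hochberg false discovery rate
res$FDR <- p.adjust(res$p.value, method = "BH")

# Enriched categories: positive slope and FDR below an assumed threshold
enriched <- res[res$coeff > 0 & res$FDR < 0.05, ]
print(enriched)
```

The second row is dropped despite its small p-value because its slope is negative.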
Maureen Sartor
Sartor MA, Leikauf GD, Medvedovic M. 2009. LRpath: A logistic regression approach for identifying enriched biological groups in gene expression data. Bioinformatics 25(2):211-7.
glm, GO.db, KEGG.db, gimmR
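# Load the example data set gimmOut
data(gimmOut)
# Simulated p-values for illustration only, one per gene ID
p <- rbeta(94, 0.5, 2)
# Test GO categories for rat ("Rn") genes
LRpath(sigvals=p, geneids=gimmOut$clustData[,1], functionalCategories="GO", species="Rn")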