This document describes how to extract subsets of genes from existing gene lists by applying filtering conditions such as the fold change, the p-value, or the availability of an Entrez identifier.
In principle, a filtering tool might read the file header and then open an interactive dialog asking for the values to be used when subsetting the list's rows or columns.
In practice, in our work environment, most lists will be extracted from the standard output of our microarray analysis pipeline. (At this point we assume that the user is familiar with standard microarray analysis "a la Bioconductor". If this is not the case, the reader can browse through the slides and examples available at http://eib.stat.ub.edu/Omics+Data+Analysis). These files are generically described as "ExpressionAndTopTables" because they consist of tables having:
* The Gene Symbols and the Entrez Identifiers in the first two columns
* The standard output of the limma software known as "topTable"
* (optionally) the Expression values that have been used to compute the topTable.
Although some types of analysis require only the gene identifiers, others also need the expression values. For this reason these output files contain "all that is needed" for further analyses.
The simplest way to get the data is to load it into an R object.
fileName <- system.file("extdata", "topTables.Rda", package = "geneLists")
load(fileName)
class(AvsB)
colnames(AvsB)
head(AvsB[, 1:7])
The function numGenesChanged gives an idea of how many genes will be recovered under a given p-value filtering criterion. It is a good idea to start with this function to explore the topTables that will be subsetted. This may help in choosing appropriate filters.
require(geneLists)
cbind(numGenesChanged(AvsB, "AvsB"),
      numGenesChanged(AvsL, "AvsL"),
      numGenesChanged(BvsL, "BvsL"))
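The kind of summary that helps when choosing a cutoff can be sketched in base R. This is only an illustration and not the implementation of numGenesChanged, whose exact output format is not described here. The toy table, its values, and the helper countBelow are hypothetical. Only the column names P.Value and adj.P.Val follow the limma topTable convention.

```r
# Toy table with limma-style column names (values are made up)
toyTop <- data.frame(
  logFC     = c(2.1, -1.5, 0.3, 1.2, -0.2),
  P.Value   = c(0.0001, 0.004, 0.20, 0.03, 0.60),
  adj.P.Val = c(0.001, 0.02, 0.40, 0.08, 0.70)
)

# Hypothetical helper: number of p-values strictly below each cutoff
countBelow <- function(p, cutoffs) {
  sapply(cutoffs, function(k) sum(p < k))
}

cutoffs <- c(0.01, 0.05, 0.10)
counts <- rbind(raw      = countBelow(toyTop$P.Value, cutoffs),
                adjusted = countBelow(toyTop$adj.P.Val, cutoffs))
colnames(counts) <- paste0("p<", cutoffs)
print(counts)
```

Comparing the raw and adjusted rows shows how quickly the number of selected genes shrinks once multiple testing is taken into account.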
The functions available in the package allow one to extract simple gene lists.
entrezs_01_up <- genesFromTopTable(AvsB, entrezOnly = TRUE, uniqueIds = TRUE,
                                   adjOrrawP = "adj", Pcutoff = 0.01, updown = "up",
                                   id2Select = "ENTREZ", FCcutoff = 1, cols2Select = 0)
length(entrezs_01_up)
Alternatively, one can extract a subtable consisting of several columns from the original table.
table_01_up <- genesFromTopTable(AvsB, entrezOnly = TRUE, uniqueIds = TRUE,
                                 adjOrrawP = "adj", Pcutoff = 0.01, updown = "up",
                                 id2Select = NULL, FCcutoff = 1, cols2Select = 1:3)
dim(table_01_up)
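To make the filters concrete, the following base R sketch applies comparable conditions to a toy table. It is not the code of genesFromTopTable. In particular, it assumes that the cutoffs are strict inequalities and that FCcutoff is compared against logFC, neither of which is confirmed here. It also removes duplicates with unique(), which keeps the first occurrence rather than the most variable one. The table and its values are hypothetical.

```r
# Toy table (hypothetical values); NA marks a gene without Entrez identifier
toyTop <- data.frame(
  SYMBOL    = c("G1", "G2", "G2", "G3", "G4"),
  ENTREZ    = c("101", "102", "102", NA, "104"),
  logFC     = c(1.8, 2.5, 1.1, 3.0, -2.2),
  adj.P.Val = c(0.004, 0.001, 0.008, 0.002, 0.003),
  stringsAsFactors = FALSE
)

# Assumed filters: has an Entrez id, adjusted p < 0.01, up-regulated with logFC > 1
keep <- !is.na(toyTop$ENTREZ) &
        toyTop$adj.P.Val < 0.01 &
        toyTop$logFC > 1

# Simple list of unique identifiers (first occurrence kept)
upIds <- unique(toyTop$ENTREZ[keep])

# Subtable with the first three columns for the same rows
upTable <- toyTop[keep, 1:3]

print(upIds)
print(dim(upTable))
```

The first result corresponds to asking for identifiers only, the second to asking for a set of columns.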
A typical situation for a user of this package may consist of some or all of the following actions:
* Start from the three topTables (AvsB, AvsL, BvsL).
* Keep only the genes with an Entrez identifier and remove duplicates, keeping only the most variable one.
AvsB0 <- genesFromTopTable(AvsB, entrezOnly = TRUE, uniqueIds = TRUE, adjOrrawP = "adj", Pcutoff = 1)
AvsL0 <- genesFromTopTable(AvsL, entrezOnly = TRUE, uniqueIds = TRUE, adjOrrawP = "adj", Pcutoff = 1)
BvsL0 <- genesFromTopTable(BvsL, entrezOnly = TRUE, uniqueIds = TRUE, adjOrrawP = "adj", Pcutoff = 1)
# Apply an adjusted p-value cutoff of 0.05
AvsB1 <- genesFromTopTable(AvsB, entrezOnly = TRUE, uniqueIds = TRUE, adjOrrawP = "adj", Pcutoff = 0.05)
AvsL1 <- genesFromTopTable(AvsL, entrezOnly = TRUE, uniqueIds = TRUE, adjOrrawP = "adj", Pcutoff = 0.05)
BvsL1 <- genesFromTopTable(BvsL, entrezOnly = TRUE, uniqueIds = TRUE, adjOrrawP = "adj", Pcutoff = 0.05)
# Compare list sizes before and after the p-value filter
cat("AvsB: ", length(AvsB0), "-->", length(AvsB1), "\n")
cat("AvsL: ", length(AvsL0), "-->", length(AvsL1), "\n")
cat("BvsL: ", length(BvsL0), "-->", length(BvsL1), "\n")
# Same cutoff, keeping only the up-regulated genes
AvsB1Up <- genesFromTopTable(AvsB, entrezOnly = TRUE, uniqueIds = TRUE, adjOrrawP = "adj", Pcutoff = 0.05, updown = "up")
AvsL1Up <- genesFromTopTable(AvsL, entrezOnly = TRUE, uniqueIds = TRUE, adjOrrawP = "adj", Pcutoff = 0.05, updown = "up")
BvsL1Up <- genesFromTopTable(BvsL, entrezOnly = TRUE, uniqueIds = TRUE, adjOrrawP = "adj", Pcutoff = 0.05, updown = "up")
cat("AvsB: ", length(AvsB1), "-->", length(AvsB1Up), "\n")
cat("AvsL: ", length(AvsL1), "-->", length(AvsL1Up), "\n")
cat("BvsL: ", length(BvsL1), "-->", length(BvsL1Up), "\n")
# Same cutoff, keeping only the down-regulated genes
AvsB1Down <- genesFromTopTable(AvsB, entrezOnly = TRUE, uniqueIds = TRUE, adjOrrawP = "adj", Pcutoff = 0.05, updown = "down")
AvsL1Down <- genesFromTopTable(AvsL, entrezOnly = TRUE, uniqueIds = TRUE, adjOrrawP = "adj", Pcutoff = 0.05, updown = "down")
BvsL1Down <- genesFromTopTable(BvsL, entrezOnly = TRUE, uniqueIds = TRUE, adjOrrawP = "adj", Pcutoff = 0.05, updown = "down")
cat("AvsB: ", length(AvsB1), "-->", length(AvsB1Down), "\n")
cat("AvsL: ", length(AvsL1), "-->", length(AvsL1Down), "\n")
cat("BvsL: ", length(BvsL1), "-->", length(BvsL1Down), "\n")
# Genes present in both the AvsL and the BvsL lists
commonAvsLandBvsL <- intersect(AvsL0, BvsL0)
length(commonAvsLandBvsL)
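Once the lists are plain vectors of identifiers, other comparisons follow the same pattern with base R set operations. The sketch below uses two hypothetical identifier vectors, listA and listB, standing in for any pair of extracted lists.

```r
# Hypothetical identifier vectors standing in for two extracted gene lists
listA <- c("101", "102", "104", "107")
listB <- c("102", "104", "109")

common <- intersect(listA, listB)  # genes in both lists
onlyA  <- setdiff(listA, listB)    # genes specific to the first list
onlyB  <- setdiff(listB, listA)    # genes specific to the second list
either <- union(listA, listB)      # genes in at least one list

cat("common:", length(common),
    " only A:", length(onlyA),
    " only B:", length(onlyB),
    " union:", length(either), "\n")
```

The three disjoint pieces (common, onlyA, onlyB) add up to the size of the union, which is a quick consistency check on any such comparison.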