View source: R/fisherGOProfiles.R
Given two lists of genes, both characterized by their frequencies of annotations
(or "hits") in the same set of GO nodes (also designated as GO terms or GO classes),
determine for each node whether the annotation frequencies depart from what is expected
by chance. The annotation frequencies are specified in the "GO profiles" arguments
pn, qm and pqn0. Both samples may share a common subsample of genes, with GO profile
pqn0. The analysis is based on Fisher's exact test, as implemented by the fisher.test
R function, followed by p-value adjustment for multiple testing based on the function
p.adjust. Usually, this function will be called after a significant result from
compareGOProfiles, which performs global (all GO nodes simultaneously) profile
comparisons (with better type I and type II error control), to identify the more
relevant nodes.
fisherGOProfiles(pn, ...)
## S3 method for class 'numeric'
fisherGOProfiles(pn, qm=NULL, pqn0=NULL,
    n = ngenes(pn), m = ngenes(qm), n0 = ngenes(pqn0),
    method = "BH", simplify=T, expanded=F, ...)
## S3 method for class 'matrix'
fisherGOProfiles(pn, n, m, method = "BH", ...)
## S3 method for class 'BasicGOProfile'
fisherGOProfiles(pn, qm=NULL, pqn0=NULL,
    method = "BH", goIds=T, ...)
## S3 method for class 'ExpandedGOProfile'
fisherGOProfiles(pn, qm=NULL, pqn0=NULL,
    method = "BH", simplify=T, ...)
pn | an object of class |
qm | similarly, an object representing a "sample" GO profile for a fixed ontology |
pqn0 | an object representing a "sample" GO profile for a fixed ontology |
n | the number of genes profiled in pn |
m | the number of genes profiled in qm |
n0 | the number of genes profiled in pqn0 |
method | the p-value adjustment method for multiple comparisons; the same possibilities as in the standard R function p.adjust |
expanded | boolean; are these numeric vectors representing expanded profiles? |
simplify | should the result be simplified, if possible? See the 'Details' section |
goIds | if TRUE, each node is represented by its GO identifier |
... | other arguments (to be passed to |
Given a list of n genes, and a set of s GO classes or nodes X, Y, Z, ... in a given ontology
(BP, MF or CC), its associated ("contracted" or "basic") "profile" is the vector of
absolute frequencies of annotations or hits of the n genes in each one of the s GO nodes.
For a given node, say X, this frequency includes all annotations for X alone, for X and Y,
for X and Z and so on. Thus, as relative frequencies their sum is not necessarily one,
and as absolute frequencies their sum is not necessarily n.
On the other hand, an "expanded profile" corresponds to the relative frequencies
in ALL NODE COMBINATIONS. That is, if n genes have been profiled, the expanded profile
stands for the frequency of all hits EXCLUSIVELY in node X, exclusively in node Y,
exclusively in Z, ..., jointly with all hits simultaneously in nodes X and Y (and only
in X and Y), simultaneously in X and Z, in Y and Z, ..., in X and Y and Z (and only in
X, Y, Z), and so on. Thus, their sum is one.
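The difference between the two kinds of profile can be illustrated with a small base R sketch. The annotation matrix below is invented toy data (five genes, three nodes), and the code only mimics the definitions above; it is not how the package itself builds profiles.

```r
# Toy annotation matrix (hypothetical data): rows are genes, columns are GO nodes.
annot <- matrix(
  c(1, 0, 0,   # g1: X only
    1, 1, 0,   # g2: X and Y
    0, 1, 0,   # g3: Y only
    1, 1, 1,   # g4: X, Y and Z
    0, 0, 1),  # g5: Z only
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("g", 1:5), c("X", "Y", "Z"))
)
n <- nrow(annot)

# Basic (contracted) profile: hits per node, counting every gene annotated there.
basic <- colSums(annot)

# Expanded profile: relative frequency of each exact node combination.
combo <- apply(annot, 1, function(row) {
  paste(colnames(annot)[row == 1], collapse = ".")
})
expanded <- table(combo) / n

sum(basic)     # 8, larger than n = 5 because some genes hit several nodes
sum(expanded)  # 1, because each gene falls in exactly one combination
```

In this toy case every gene has at least one annotation; genes with no hits at all would need separate handling.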
Let n, m and n0 designate the total number of genes profiled in pn, qm and pqn0
respectively. According to these profiles, n[i], m[i] and n0[i] genes are annotated
for node 'i', i = 1, ..., s. Note that the sum of all the n[i] does not
necessarily equal n, and so on.
If not NULL, pqn0 stands for the profile of the n0 genes common to the gene lists
that gave rise to pn and qm.
fisherGOProfiles builds an s x 2 absolute frequencies matrix
GO node 1 | N[1,1] | N[1,2] |
GO node 2 | N[2,1] | N[2,2] |
... | ... | ... |
GO node s | N[s,1] | N[s,2] |
with column totals N1 and N2 (not necessarily equal to the column sums) and performs a Fisher's exact test over each one of the 2x2 tables
GO node i | N[i,1] | N[i,2] |
All nodes except i | N1 - N[i,1] | N2 - N[i,2] |
followed by a p-value correction for multiplicity in testing.
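The node-by-node procedure just described can be sketched with base R alone. The matrix N and the totals N1 and N2 below are invented numbers, and the code is an illustration of the described steps using fisher.test and p.adjust, not the package's actual implementation.

```r
# Hypothetical s x 2 matrix of absolute frequencies (s = 3 nodes).
N <- matrix(c(12,  5,
               7,  9,
               3, 14),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("node1", "node2", "node3"), c("list1", "list2")))
N1 <- 20  # assumed number of genes profiled in the first list
N2 <- 25  # assumed number of genes profiled in the second list

# One Fisher's exact test per node, on the 2x2 table
#   GO node i          | N[i,1]      | N[i,2]
#   All nodes except i | N1 - N[i,1] | N2 - N[i,2]
raw_p <- apply(N, 1, function(row) {
  tab <- rbind(row, c(N1, N2) - row)
  fisher.test(tab)$p.value
})

# Correct for multiplicity, here with the Benjamini-Hochberg ("BH") method.
adj_p <- p.adjust(raw_p, method = "BH")
```

Any other method accepted by p.adjust (for example "holm") could be substituted in the last step.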
If pqn0 is NULL, then the two gene lists do not have any genes in common,
N[i,1] = n[i] and N[i,2] = m[i], and N1 = n, N2 = m, n0 = 0.
Otherwise (if pqn0 is not NULL) N[i,1] = n[i] - n0[i], N1 = n - n0 and
N[i,2] = n0[i], N2 = n0 if qm is NULL, or N[i,2] = m[i], N2 = m if qm is not NULL.
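As a reading aid, the three cases can be written out as a small function. build_counts and its argument names are hypothetical (they do not exist in the package); the function only transcribes the rules stated above, taking per-node counts n_i, m_i, n0_i and gene totals n, m, n0.

```r
# Hypothetical helper (not part of the package): derive the columns of N and
# the totals N1, N2 from per-node counts, following the rules stated above.
build_counts <- function(n_i, n, m_i = NULL, m = NULL, n0_i = NULL, n0 = NULL) {
  if (is.null(n0_i)) {
    # pqn0 is NULL: two lists with no genes in common.
    list(N = cbind(n_i, m_i, deparse.level = 0), N1 = n, N2 = m)
  } else if (is.null(m_i)) {
    # qm is NULL: genes only in pn versus the subsample profiled in pqn0.
    list(N = cbind(n_i - n0_i, n0_i, deparse.level = 0), N1 = n - n0, N2 = n0)
  } else {
    # Neither is NULL: two lists sharing n0 genes.
    list(N = cbind(n_i - n0_i, m_i, deparse.level = 0), N1 = n - n0, N2 = m)
  }
}

# Usage with invented counts: 15 genes, 5 of which form the subsample.
res <- build_counts(n_i = c(10, 6, 4), n = 15, n0_i = c(4, 2, 1), n0 = 5)
res$N   # first column 6, 4, 3; second column 4, 2, 1
res$N1  # 10
res$N2  # 5
```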
In other words, this function provides a general setting for several situations,
common in practice, where a node-by-node analysis is required.
When pqn0 = NULL, two lists with no genes in common are compared.
Otherwise, when qm = NULL, the genes profiled in pn are compared
with a subsample of them, those profiled in pqn0 (a set of genes vs a restricted subset,
e.g. those overexpressed under a disease). Finally, if both arguments qm and pqn0
are not NULL (pn is always required), two gene lists with some genes in common
are analysed.
If both qm and pqn0 are NULL, pn should correspond to an
absolute frequencies matrix with s rows and 2 columns.
The arguments n, m or n0 are only required in case of
numeric vectors or matrices specifying profiles but lacking the 'ngenes' attribute.
A list containing max(ncol(pn), ncol(qm), ncol(pqn0)) numeric vectors of p-values, or a single vector of p-values if max(ncol(pn), ncol(qm), ncol(pqn0)) == 1 and simplify == T.
Jordi Ocana
Sanchez-Pla, A., Salicru, M. and Ocana, J. Statistical methods for the analysis of high-throughput data based on functional profiles derived from the gene ontology. Journal of Statistical Planning and Inference, 2007.
fitGOProfile, compareGOProfiles, equivalentGOProfiles
library(goProfiles)  # provides fisherGOProfiles, basicProfile and prostateIds
require("org.Hs.eg.db")
data(prostateIds)
# To improve speed, use only the first 100 genes:
list1 <- welsh01EntrezIDs[1:100]
list2 <- singh01EntrezIDs[1:100]
# Basic profiles at level 2 of the MF ontology for each list
prof1 <- basicProfile(list1, onto="MF", level=2, orgPackage="org.Hs.eg.db")$MF
prof2 <- basicProfile(list2, onto="MF", level=2, orgPackage="org.Hs.eg.db")$MF
# Profile of the genes common to both lists
commProf <- basicProfile(intersect(list1, list2), onto="MF", level=2, orgPackage="org.Hs.eg.db")$MF
# Node-by-node Fisher tests with Holm adjustment
fisherGOProfiles(prof1, prof2, commProf, method="holm")