Computes the enrichment scores and simulated enrichment scores for
each variable and signature.
An important parameter of the function is logScale. Its default
value is TRUE, which means that by default the provided scores (i.e. fold
changes, hazard ratios) will be log scaled. Remember to change this
parameter to FALSE if your scores are already log scaled.
The getEs, getEsSim, getFc, getHr and getFcHr methods can be used to
access each subobject. For more information please visit the man pages
of each method.
It also computes the NES (normalized enrichment score), p values and fdr
(false discovery rate) for all variables and signatures.
For an overview of the output use the summary method.
If the provided gene sets have more than 10 distinct lengths, an approximation of the enrichment score simulations (ESM) is computed. The value of the ESM depends only on the length of the gene set. Therefore we compute the ESM over a grid of possible gene set lengths that is representative of the lengths of the provided gene sets. Then we fit a generalized additive model with cubic splines to predict the NES value based on the length of every gene set. This approach is much faster, which is very useful when the software has to be run over a huge number of gene sets.
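The grid-plus-spline idea described above can be illustrated with a small sketch. This is not the package's internal code: the simplified unweighted enrichment score, the helper random_es, the grid size, the number of replicates and the choice to model the mean absolute simulated ES are all assumptions made for illustration. It relies only on mgcv, whose gam function supports cubic regression splines through s(..., bs = "cr").

```r
library(mgcv)  # recommended package shipped with standard R installations

set.seed(1)
# Toy named scores standing in for fold changes or hazard ratios
scores <- setNames(rnorm(500), paste0("g", 1:500))
ranked <- names(sort(scores, decreasing = TRUE))

# Hypothetical helper: simplified, unweighted ES of a random gene set
# of a given length (an assumption, not the package's ES definition)
random_es <- function(len) {
  hit <- ranked %in% sample(ranked, len)
  running <- cumsum(ifelse(hit, 1 / len, -1 / (length(ranked) - len)))
  running[which.max(abs(running))]
}

# Simulate only over a small grid of gene set lengths
grid_len <- round(seq(10, 200, length.out = 12))
grid <- data.frame(
  len = grid_len,
  mean_abs_es = sapply(grid_len, function(len) {
    mean(abs(replicate(50, random_es(len))))
  })
)

# Cubic regression spline on gene set length
fit <- gam(mean_abs_es ~ s(len, bs = "cr", k = 6), data = grid)

# Predict the simulated quantity for lengths that were never simulated
new_lengths <- c(15, 37, 120)
predicted <- as.numeric(predict(fit, newdata = data.frame(len = new_lengths)))
predicted
```

The saving comes from simulating at 12 lengths instead of at every distinct gene set length; the fitted curve fills in the rest.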
x |
gsets | character or list object containing the names of the genes that belong to each signature.
logScale | whether the values should be log scaled.
absVals | if TRUE, negative fold changes and hazard ratios will be turned into positive values before starting the process. This is useful when genes can go in both directions.
averageRepeats | if x is of class numeric and has repeated names (several measures for some individual names), the measures that share a name can be averaged.
B | number of simulations to perform.
mc.cores | number of processors to use.
test | the test that will be used. 'perm' stands for the permutation based method, 'wilcox' stands for the Wilcoxon test (this is the fastest one) and 'ttperm' stands for the permutation t-test.
p.adjust.method | p adjustment method to be used. Common options are 'BH', 'BY', 'bonferroni' or 'none'. All available options and their explanations can be found on the
pval.comp.method | the p value computation method. Has to be one of 'signed' or 'original'. The default one is 'original'. See details for more information.
pval.smooth.tail | whether to estimate the tail of the distribution where the p values will be generated.
minGenes | gene sets with fewer than minGenes genes will be removed from the analysis.
maxGenes | gene sets with more than maxGenes genes will be removed from the analysis.
center | whether to center the scores (fold changes or hazard ratios). The following will be done: x = x-mean(x).
The following preprocessing was done on the provided scores (i.e. fold
changes, hazard ratios) to avoid errors during the enrichment score
computation:
-When two scores had the same name, their average was used.
-Zeros were removed.
-Scores without names (which cannot be in any signature) were removed.
-Non complete cases (i.e. NAs, NaNs) were removed.
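The preprocessing steps listed above can be sketched in base R for a named numeric vector. The helper preprocess_scores is hypothetical and not part of the package, and the order in which the steps are applied here (unnamed and incomplete values first, then averaging, then zeros) is an assumption.

```r
# Hypothetical helper (not part of the package): mimics the listed
# preprocessing on a named numeric vector of scores.
preprocess_scores <- function(x) {
  if (is.null(names(x))) return(x[0])       # nothing is named
  # Drop scores without a name and non complete cases (NA, NaN)
  keep <- !is.na(names(x)) & names(x) != "" & !is.na(x)
  x <- x[keep]
  # Average scores that share the same name
  avg <- tapply(x, names(x), mean)
  out <- setNames(as.numeric(avg), names(avg))
  # Remove zeros
  out[out != 0]
}

x <- c(1, 3, 0, NA, 2, 5)
names(x) <- c("a", "a", "b", "c", "d", "")
cleaned <- preprocess_scores(x)
cleaned
```

In this toy input the two "a" values are averaged, "b" is dropped as a zero, "c" as an NA and the last value because it has no name.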
The enrichment score (ES) was calculated for each signature and variable (see
references). If the parameter test is 'perm', the signature was
permuted and the ES was recalculated (this happened B times for
each variable, 1000 by default).
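As a rough illustration of the permutation approach, the sketch below computes a simplified, unweighted running-sum enrichment score and recomputes it for random gene sets of the same size. The function es_unweighted is hypothetical; the weighting scheme of the referenced method and the way the package permutes the signature are not reproduced here, so treat both as assumptions.

```r
# Hypothetical helper: unweighted running-sum enrichment score.
# Genes are ranked by score; the running sum goes up at signature
# genes and down elsewhere, and the ES is its largest deviation from 0.
es_unweighted <- function(scores, signature) {
  ranked <- names(sort(scores, decreasing = TRUE))
  hit <- ranked %in% signature
  n_hit <- sum(hit)
  n_miss <- length(ranked) - n_hit
  running <- cumsum(ifelse(hit, 1 / n_hit, -1 / n_miss))
  running[which.max(abs(running))]
}

scores <- setNames(10:1, paste0("g", 1:10))
signature <- c("g1", "g2")
es_obs <- es_unweighted(scores, signature)

# Simulated ES: random gene sets of the same length, B times
set.seed(1)
B <- 100
es_sim <- replicate(B, {
  es_unweighted(scores, sample(names(scores), length(signature)))
})
```

Because both signature genes sit at the top of the ranking, the observed ES reaches its maximum possible value of 1 in this toy example.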
If test is 'wilcox', a Wilcoxon test is performed to test whether
the average value of the genes that belong to the signature is
different from the average value of the genes that do not belong to the
signature.
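The comparison behind the 'wilcox' option can be sketched with wilcox.test from the stats package. The toy scores, the gene names and the artificial shift are assumptions for illustration, and the package may call the test with different options.

```r
set.seed(1)
# Toy named scores; the first 20 genes form the signature and are
# shifted upwards so that the two groups differ
scores <- setNames(rnorm(200), paste0("gene", 1:200))
signature <- paste0("gene", 1:20)
scores[signature] <- scores[signature] + 1

# Genes inside the signature versus genes outside of it
in_sig <- names(scores) %in% signature
wt <- wilcox.test(scores[in_sig], scores[!in_sig])
wt$p.value
```

No simulations are needed for this test, which is consistent with it being described as the fastest option.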
If test is 'ttperm', a permutation t-test will be used.
Take into account that the final plot will be different when 'wilcox' is used.
The simulated enrichment scores and the calculated one are used to find the p value.
P value calculation depends on the parameter pval.comp.method. The
default value is 'original'. In 'original' we simply compute the
proportion of absolute simulated ES which are larger than the observed
absolute ES. In 'signed' we compute the proportion of simulated ES which
are larger than the observed ES (in case of having a positive enrichment
score) and the proportion of simulated ES which are smaller than the
observed ES (in case of having a negative enrichment score).
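The two p value definitions can be written down directly. The function names pval_original and pval_signed are hypothetical, and treating an observed ES of exactly zero as positive is an assumption; tail smoothing (pval.smooth.tail) is not covered.

```r
# 'original': proportion of absolute simulated ES larger than the
# absolute observed ES
pval_original <- function(es, es_sim) {
  mean(abs(es_sim) > abs(es))
}

# 'signed': compare only against the tail on the side of the observed ES
pval_signed <- function(es, es_sim) {
  if (es >= 0) mean(es_sim > es) else mean(es_sim < es)
}

es_sim <- c(-0.5, -0.2, 0.1, 0.3, 0.6)
p_orig <- pval_original(0.4, es_sim)   # 2 of 5 absolute values exceed 0.4
p_pos  <- pval_signed(0.4, es_sim)     # 1 of 5 simulated ES exceeds 0.4
p_neg  <- pval_signed(-0.4, es_sim)    # 1 of 5 simulated ES is below -0.4
```

With the same simulations, 'signed' looks at one side only, so the two methods generally give different p values.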
Evarist Planet
Aravind Subramanian, (October 25, 2005) Gene Set Enrichment Analysis. www.pnas.org/cgi/doi/10.1073/pnas.0506580102
C.A. Tsai and J.J. Chen. Kernel estimation for adjusted p-values in multiple testing. Computational Statistics & Data Analysis. http://econpapers.repec.org/article/eeecsdana/v_3a51_3ay_3a2007_3ai_3a8_3ap_3a3885-3897.htm
gsea.go, gsea.kegg
#load the package and the example epheno object
library(phenoTest)
data(epheno)
epheno
#we construct two signatures from randomly chosen feature names
sign1 <- sample(featureNames(epheno))[1:20]
sign2 <- sample(featureNames(epheno))[50:75]
mySignature <- list(sign1,sign2)
names(mySignature) <- c('My first signature','My preferred signature')
#run gsea functions
gseaData <- gsea(x=epheno,gsets=mySignature,B=100,mc.cores=1)
my.summary <- summary(gseaData)
my.summary
#plot(gseaData)