Description Usage Arguments Details Value Author(s) References See Also Examples
Estimates GSVA enrichment scores.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 | ## S4 method for signature 'SummarizedExperiment,GeneSetCollection'
gsva(expr, gset.idx.list, annotation,
method=c("gsva", "ssgsea", "zscore", "plage"),
kcdf=c("Gaussian", "Poisson", "none"),
abs.ranking=FALSE,
min.sz=1,
max.sz=Inf,
parallel.sz=1L,
mx.diff=TRUE,
tau=switch(method, gsva=1, ssgsea=0.25, NA),
ssgsea.norm=TRUE,
verbose=TRUE,
BPPARAM=SerialParam(progressbar=verbose))
## S4 method for signature 'SummarizedExperiment,list'
gsva(expr, gset.idx.list, annotation,
method=c("gsva", "ssgsea", "zscore", "plage"),
kcdf=c("Gaussian", "Poisson", "none"),
abs.ranking=FALSE,
min.sz=1,
max.sz=Inf,
parallel.sz=1L,
mx.diff=TRUE,
tau=switch(method, gsva=1, ssgsea=0.25, NA),
ssgsea.norm=TRUE,
verbose=TRUE,
BPPARAM=SerialParam(progressbar=verbose))
## S4 method for signature 'ExpressionSet,list'
gsva(expr, gset.idx.list, annotation,
method=c("gsva", "ssgsea", "zscore", "plage"),
kcdf=c("Gaussian", "Poisson", "none"),
abs.ranking=FALSE,
min.sz=1,
max.sz=Inf,
parallel.sz=1L,
mx.diff=TRUE,
tau=switch(method, gsva=1, ssgsea=0.25, NA),
ssgsea.norm=TRUE,
verbose=TRUE,
BPPARAM=SerialParam(progressbar=verbose))
## S4 method for signature 'ExpressionSet,GeneSetCollection'
gsva(expr, gset.idx.list, annotation,
method=c("gsva", "ssgsea", "zscore", "plage"),
kcdf=c("Gaussian", "Poisson", "none"),
abs.ranking=FALSE,
min.sz=1,
max.sz=Inf,
parallel.sz=1L,
mx.diff=TRUE,
tau=switch(method, gsva=1, ssgsea=0.25, NA),
ssgsea.norm=TRUE,
verbose=TRUE,
BPPARAM=SerialParam(progressbar=verbose))
## S4 method for signature 'matrix,GeneSetCollection'
gsva(expr, gset.idx.list, annotation,
method=c("gsva", "ssgsea", "zscore", "plage"),
kcdf=c("Gaussian", "Poisson", "none"),
abs.ranking=FALSE,
min.sz=1,
max.sz=Inf,
parallel.sz=1L,
mx.diff=TRUE,
tau=switch(method, gsva=1, ssgsea=0.25, NA),
ssgsea.norm=TRUE,
verbose=TRUE,
BPPARAM=SerialParam(progressbar=verbose))
## S4 method for signature 'matrix,list'
gsva(expr, gset.idx.list, annotation,
method=c("gsva", "ssgsea", "zscore", "plage"),
kcdf=c("Gaussian", "Poisson", "none"),
abs.ranking=FALSE,
min.sz=1,
max.sz=Inf,
parallel.sz=1L,
mx.diff=TRUE,
tau=switch(method, gsva=1, ssgsea=0.25, NA),
ssgsea.norm=TRUE,
verbose=TRUE,
BPPARAM=SerialParam(progressbar=verbose))
|
expr |
Gene expression data which can be given either as a
|
gset.idx.list |
Gene sets provided either as a |
annotation |
In the case of calling |
method |
Method to employ in the estimation of gene-set enrichment scores per sample. By default
this is set to |
kcdf |
Character string denoting the kernel to use during the non-parametric estimation of the
cumulative distribution function of expression levels across samples when |
abs.ranking |
Flag used only when |
min.sz |
Minimum size of the resulting gene sets. |
max.sz |
Maximum size of the resulting gene sets. |
parallel.sz |
Number of threads of execution to use when doing the calculations in parallel.
The argument |
mx.diff |
Offers two approaches to calculate the enrichment statistic (ES)
from the KS random walk statistic. |
tau |
Exponent defining the weight of the tail in the random walk performed by both the |
ssgsea.norm |
Logical, set to |
verbose |
Gives information about each calculation step. Default: |
BPPARAM |
An object of class |
GSVA assesses the relative enrichment of gene sets across samples using a non-parametric approach. Conceptually, GSVA transforms a p-gene by n-sample gene expression matrix into a g-geneset by n-sample pathway enrichment matrix. This facilitates many forms of statistical analysis in the 'space' of pathways rather than genes, providing a higher level of interpretability.
By default, gsva()
will try to match the identifiers in expr
to
the identifiers in gset.idx.list
just as they are, unless the annotation
argument is set.
The gsva()
function first maps the identifiers in the gene sets in
gset.idx.list
to the identifiers in the input expression data expr
.
When the input gene sets in gset.idx.list
is provided as a list
object, gsva()
will try to match the identifiers in expr
directly
to the identifiers in gset.idx.list
just as they are. Because unmatching
identifiers will be discarded in both, expr
and gset.idx.list
,
gsva()
may prompt an error if no identifiers can be matched as in the case
of different types of identifiers (e.g., gene symbols vs Entrez identitifers).
However, then the input gene sets in gset.idx.list
is provided as a
GeneSetCollection
object, gsva()
will try to automatically convert
those identifiers to the type of identifier in the input expression data expr
.
Such an automatic conversion, however, will only occur in three scenarios: 1. when
expr
is an ExpressionSet
object with an appropriately set
annotation
slot; 2. when expr
is a SummarizedExperiment
object
with an appropriately set annotation
slot in the metadata of expr
;
3. when expr
is a matrix
and the annotation
argument of the
gsva()
function is set to the name of the annotation package that provides
the relationships between the type of identifiers in expr
and gset.idx.list
.
The collection of gene sets resulting from the previous identifier matching,
can be further filtered to require a minimun and/or maximum size by using the
arguments min.sz
and max.sz
.
A gene-set by sample matrix of GSVA enrichment scores.
J. Guinney and R. Castelo
Barbie, D.A. et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature, 462(5):108-112, 2009.
Hänzelmann, S., Castelo, R. and Guinney, J. GSVA: Gene set variation analysis for microarray and RNA-Seq data. BMC Bioinformatics, 14:7, 2013.
Lee, E. et al. Inferring pathway activity toward precise disease classification. PLoS Comp Biol, 4(11):e1000217, 2008.
Tomfohr, J. et al. Pathway level analysis of gene expression using singular value decomposition. BMC Bioinformatics, 6:225, 2005.
filterGeneSets
computeGeneSetsOverlap
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 | library(limma)
p <- 10 ## number of genes
n <- 30 ## number of samples
nGrp1 <- 15 ## number of samples in group 1
nGrp2 <- n - nGrp1 ## number of samples in group 2
## consider three disjoint gene sets
geneSets <- list(set1=paste("g", 1:3, sep=""),
set2=paste("g", 4:6, sep=""),
set3=paste("g", 7:10, sep=""))
## sample data from a normal distribution with mean 0 and st.dev. 1
y <- matrix(rnorm(n*p), nrow=p, ncol=n,
dimnames=list(paste("g", 1:p, sep="") , paste("s", 1:n, sep="")))
## genes in set1 are expressed at higher levels in the last 'nGrp1+1' to 'n' samples
y[geneSets$set1, (nGrp1+1):n] <- y[geneSets$set1, (nGrp1+1):n] + 2
## build design matrix
design <- cbind(sampleGroup1=1, sampleGroup2vs1=c(rep(0, nGrp1), rep(1, nGrp2)))
## fit linear model
fit <- lmFit(y, design)
## estimate moderated t-statistics
fit <- eBayes(fit)
## genes in set1 are differentially expressed
topTable(fit, coef="sampleGroup2vs1")
## estimate GSVA enrichment scores for the three sets
gsva_es <- gsva(y, geneSets, mx.diff=1)
## fit the same linear model now to the GSVA enrichment scores
fit <- lmFit(gsva_es, design)
## estimate moderated t-statistics
fit <- eBayes(fit)
## set1 is differentially expressed
topTable(fit, coef="sampleGroup2vs1")
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.