gsvaParam-class: 'gsvaParam' class

gsvaParam-class    R Documentation

gsvaParam class

Description

Method-specific parameters for the GSVA method.

Objects of class gsvaParam contain the parameters for running the GSVA method.

Usage

gsvaParam(
  exprData,
  geneSets,
  assay = NA_character_,
  annotation = NA_character_,
  minSize = 1,
  maxSize = Inf,
  kcdf = c("Gaussian", "Poisson", "none"),
  tau = 1,
  maxDiff = TRUE,
  absRanking = FALSE
)

Arguments

exprData

The expression data. Must be one of the classes supported by GsvaExprData. Type help(GsvaExprData) to consult the available classes.

geneSets

The gene sets. Must be one of the classes supported by GsvaGeneSets.

assay

The name of the assay to use in case exprData is a multi-assay container, otherwise ignored. By default, the first assay is used.

annotation

The name of a Bioconductor annotation package for the gene identifiers occurring in the row names of the expression data matrix. This can be used to map gene identifiers occurring in the gene sets if those are provided in a GeneSetCollection. By default, gene identifiers used in the expression data matrix and in the gene sets are matched directly.

minSize

Minimum size of the resulting gene sets after gene identifier mapping. By default, the minimum size is 1.

maxSize

Maximum size of the resulting gene sets after gene identifier mapping. By default, the maximum size is Inf.
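The effect of minSize and maxSize can be sketched in base R. This is only an illustration, assuming that "after gene identifier mapping" means each gene set is first restricted to identifiers present in the expression data; the helper filterBySize and all toy data are hypothetical and not part of GSVA.

```r
## Toy identifiers present in the expression data (hypothetical).
exprGenes <- paste0("g", 1:5)

## Toy gene sets (hypothetical).
geneSets <- list(
  setA = c("g1", "g2", "g3", "g4", "g5"),
  setB = c("g2", "g9"),
  setC = c("g1", "g3", "g6")
)

## Assumed mapping step: keep only identifiers found in the expression data.
mapped <- lapply(geneSets, intersect, exprGenes)

## Hypothetical helper: keep sets whose size lies in [minSize, maxSize].
filterBySize <- function(sets, minSize = 1, maxSize = Inf) {
  sizes <- lengths(sets)
  sets[sizes >= minSize & sizes <= maxSize]
}

kept <- filterBySize(mapped, minSize = 2, maxSize = 4)
names(kept)  # "setC": setA has 5 genes, setB only 1 after mapping
```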

kcdf

Character vector of length 1 denoting the kernel to use during the non-parametric estimation of the cumulative distribution function of expression levels across samples. By default, kcdf="Gaussian" which is suitable when input expression values are continuous, such as microarray fluorescent units in logarithmic scale, RNA-seq log-CPMs, log-RPKMs or log-TPMs. When input expression values are integer counts, such as those derived from RNA-seq experiments, then this argument should be set to kcdf="Poisson".
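As a rough aid for choosing between the two documented kernels, the following base R sketch checks whether a matrix looks like integer counts. The helper suggestKcdf is hypothetical and not part of GSVA; it encodes only the rule of thumb stated above and should not replace knowledge of how the data were produced.

```r
## Hypothetical helper: suggest a kcdf value from the expression values.
suggestKcdf <- function(expr) {
  vals <- as.vector(expr)
  vals <- vals[!is.na(vals)]
  ## Treat the data as counts if all values are non-negative whole numbers.
  isCount <- all(vals >= 0) && all(vals == round(vals))
  if (isCount) "Poisson" else "Gaussian"
}

counts <- matrix(c(0, 12, 5, 130, 7, 0), nrow = 3)  # toy RNA-seq counts
logExpr <- log2(counts + 0.5)                       # toy continuous values

suggestKcdf(counts)   # "Poisson"
suggestKcdf(logExpr)  # "Gaussian"
```

The returned string could then be supplied as the kcdf argument of gsvaParam().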

tau

Numeric vector of length 1. The exponent defining the weight of the tail in the random walk performed by the GSVA (Hänzelmann et al., 2013) method. The default value is 1 as described in the paper.
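To give an intuition for tau, the sketch below assumes that each gene in a gene set contributes a step proportional to its rank score raised to the power tau, normalized so that the steps sum to 1. This is an assumption made for illustration, not a transcription of the GSVA implementation; stepWeights and the rank scores are hypothetical.

```r
## Toy rank scores for the genes of one gene set (hypothetical values).
rankScores <- c(8, 4, 2, 1)

## Hypothetical helper: normalized step sizes |r|^tau / sum(|r|^tau).
stepWeights <- function(r, tau = 1) {
  w <- abs(r)^tau
  w / sum(w)
}

w1 <- stepWeights(rankScores, tau = 1)  # steps proportional to the rank score
w0 <- stepWeights(rankScores, tau = 0)  # equal steps of 0.25 each

w1
w0
```

Under this assumption, larger values of tau give more weight to genes at the extremes of the ranking, while tau = 0 weights all genes in the set equally.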

maxDiff

Logical vector of length 1 that selects one of two approaches to calculating the enrichment statistic (ES) from the KS random walk statistic.

  • FALSE: ES is calculated as the maximum distance of the random walk from 0.

  • TRUE (the default): ES is calculated as the magnitude difference between the largest positive and negative random walk deviations.
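The two options can be illustrated on a toy random walk in base R. The sketch covers only the final summary step as described above; the walk values are hypothetical, and keeping the sign of the deviation when maxDiff=FALSE is an assumption.

```r
## Toy running-sum values of a random walk (hypothetical numbers).
walk <- c(0.25, 0.5, 0.25, -0.25, -0.75, -0.5, 0)

maxPos <- max(walk, 0)  # largest positive deviation: 0.5
maxNeg <- min(walk, 0)  # largest negative deviation: -0.75

## maxDiff = FALSE: the deviation farthest from 0 (sign kept, an assumption).
esMaxDev <- if (maxPos > abs(maxNeg)) maxPos else maxNeg

## maxDiff = TRUE: difference in magnitude between the two deviations.
esMaxDiff <- maxPos - abs(maxNeg)

esMaxDev   # -0.75
esMaxDiff  # -0.25
```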

absRanking

Logical vector of length 1 used only when maxDiff=TRUE. When absRanking=FALSE (default), a modified Kuiper statistic is used to calculate enrichment scores, taking the magnitude difference between the largest positive and negative random walk deviations. When absRanking=TRUE, the original Kuiper statistic, which sums the largest positive and negative random walk deviations, is used. In this latter case, gene sets with genes enriched on either extreme (high or low) will be regarded as 'highly' activated.
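The difference between the two statistics can be seen on a toy random walk. This base R sketch shows only the final summary step described above, with hypothetical walk values; the actual implementation may differ in other respects when absRanking=TRUE.

```r
## Toy running-sum values of a random walk (hypothetical numbers).
walk <- c(0.25, 0.5, 0.25, -0.25, -0.75, -0.5, 0)

maxPos <- max(walk, 0)  # largest positive deviation: 0.5
maxNeg <- min(walk, 0)  # largest negative deviation: -0.75

## absRanking = FALSE: modified Kuiper statistic (difference of magnitudes).
esModified <- maxPos - abs(maxNeg)

## absRanking = TRUE: original Kuiper statistic (sum of magnitudes).
esOriginal <- maxPos + abs(maxNeg)

esModified  # -0.25
esOriginal  # 1.25
```

With the summed form, a gene set whose genes sit at both extremes of the ranking receives a large positive score, whereas the difference form lets the two deviations offset each other.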

Details

In addition to an expression data set and a collection of gene sets, GSVA takes four method-specific parameters. Accordingly, in addition to the two common parameter slots inherited from GsvaMethodParam, this class has slots for these four method-specific parameters, as described below.

Value

A new gsvaParam object.

Slots

kcdf

Character vector of length 1 denoting the kernel to use during the non-parametric estimation of the cumulative distribution function of expression levels across samples. kcdf="Gaussian" is suitable when input expression values are continuous, such as microarray fluorescent units in logarithmic scale, RNA-seq log-CPMs, log-RPKMs or log-TPMs. When input expression values are integer counts, such as those derived from RNA-seq experiments, then this argument should be set to kcdf="Poisson".

tau

Numeric vector of length 1. The exponent defining the weight of the tail in the random walk performed by the GSVA (Hänzelmann et al., 2013) method.

maxDiff

Logical vector of length 1 that selects one of two approaches to calculating the enrichment statistic (ES) from the KS random walk statistic.

  • FALSE: ES is calculated as the maximum distance of the random walk from 0.

  • TRUE: ES is calculated as the magnitude difference between the largest positive and negative random walk deviations.

absRanking

Logical vector of length 1 used only when maxDiff=TRUE. When absRanking=FALSE, a modified Kuiper statistic is used to calculate enrichment scores, taking the magnitude difference between the largest positive and negative random walk deviations. When absRanking=TRUE, the original Kuiper statistic, which sums the largest positive and negative random walk deviations, is used. In this latter case, gene sets with genes enriched on either extreme (high or low) will be regarded as 'highly' activated.

References

Hänzelmann, S., Castelo, R. and Guinney, J. GSVA: Gene set variation analysis for microarray and RNA-Seq data. BMC Bioinformatics, 14:7, 2013.

See Also

GsvaExprData, GsvaGeneSets, GsvaMethodParam, plageParam, zscoreParam, ssgseaParam

Examples

library(GSVA)
library(GSVAdata)

data(leukemia)
data(c2BroadSets)

## for simplicity, use only a subset of the sample data
ses <- leukemia_eset[1:1000, ]
gsc <- c2BroadSets[1:100]
gp1 <- gsvaParam(ses, gsc)
gp1
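A further sketch shows a parameter object built with non-default values. It uses only the arguments listed under Usage and assumes that a plain numeric matrix and a named list of character vectors are among the classes accepted as exprData and geneSets; the toy data are hypothetical.

```r
library(GSVA)

set.seed(1)
## Toy expression matrix: 50 genes x 6 samples of continuous values.
expr <- matrix(rnorm(50 * 6), nrow = 50,
               dimnames = list(paste0("g", 1:50), paste0("s", 1:6)))

## Toy gene sets as a named list of gene identifiers.
gsets <- list(setA = paste0("g", 1:10),
              setB = paste0("g", 11:30),
              setC = paste0("g", 31:33))

## Restrict gene set sizes and state the method-specific parameters explicitly.
gp2 <- gsvaParam(expr, gsets, minSize = 5, maxSize = 25,
                 kcdf = "Gaussian", maxDiff = TRUE, absRanking = FALSE)
gp2
```

With these settings, setC (3 genes) falls below minSize and would be expected to be excluded from the analysis.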



rcastelo/GSVA documentation built on April 29, 2024, 11:26 a.m.