evaluateGeneSetUncertainty: Quantify gene set uncertainty
In GiANT: Gene Set Uncertainty in Enrichment Analysis

Description

A robustness measure that quantifies the uncertainty of a gene set by performing a resampling experiment. It can be used in the `robustness` parameter of `gsAnalysis`.

Usage

evaluateGeneSetUncertainty(
	...,
	dat,
	geneSet,
	analysis,
	numSamplesUncertainty,
	blockSize = 1,
	k = seq(0.01, 0.99, by=0.01),
	signLevel = 0.05,
	preprocessGeneSet = FALSE,
	cluster = NULL)

Arguments

`...`	Additional parameters for the different steps of the analysis pipeline, depending on the concrete configuration supplied in `analysis`.
`dat`	A numeric matrix of gene expression values for all analyzed genes. Here, each row corresponds to one gene, and each column corresponds to one sample. The rows must be named with the gene names used in the gene sets.
`geneSet`	A vector containing the names of the genes in a gene set. All genes in the set must correspond to row names of `dat`.
`analysis`	The parameters of the analysis that is applied to the perturbed copies of the gene set. These parameters are described by an object of class `gsAnalysis` as returned by the function `gsAnalysis` or the predefined analysis descriptors in `predefinedAnalyses`.
`numSamplesUncertainty`	The number of resampling experiments which should be applied to estimate the robustness of `geneSet`.
`blockSize`	Number of genes in one resampled block.
`k`	A `vector` of percentages, given as fractions between 0 and 1 (as in the default `seq(0.01, 0.99, by=0.01)`), of genes in the randomized gene sets that should be taken from the original gene set. The remaining genes are chosen randomly. A separate resampling experiment is performed for each value.
`signLevel`	The significance level for the significance assessment of the gene sets (defaults to `0.05`).
`preprocessGeneSet`	Specifies whether the gene set in `geneSet` should be preprocessed or not. If set to `TRUE`, all genes that are not part of the data set (i.e. not in `rownames(dat)`) are removed from the gene set.
`cluster`	If the analyses should be applied in parallel for the different values of `k`, this parameter must hold an initialized cluster as returned by `makeCluster`. If this parameter is `NULL`, the analyses are performed sequentially.
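
The `cluster` argument expects an already initialized cluster. The following is a minimal sketch of creating and releasing one, assuming the `makeCluster` mentioned above is the one from the `parallel` package. The worker count of 2 and the toy `parSapply` call are illustrative choices and are not part of GiANT.

```r
library(parallel)

# Start a small local cluster; two workers is an arbitrary choice for illustration
cl <- makeCluster(2)

# Toy check that the workers respond (this is not a GiANT call)
workerResult <- parSapply(cl, 1:4, function(i) i^2)

# The same object would be supplied as `cluster = cl`
# in the call to evaluateGeneSetUncertainty

# Release the workers when the analysis is finished
stopCluster(cl)
```

Stopping the cluster explicitly avoids leaving idle worker processes behind after the analysis.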

Details

The uncertainty analysis repeatedly replaces parts of the original gene set with random genes and calculates the gene set statistic for each of these randomized gene sets. This yields a distribution of gene set statistic values for slightly modified variants of the original gene set.
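
The resampling idea can be sketched in base R as follows. This is an illustration under stated assumptions, not GiANT's implementation: `perturbGeneSet` and `gss` are hypothetical helpers, the replacement genes are assumed to be drawn from genes outside the original set, the data are simulated, and `blockSize` is ignored.

```r
# Hypothetical helper, not part of GiANT: build one perturbed copy of a gene set.
# A fraction k of the genes is kept from the original set; the remaining
# positions are filled with genes drawn at random from outside the set.
perturbGeneSet <- function(geneSet, allGenes, k) {
  nKeep <- round(k * length(geneSet))
  kept <- sample(geneSet, nKeep)
  candidates <- setdiff(allGenes, geneSet)
  replaced <- sample(candidates, length(geneSet) - nKeep)
  c(kept, replaced)
}

set.seed(1)

# Simulated expression matrix: 50 genes (rows) x 6 samples (columns)
dat <- matrix(rnorm(300), nrow = 50,
              dimnames = list(paste0("gene", 1:50), NULL))
geneSet <- paste0("gene", 1:10)

# Toy gene set statistic (assumption): mean absolute expression over the set
gss <- function(genes) mean(abs(dat[genes, ]))

# Distribution of the statistic for 100 sets that keep 80% of the original genes
gssValues <- replicate(100,
                       gss(perturbGeneSet(geneSet, rownames(dat), k = 0.8)))
summary(gssValues)
```

Repeating the last step for every value in `k` gives one such distribution per fraction of retained genes.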

Value

Returns a list (of class uncertaintyResult) with the following elements:

`uncertainty`	The calculated stability of the original gene set.
`confidenceValues`	A matrix of quantiles of `gssValues` (signLevel, 0.5, 1-signLevel). One row for each value in `k`.
`uncertaintyEvaluations`	A list with one entry per value in `k`, each containing the following elements:
	Quantiles of `gssValues`: `signLevel`, `0.5`, `1-signLevel`.
	`gssValues`: A vector of gene set statistic values, one for each randomly sampled gene set.
	`uncertainGeneSets`: A matrix containing all partially random gene sets.
	`k`: The percentage of genes in the randomized gene sets taken from the original gene set.
`signLevel`	The significance level used for this analysis.
`originalGeneSetValues`	Result of `geneSetAnalysis` for the original `geneSet`.
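
The three quantiles reported in `confidenceValues` can be illustrated with base R's `quantile`. This is a sketch only: the values below are simulated stand-ins for `gssValues`, and whether GiANT uses the default quantile type is an assumption.

```r
set.seed(42)
signLevel <- 0.05

# Simulated statistic values standing in for gssValues (not real GiANT output)
gssValues <- rnorm(1000)

# Lower bound, median and upper bound at the chosen significance level
confidence <- quantile(gssValues, probs = c(signLevel, 0.5, 1 - signLevel))
confidence
```

With one such triple per value in `k`, stacking the triples row by row gives a matrix shaped like `confidenceValues`.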

Examples

library(GiANT)
data(exampleData)

# estimate the uncertainty of the first example gene set
res <- evaluateGeneSetUncertainty(
	# parameters for evaluateGeneSetUncertainty
	dat = countdata,
	geneSet = pathways[[1]],
	analysis = analysis.averageCorrelation(),
	numSamplesUncertainty = 10,
	k = seq(0.1, 0.9, by = 0.1),
	# additional parameters for analysis.averageCorrelation
	labs = labels,
	numSamples = 10)

GiANT documentation built on Sept. 11, 2024, 9:16 p.m.