significance: Significance assessment

SignificanceAssessment    R Documentation

Significance assessment

Description

Functions to assess the significance of gene set statistics, as used in the significance parameter of gsAnalysis. Each function applies the same analysis to randomly modified data sets or gene sets and compares the resulting statistic values to the statistic value of the original gene set.
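The comparison with the original statistic value can be illustrated with a minimal base R sketch. The helper name empiricalPValue and the toy numbers are hypothetical, and the package itself may compute p-values differently (for example with another test alternative or a correction term).

```r
# Hypothetical helper (not part of the package): fraction of null values
# that are at least as large as the observed gene set statistic.
# This corresponds to a one-sided "greater" alternative.
empiricalPValue <- function(observed, nullValues) {
  mean(nullValues >= observed)
}

set.seed(1)
# Toy null distribution standing in for the statistic values obtained
# from randomly modified data sets or gene sets
nullValues <- rnorm(1000)
# Toy statistic value of the original gene set
observed <- 2.5

p <- empiricalPValue(observed, nullValues)
```

A small p indicates that few of the randomly generated statistic values reach the value of the original gene set.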

Usage

significance.sampling(
	...,
	dat,
	geneSet,
	analysis,
	glsValues,
	numSamples = 1000)

significance.permutation(
	...,
	dat,
	geneSet,
	analysis,
	glsValues,
	numSamples = 1000,
	labs)

significance.restandardization(
	...,
	dat,
	geneSet,
	analysis,
	glsValues,
	numSamples = 1000,
	labs)

Arguments

...

Additional parameters for the different steps of the analysis pipeline, depending on the concrete configuration supplied in analysis.

dat

The original data set as a numeric matrix of gene expression values for all analyzed genes. Here, each row corresponds to one gene, and each column corresponds to one sample. The rows must be named with the gene names used in the gene sets.

geneSet

The original gene set in the form of a vector of gene names corresponding to the row names of dat.
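As a minimal sketch of the expected input shapes, the following base R code builds a toy dat matrix and a matching geneSet. The gene and sample names, the dimensions, and the variable allPresent are made up for illustration.

```r
set.seed(42)
# 20 genes (rows) x 6 samples (columns); all names are illustrative
dat <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6,
              dimnames = list(paste0("gene", 1:20), paste0("sample", 1:6)))

# A gene set is a character vector of gene names taken from rownames(dat)
geneSet <- c("gene1", "gene5", "gene12")

# Sanity check: every gene of the set must be found among the row names
allPresent <- all(geneSet %in% rownames(dat))
```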

analysis

The analysis applied to the original gene set, which is also applied to the modified gene sets. This is an object of type gsAnalysis as produced by the function gsAnalysis.

glsValues

A vector containing the (possibly transformed) gene-level statistic values for each gene in the original data set dat.

numSamples

The number of random samples that should be taken to calculate the null distribution for the significance assessment. The default is 1000 for each test.

labs

A vector of class labels for the samples (columns) in dat. Used by significance.permutation and significance.restandardization.

Details

Standard methods for the significance assessment of a gene set statistic (to be used in an analysis pipeline defined by gsAnalysis):

  • significance.sampling: This function repeatedly draws random gene sets. Their gene set statistic values form the null distribution.

  • significance.permutation: This function repeatedly permutes the labels of the data set. The gene set statistic values for the original gene set on the permuted data set form the null distribution.

  • significance.restandardization: This function applies both a gene set sampling and a label permutation. The permutation statistic values are standardized by their mean and standard deviation and then restandardized based on the gene set sampling statistic values. These restandardized values form the null distribution (Efron and Tibshirani, 2007).
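The three procedures above can be sketched in base R on toy data. This is an illustration of the description on this page, not the package's implementation: the functions gls and gss, the choice of statistics (absolute difference of class means, mean over the gene set), and all data are assumptions made for the example.

```r
set.seed(123)
numSamples <- 200

# Toy data: 50 genes x 10 samples, two classes of 5 samples each
dat <- matrix(rnorm(50 * 10), nrow = 50,
              dimnames = list(paste0("g", 1:50), NULL))
labs <- rep(c(0, 1), each = 5)
geneSet <- paste0("g", 1:8)

# Assumed gene-level statistic: absolute difference of class means
gls <- function(dat, labs) {
  abs(rowMeans(dat[, labs == 1, drop = FALSE]) -
      rowMeans(dat[, labs == 0, drop = FALSE]))
}
# Assumed gene set statistic: mean of the gene-level values in the set
gss <- function(glsValues, geneSet) mean(glsValues[geneSet])

glsValues <- gls(dat, labs)

# 1) Sampling: random gene sets of the same size, original labels
samplingNull <- replicate(numSamples,
  gss(glsValues, sample(rownames(dat), length(geneSet))))

# 2) Permutation: original gene set, randomly permuted labels
permutationNull <- replicate(numSamples,
  gss(gls(dat, sample(labs)), geneSet))

# 3) Restandardization: standardize the permutation values by their own
#    mean and standard deviation, then rescale them to the mean and
#    standard deviation of the sampling values
restandNull <- (permutationNull - mean(permutationNull)) /
  sd(permutationNull) * sd(samplingNull) + mean(samplingNull)
```

By construction, restandNull keeps the shape of the permutation distribution but has the location and spread of the sampling distribution.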

Value

significance.sampling returns a list with the following elements:

gssValues

A vector of gene set statistic values, one entry per sample.

randomGeneSets

A matrix containing the gene sets which were sampled randomly from the set of all genes.

significance.permutation returns a list with the following elements:

gssValues

A vector of gene set statistic values, one entry per permutation.

permutations

A matrix, where each column contains the indices of one permutation.
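A matrix of this shape can be sketched in base R as follows. The label vector and the number of permutations are toy values, and applying a permutation as labs[permutations[, i]] is an assumption about how the indices are meant to be used.

```r
set.seed(7)
labs <- c("A", "A", "A", "B", "B", "B")
numSamples <- 4

# Each column holds one random ordering of the sample indices 1..6
permutations <- replicate(numSamples, sample(length(labs)))

# Class labels implied by the first permutation (assumed usage)
permutedLabs <- labs[permutations[, 1]]
```

Storing the indices rather than the permuted labels makes it possible to reproduce each permuted data set later.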

significance.restandardization returns a list with the following elements:

gssValues

A vector of gene set statistics, one entry per sample.

samplingValues

A list of sub-lists, each containing one sampling result as defined above.

permutationValues

A list of sub-lists, each containing one permutation result as defined above.

References

Efron, B., Tibshirani, R. (2007) On testing the significance of sets of genes. Annals of Applied Statistics, 1, 107-129.

See Also

geneSetAnalysis, gsAnalysis, hist.gsaResult


GiANT documentation built on Sept. 11, 2024, 9:16 p.m.