get_differential_expression_values: Retrieve differential expression results

get_differential_expression_valuesR Documentation

Retrieve differential expression results

Description

Retrieves the differential expression result set(s) associated with the dataset. To get more information about the contrasts in individual resultSets and annotation terms associated them, use get_dataset_differential_expression_analyses()

Usage

get_differential_expression_values(
  dataset = NA_character_,
  resultSets = NA_integer_,
  keepNonSpecific = FALSE,
  readableContrasts = FALSE,
  memoised = getOption("gemma.memoised", FALSE)
)

Arguments

dataset

A dataset identifier.

resultSets

resultSet identifiers. If a dataset is not provided, all result sets will be downloaded. If it is provided it will only be used to ensure all result sets belong to the dataset.

keepNonSpecific

logical. FALSE by default. If TRUE, results from probesets that are not specific to the gene will also be returned.

readableContrasts

If FALSE (default), the returned columns will use internal constrasts IDs as names. Details about the contrasts can be accessed using get_dataset_differential_expression_analyses. If TRUE IDs will be replaced with human readable contrast information.

memoised

Whether or not to save to cache for future calls with the same inputs and use the result saved in cache if a result is already saved. Doing options(gemma.memoised = TRUE) will ensure that the cache is always used. Use forget_gemma_memoised to clear the cache.

Details

In Gemma each result set corresponds to the estimated effects associated with a single factor in the design, and each can have multiple contrasts (for each level compared to baseline). Thus a dataset with a 2x3 factorial design will have two result sets, one of which will have one contrast, and one having two contrasts.

The methodology for differential expression is explained in Curation of over 10000 transcriptomic studies to enable data reuse. Briefly, differential expression analysis is performed on the dataset based on the annotated experimental design with up two three potentially nested factors. Gemma attempts to automatically assign baseline conditions for each factor. In the absence of a clear control condition, a baseline is arbitrarily selected. A generalized linear model with empirical Bayes shrinkage of t-statistics is fit to the data for each platform element (probe/gene) using an implementation of the limma algorithm. For RNA-seq data, we use weighted regression, applying the voom algorithm to compute weights from the mean–variance relationship of the data. Contrasts of each condition are then computed compared to the selected baseline. In some situations, Gemma will split the data into subsets for analysis. A typical such situation is when a ‘batch’ factor is present and confounded with another factor, the subsets being determined by the levels of the confounding factor.

Value

A list of data tables with differential expression values per result set.

Examples

get_differential_expression_values("GSE2018")

jsicherman/Gemma-API documentation built on Dec. 15, 2024, 9:01 a.m.