normaliseGeneExpression: Filter and normalise gene expression

View source: R/data_geNormalisationFiltering.R

normaliseGeneExpression    R Documentation

Filter and normalise gene expression

Description

Gene expression is filtered and normalised in the following steps:

  • Filter gene expression;

  • Normalise gene expression with calcNormFactors;

  • If performVoom = FALSE, compute counts per million (CPM) using cpm and log2-transform values if log2transform = TRUE;

  • If performVoom = TRUE, use voom to compute log2-CPM, quantile-normalise (if method = "quantile") and estimate the mean-variance relationship to calculate observation-level weights.
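The steps above can be sketched directly with edgeR and limma. This is a minimal illustration on simulated counts, not the function's actual implementation: the toy matrix, the row-sum filtering rule and the variable names are assumptions made for the example. Only DGEList, calcNormFactors, cpm and voom are real edgeR/limma functions.

```r
library(edgeR)  # DGEList(), calcNormFactors(), cpm(); also attaches limma
library(limma)  # voom()

# Toy count matrix (hypothetical data): 200 genes x 4 samples
set.seed(1)
counts <- matrix(rnbinom(800, mu = 100, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200),
                                 paste0("sample", 1:4)))

# Step 1: filter genes using a logical vector (assumed rule: row sum >= 15)
keep <- rowSums(counts) >= 15
dge  <- DGEList(counts = counts[keep, , drop = FALSE])

# Step 2: compute normalisation (scaling) factors
dge <- calcNormFactors(dge, method = "TMM", p = 0.75)

# Step 3a (as with performVoom = FALSE): log2-CPM with a prior count
logCPM <- cpm(dge, log = TRUE, prior.count = 0.25)

# Step 3b (as with performVoom = TRUE): log2-CPM plus observation-level
# weights; pass normalize.method = "quantile" to quantile-normalise instead
v <- voom(dge, normalize.method = "none")
```

Here logCPM is a plain matrix, whereas v is a limma EList whose E element holds the log2-CPM values and whose weights element holds the observation-level weights.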

Usage

normaliseGeneExpression(
  geneExpr,
  geneFilter = NULL,
  method = "TMM",
  p = 0.75,
  log2transform = TRUE,
  priorCount = 0.25,
  performVoom = FALSE
)

normalizeGeneExpression(
  geneExpr,
  geneFilter = NULL,
  method = "TMM",
  p = 0.75,
  log2transform = TRUE,
  priorCount = 0.25,
  performVoom = FALSE
)

Arguments

geneExpr

Matrix or data frame: gene expression

geneFilter

Boolean: vector indicating which genes to keep after filtering (if NULL, skip filtering)

method

Character: normalisation method; one of TMM, RLE, upperquartile, none or quantile (see Details)

p

Numeric: value between 0 and 1 specifying which quantile of the counts is used when method = "upperquartile"

log2transform

Boolean: perform log2-transformation?

priorCount

Numeric: average count to add to each observation to avoid taking the logarithm of zero

performVoom

Boolean: perform mean-variance modelling (using voom)?

Details

edgeR::calcNormFactors is used to normalise gene expression if method is TMM, RLE, upperquartile or none. If performVoom = TRUE, voom itself only normalises when method = "quantile".

Available normalisation methods:

  • TMM is recommended for most RNA-seq data where more than half of the genes are believed not to be differentially expressed between any pair of samples;

  • RLE calculates the median library from the geometric mean of all columns; the median ratio of each sample to the median library is then taken as the scale factor;

  • upperquartile calculates the scale factors from a given quantile of the counts for each library, after removing genes with zero counts in all libraries;

  • quantile forces the entire empirical distribution of each column to be identical (only performed if performVoom = TRUE).
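To make the quantile method concrete, the following base R sketch forces every column of a small matrix onto the same empirical distribution. It is a simplified illustration only: quantileNormalise is a hypothetical helper written for this example, it breaks ties by order of appearance, and it is not the routine that voom calls internally.

```r
# Simplified quantile normalisation of the columns of a numeric matrix
# (hypothetical helper; ties are broken by order of appearance)
quantileNormalise <- function(x) {
    ranks      <- apply(x, 2, rank, ties.method = "first")
    sortedCols <- apply(x, 2, sort)
    target     <- rowMeans(sortedCols)  # mean of each quantile across columns
    normalised <- apply(ranks, 2, function(r) target[r])
    dimnames(normalised) <- dimnames(x)
    normalised
}

x  <- cbind(a = c(5, 2, 3, 4), b = c(4, 1, 4, 2), c = c(3, 4, 6, 8))
qn <- quantileNormalise(x)

# After normalisation, every column contains the same set of values
apply(qn, 2, sort)
```

Each value is replaced by the mean of the values sharing its rank across columns, so the columns differ only in the order of their entries.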

Value

Filtered and normalised gene expression

See Also

Other functions for gene expression pre-processing: convertGeneIdentifiers(), filterGeneExpr(), plotGeneExprPerSample(), plotLibrarySize(), plotRowStats()

Examples

geneExpr <- readFile("ex_gene_expression.RDS")
normaliseGeneExpression(geneExpr)

nuno-agostinho/psichomics documentation built on Feb. 11, 2024, 11:16 p.m.