normalize: Normalization of microarray and RNA-seq expression data

Description

This function wraps commonly used functionality from limma for microarray normalization and from EDASeq for RNA-seq normalization.

Usage

    normalize(eset, norm.method = "quantile", within = FALSE,
        data.type = c(NA, "ma", "rseq"))

Arguments

eset

An object of class SummarizedExperiment.

norm.method

Determines how the expression data should be normalized. For available microarray normalization methods see the man page of the limma function normalizeBetweenArrays. For available RNA-seq normalization methods see the man page of the EDASeq function betweenLaneNormalization. Defaults to 'quantile', i.e. normalization is carried out so that quantiles between arrays/lanes/samples are equal. See details.

within

Logical. Only taken into account if data.type='rseq'. Determines whether GC content normalization should be carried out (as implemented in the EDASeq function withinLaneNormalization). Defaults to FALSE. See details.

data.type

Expression data type. Use 'ma' for microarray and 'rseq' for RNA-seq data. If NA, data.type is automatically guessed. If the expression values in 'eset' are decimal numbers they are assumed to be microarray intensities. Whole numbers are assumed to be RNA-seq read counts. Defaults to NA.
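The guessing rule for data.type can be illustrated with a minimal base R sketch. The helper guessDataTypeSketch is hypothetical and is not part of EnrichmentBrowser; it only mirrors the rule stated above (whole numbers are taken as read counts, decimal numbers as intensities) and operates on a plain matrix rather than on a SummarizedExperiment.

```r
# Hypothetical helper (an assumption for illustration, not the package's code):
# guess the data type of an expression matrix from its values.
guessDataTypeSketch <- function(expr)
{
    vals <- expr[!is.na(expr)]
    # only whole numbers -> RNA-seq read counts; otherwise -> microarray intensities
    if(all(vals == round(vals))) "rseq" else "ma"
}

set.seed(1)
ma.expr <- matrix(rnorm(20, mean=8, sd=2), nrow=5)     # decimal intensities
rseq.expr <- matrix(rpois(20, lambda=50), nrow=5)      # integer read counts

guessDataTypeSketch(ma.expr)
guessDataTypeSketch(rseq.expr)
```

Under this rule, data that has already been transformed (for example log-transformed counts) would be classified as microarray data, so setting data.type explicitly is the safer choice in such cases.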

Details

Normalization of high-throughput expression data is essential to make results within and between experiments comparable. Microarray data (intensity measurements) and RNA-seq data (read counts) typically exhibit distinct features that need to be normalized for. For specific needs that deviate from these standard normalizations, the user should refer to more specialized functions and packages.
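To make the default 'quantile' method concrete, here is a minimal base R sketch of quantile normalization between samples. The function quantileNormalizeSketch is hypothetical and written for illustration only; it assumes a numeric matrix without missing values, and the limma and EDASeq implementations may treat ties and missing values differently.

```r
# Hypothetical illustration (not the limma or EDASeq implementation):
# force all columns (samples) of a matrix to share the same distribution.
quantileNormalizeSketch <- function(expr)
{
    # rank of each value within its column; ties are broken by order of appearance
    ranks <- apply(expr, 2, rank, ties.method="first")
    # reference distribution: mean of the sorted values across columns
    ref <- rowMeans(apply(expr, 2, sort))
    # replace each value by the reference value at its rank
    norm <- apply(ranks, 2, function(r) ref[r])
    dimnames(norm) <- dimnames(expr)
    norm
}

# toy data: 4 genes (rows), 3 samples (columns)
expr <- matrix(c(5, 2, 3, 4,
                 4, 1, 4, 2,
                 3, 4, 6, 8), nrow=4)
norm <- quantileNormalizeSketch(expr)

# after normalization, the sorted values are identical in every column
apply(norm, 2, sort)
```

Within each sample the ordering of the genes is preserved; only the values attached to the ranks change.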

Microarray data is expected to be single-channel. For two-color arrays, normalization within arrays is expected to have been carried out already, e.g. using normalizeWithinArrays from limma.

RNA-seq data is expected to be raw read counts. Please note that normalization prior to downstream DE analysis, e.g. with edgeR and DESeq, is not strictly necessary (and in some cases even discouraged), as many of these tools implement their own normalization approaches. See the vignettes of EDASeq, edgeR, and DESeq for details.

Value

An object of class SummarizedExperiment.

Author(s)

Ludwig Geistlinger <[email protected]>

See Also

read.eset for reading expression data from file;

normalizeWithinArrays and normalizeBetweenArrays for normalization of microarray data;

withinLaneNormalization and betweenLaneNormalization for normalization of RNA-seq data.

Examples

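    # load the package; SummarizedExperiment provides rowData()
    library(EnrichmentBrowser)
    library(SummarizedExperiment)

    #
    # (1) simulating expression data: 100 genes, 12 samples
    #

    # (a) microarray data: intensity measurements
    ma.eset <- make.example.data(what="eset", type="ma")

    # (b) RNA-seq data: read counts
    rseq.eset <- make.example.data(what="eset", type="rseq")

    #
    # (2) Normalization
    #

    # (a) microarray: quantile normalization between arrays (default)
    norm.eset <- normalize(ma.eset)

    # (b) RNA-seq: quantile normalization between samples (default)
    norm.eset <- normalize(rseq.eset)

    # ... normalize also for GC content:
    # simulate one GC content value per gene and store it in the row data
    gc.content <- rnorm(100, 0.5, sd=0.1)
    rowData(rseq.eset)$gc <- gc.content

    norm.eset <- normalize(rseq.eset, within=TRUE)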

EnrichmentBrowser documentation built on Nov. 17, 2017, 9:39 a.m.