DaMiR.normalization: Filter non Expressed and 'Hypervariant' features and Data...

View source: R/Normalization.R

Description

Features are first filtered by their expression level and/or by their variability across samples; the retained features are then normalized.

Usage

DaMiR.normalization(
  data,
  minCounts = 10,
  fSample = 0.5,
  hyper = c("yes", "no"),
  th.cv = 3,
  type = c("vst", "rlog", "logcpm"),
  nFitType = c("parametric", "local", "mean")
)
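
The vector defaults in the signature above (for example hyper = c("yes", "no")) follow a common R convention in which the first element acts as the default, which matches the defaults listed under Arguments. The sketch below illustrates that convention with base R's match.arg(); choose_type is a hypothetical helper written for this illustration, and whether DaMiR.normalization resolves its arguments this way internally is an assumption.

```r
# Hypothetical helper, not part of DaMiRseq: shows how a vector default
# is usually resolved to a single value in R.
choose_type <- function(type = c("vst", "rlog", "logcpm")) {
  # match.arg() returns the first choice when 'type' is omitted and
  # raises an error when the supplied value is not among the choices.
  match.arg(type)
}

default_type  <- choose_type()        # argument omitted: first choice
explicit_type <- choose_type("rlog")  # argument supplied explicitly
```

Passing one value explicitly, as the Examples section does with hyper="yes", avoids relying on this convention.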

Arguments

data

A SummarizedExperiment object

minCounts

Minimum read count; default is 10

fSample

Fraction of samples in which a feature must reach minCounts reads; default is 0.5

hyper

Flag to enable gene filtering by Coefficient of Variation (CV); default is "yes"

th.cv

Minimum CV above which a feature is considered 'Hypervariant' across samples; default is 3

type

Type of normalization to be applied: varianceStabilizingTransformation (vst), rlog or logcpm are allowed; default is "vst"

nFitType

Method used by vst or rlog to estimate the dispersion: "parametric", "local" or "mean"; default is "parametric"

Details

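Before the normalization step, this function allows the user to filter features by expression level (minCounts and fSample) and, when hyper is "yes", by Coefficient of Variation (th.cv).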

Finally, expressed features will be normalized by varianceStabilizingTransformation (default) or rlog, both implemented in the DESeq2 package. We suggest using varianceStabilizingTransformation to speed up the normalization process, because rlog is very time-consuming even though the two methods produce quite similar results.
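
The expression and CV filters controlled by minCounts, fSample and th.cv can be illustrated with a minimal base R sketch on simulated counts. This is not the package's implementation. It assumes a non-strict comparison for the expression filter and a CV (standard deviation divided by mean) computed on raw counts across all samples. The simulated matrix and the names counts, hyper_gene, counts_expr and counts_filt are hypothetical.

```r
set.seed(1)

# Simulated count matrix (hypothetical data): 200 features x 20 samples
n_feat <- 200
n_samp <- 20
counts <- matrix(rnbinom(n_feat * n_samp, size = 1, mu = 20),
                 nrow = n_feat,
                 dimnames = list(paste0("gene", seq_len(n_feat)),
                                 paste0("s", seq_len(n_samp))))

# Plant one feature with a single extreme outlier so that its CV is high
counts[1, ] <- c(rep(10, n_samp - 1), 1e5)
rownames(counts)[1] <- "hyper_gene"

minCounts <- 10
fSample   <- 0.5
th.cv     <- 3

# Expression filter (assumed form): keep features with at least
# minCounts reads in at least fSample of the samples
min_samples <- ceiling(n_samp * fSample)
expressed   <- rowSums(counts >= minCounts) >= min_samples
counts_expr <- counts[expressed, , drop = FALSE]

# CV filter (assumed form): discard features whose CV exceeds th.cv
cv <- apply(counts_expr, 1, sd) / rowMeans(counts_expr)
counts_filt <- counts_expr[cv <= th.cv, , drop = FALSE]

# Number of features before filtering and after each filter
c(all = nrow(counts), expressed = nrow(counts_expr),
  kept = nrow(counts_filt))
```

In this sketch the planted feature passes the expression filter but is discarded by the CV filter, which is the behaviour th.cv is meant to control.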

Value

A SummarizedExperiment object containing the normalized expression matrix (log2 scale) and the data frame with 'class' and, optionally, other variables.

Author(s)

Mattia Chiesa, Luca Piacentini

References

Michael I Love, Wolfgang Huber and Simon Anders (2014): Moderated estimation of fold change and dispersion for RNA-Seq data with DESeq2. Genome Biology 15:550

See Also

varianceStabilizingTransformation, rlog

Examples

# load the package and its example data:
library(DaMiRseq)
data(SE)
# perform normalization on a subset of data (first 1000 features, 6 samples):
SE_sub <- SE[1:1000, c(1:3, 21:23)]
data_norm <- DaMiR.normalization(SE_sub, minCounts = 10, fSample = 0.8,
                                 hyper = "yes", th.cv = 2.5)

BioinfoMonzino/DaMiRseq documentation built on Aug. 22, 2021, 3:11 p.m.