tarq: An Accurate Normalization Method for High Throughput...

View source: R/tarq.R

tarq    R Documentation

An Accurate Normalization Method for High Throughput Sequencing Data

Description

Estimates scaling factors using the trimmed average of ratios of quantiles (TARQ) method

Usage

tarq(X, tau = 0.3)

Arguments

X

a matrix of raw counts, with rows for taxa (or genes or transcripts) and columns for samples

tau

a numerical value in (0, 0.5). The largest (tau/2 × 100)% and the smallest (tau/2 × 100)% of the ratios of quantiles are trimmed, so the default tau = 0.3 trims 15% from each tail
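As a small illustration of what trimming tau/2 from each tail means, the sketch below uses base R's mean() with its trim argument on made-up ratios. Whether tarq calls mean(trim = ) internally is an assumption; the point is only to show how many values are discarded for a given tau.

```r
# Made-up ratios with one extreme value in each tail (illustration only)
ratios <- c(0.01, 0.8, 0.9, 1.0, 1.0, 1.1, 1.1, 1.2, 1.3, 50)
tau <- 0.2

# Number of values dropped from each tail: floor(n * tau / 2)
n_cut <- floor(length(ratios) * tau / 2)

# Trimmed mean via base R; trim is the fraction removed from EACH end
trimmed <- mean(ratios, trim = tau / 2)

# The same quantity computed by hand on the sorted values
kept <- sort(ratios)[(n_cut + 1):(length(ratios) - n_cut)]
manual <- mean(kept)

print(c(untrimmed = mean(ratios), trimmed = trimmed, manual = manual))
```

With tau = 0.2 and ten ratios, one value is removed from each tail, so the two extreme ratios no longer influence the average.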

Details

Estimating scaling factors for next-generation sequencing (NGS) read count data is challenging. TARQ is a quantile-based method for this task. It first orders the raw counts within each sample and constructs a reference sample from the ordered counts. To compute the scaling factor for a sample, it forms the ratios of that sample's quantiles to those of the reference sample. Zero ratios are removed, the extreme ratios (the largest and smallest, as controlled by tau) are trimmed, and the scaling factor is the average of the remaining ratios.
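The steps above can be sketched in a few lines of base R. This is a hypothetical re-implementation for illustration, not the iGasso source: the function name tarq_sketch is invented, and the text does not say how the reference sample is built, so the sketch assumes it is the across-sample mean of the ordered counts.

```r
# Hypothetical sketch of the TARQ steps; NOT the iGasso implementation.
tarq_sketch <- function(X, tau = 0.3) {
  stopifnot(is.matrix(X), tau > 0, tau < 0.5)

  # Step 1: order the raw counts within each sample (column)
  X_sorted <- apply(X, 2, sort)

  # Step 2 (ASSUMPTION): reference sample = mean of the ordered counts
  # across samples, one value per quantile position
  ref <- rowMeans(X_sorted)

  # Step 3: per sample, form ratios of quantiles, drop zeros, trim, average
  apply(X_sorted, 2, function(q) {
    r <- q / ref
    # Remove zero ratios and undefined ones (0/0 when the reference is 0)
    r <- r[is.finite(r) & r > 0]
    # trim = tau/2 removes that fraction from each tail
    mean(r, trim = tau / 2)
  })
}

# Toy data: sample s2 is sequenced exactly twice as deeply as s1
counts <- c(0, 1, 2, 5, 10, 40)
X <- cbind(s1 = counts, s2 = 2 * counts)
sf <- tarq_sketch(X, tau = 0.3)
print(sf)
```

On this toy matrix the second scaling factor is twice the first, which is the behaviour one would expect from any depth-based scaling method.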

Value

a vector of scaling factors, one per sample (column of X). Normalized counts can be obtained by sweep(X, 2, scale.factors, FUN="/"), where scale.factors is the vector returned by tarq
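The sweep() call divides every column of the count matrix by that column's scaling factor. A minimal sketch on a toy matrix follows; the scaling factors here are made-up values chosen for easy arithmetic, not output of tarq.

```r
# Toy count matrix: 3 taxa (rows) by 2 samples (columns)
X <- matrix(c(10, 20, 30,
              40,  5, 10),
            nrow = 3,
            dimnames = list(c("t1", "t2", "t3"), c("A", "B")))

# Made-up scaling factors, one per sample (illustration only)
scale.factors <- c(A = 2, B = 5)

# MARGIN = 2 applies the division column by column
X.norm <- sweep(X, 2, scale.factors, FUN = "/")
print(X.norm)

# Equivalent formulation using recycling on the transposed matrix
X.norm2 <- t(t(X) / scale.factors)
```

The length of the scaling factor vector must equal ncol(X), since each factor is paired with one sample.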

Author(s)

Kai Wang <kai-wang@uiowa.edu>

References

Wang, K. (2018) An Accurate Normalization Method for Next-Generation Sequencing Data. Submitted.

Examples


# These examples need the GUniFrac (throat data sets), DESeq2 and edgeR packages
library(iGasso)
library(GUniFrac)
data(throat.otu.tab)
data(throat.meta)
# tarq() expects taxa in rows and samples in columns
otu.tab <- t(throat.otu.tab)
tarq(otu.tab, 0.3)

##### Use TARQ with DESeq2
library(DESeq2)
dds <- DESeqDataSetFromMatrix(countData = otu.tab,
                              colData = throat.meta,
                              design = ~ SmokingStatus)
# Replace the default size factors with the TARQ scaling factors
sizeFactors(dds) <- tarq(otu.tab, 0.3)
dds <- DESeq(dds)
results(dds)

##### Use TARQ with edgeR
library(edgeR)
cs <- colSums(otu.tab)
scale.factors <- tarq(otu.tab, 0.3)
# Express the factors relative to library size, rescaled to geometric mean 1
tmp <- scale.factors/cs
norm.factors <- tmp/exp(mean(log(tmp)))
dgList <- DGEList(counts = otu.tab, genes = rownames(otu.tab), norm.factors = norm.factors)
designMat <- model.matrix(~ throat.meta$SmokingStatus)
dgList <- estimateGLMCommonDisp(dgList, design = designMat)
fit <- glmFit(dgList, designMat)
glmLRT(fit, coef = 2)


iGasso documentation built on Aug. 8, 2023, 5:11 p.m.