zeroWeightsLS: Estimate ZINB count component posterior probabilities


View source: R/methods.R

Description

Estimate the posterior probability that each count belongs to the count component of a zero-inflated negative binomial (ZINB) model. Internally, edgeR is used to estimate the NB component.
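As a sketch of what this posterior probability looks like under a standard ZINB mixture: the symbols below (mixing proportion \pi, NB mean \mu, NB dispersion \phi, weight w) are notation assumed here for illustration and are not taken from the package documentation.

```latex
f_{\mathrm{ZINB}}(y;\mu,\phi,\pi) = \pi\,\delta_0(y) + (1-\pi)\,f_{\mathrm{NB}}(y;\mu,\phi),
\qquad
w = \frac{(1-\pi)\,f_{\mathrm{NB}}(y;\mu,\phi)}{\pi\,\delta_0(y) + (1-\pi)\,f_{\mathrm{NB}}(y;\mu,\phi)}
```

Here \delta_0(y) equals 1 when y = 0 and 0 otherwise, so under this formulation every positive count has w = 1 and only zeros can receive a weight below 1.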

Usage

zeroWeightsLS(counts, design, maxit = 200, normalization = "TMM", colData,
  designFormula, normFactors = NULL, plot = FALSE, plotW = FALSE,
  verbose = TRUE, designZI = NULL, llTol = 1e-04, llOffset = 1e-06)

Arguments

counts

A count matrix with feature-wise expression values. Values in this matrix must be integers.

design

Design matrix specifying the experimental design.

maxit

The maximum number of iterations of the EM algorithm. Defaults to 200; a larger value may be useful for large datasets (many samples). Convergence can be checked by following the distribution of the posterior probabilities over iterations with plotW. The EM algorithm stops automatically if convergence is reached before the maximum number of iterations.

normalization

The normalization method to use. Can be one of "TMM", "DESeq2" or "phyloseq" (the last is described below under the name "DESeq2_poscounts"). If none of these methods is suitable, global normalization factors can be supplied through the normFactors argument instead. If "TMM", the trimmed mean of M-values normalization (Robinson & Oshlack, 2010) is used, as implemented in edgeR. If "DESeq2", the default median-of-ratios method from the DESeq2 package (Love et al., 2014) is used. If "DESeq2_poscounts", an adapted median-of-ratios method is used; it was first implemented in the phyloseq package (McMurdie & Holmes, 2013) and is now available in DESeq2. The adaptation ensures that genes with zero counts can be used for normalization.

colData

Only applicable if normalization="DESeq2" or normalization="phyloseq". The colData with pheno data for constructing a DESeqDataSet-class object.

designFormula

Only applicable if normalization="DESeq2" or normalization="phyloseq". The design formula required for constructing a DESeqDataSet-class object.

normFactors

A vector of user-supplied global normalization factors for every sample. The normalization factors should be sorted according to the samples in the count matrix.
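A minimal base-R sketch of the ordering requirement, assuming the factors are stored as a named vector. The toy matrix, the sample names and the rawFactors object are hypothetical and only serve to illustrate the idea.

```r
# Toy count matrix: 3 features x 3 samples (values are made up).
counts <- matrix(c(10L, 0L, 5L,
                   3L, 7L, 0L,
                   0L, 2L, 8L),
                 nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("s1", "s2", "s3")))

# Hypothetical user-supplied factors, stored in a different sample order.
rawFactors <- c(s3 = 1.10, s1 = 0.95, s2 = 0.98)

# Reorder by sample name so that element i corresponds to column i of counts.
normFactors <- rawFactors[colnames(counts)]

# Guard against missing or misordered samples before passing the vector on.
stopifnot(!anyNA(normFactors),
          identical(names(normFactors), colnames(counts)))
```

The resulting vector could then be passed as normFactors = normFactors.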

plot

Logical. Should the BCV plot be plotted in every iteration?

plotW

Logical. Should the distribution of posterior probabilities for all zeros in the count matrix be plotted in every iteration?

designZI

The design for the zero-excess model. If NULL, the effective library size (defined as the sequencing depth multiplied by the normalization factors) is used by default.
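The effective library size defined above can be computed by hand as a quick illustration. This is a sketch under stated assumptions: the toy counts and the normFactors values are hypothetical, and it does not show how the function builds the zero-excess design internally.

```r
# Toy count matrix: 3 features x 2 samples (values are made up).
counts <- matrix(c(10L, 0L, 5L,
                   3L, 7L, 0L),
                 nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("s1", "s2")))

# Hypothetical normalization factors, one per sample, in column order.
normFactors <- c(s1 = 0.9, s2 = 1.1)

# Sequencing depth: total count per sample (s1 = 15, s2 = 10).
libSize <- colSums(counts)

# Effective library size: sequencing depth multiplied by the normalization factor.
effLibSize <- libSize * normFactors
```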

llOffset

Offset added to the likelihood to avoid taking the log of 0. Defaults to 1e-06.
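The effect of such an offset can be seen in a small base-R sketch. The likelihood values below are hypothetical, and the exact place where the function applies the offset is an assumption based on the description above.

```r
# Hypothetical per-observation likelihood values; the last one is exactly 0,
# as can happen when a very small density underflows.
lik <- c(0.2, 1e-300, 0)
llOffset <- 1e-6

# Without the offset, a zero likelihood gives -Inf and breaks the summed log-likelihood.
logLikRaw <- log(lik)

# With the offset, every term stays finite.
logLikSafe <- log(lik + llOffset)

sum(logLikRaw)   # -Inf
sum(logLikSafe)  # finite
```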

References

Robinson MD and Oshlack A (2010). "A scaling normalization method for differential expression analysis of RNA-seq data." Genome Biology, 11, pp. 25.

Love MI, Huber W and Anders S (2014). "Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2." Genome Biology, 15, pp. 550.

McMurdie PJ and Holmes S (2013). "phyloseq: An R package for reproducible interactive analysis and graphics of microbiome census data." PLoS ONE, 8(4), pp. e61217.

Examples

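library(zingeR)
library(Biobase)  # provides exprs() and pData()
data(islamEset, package = "zingeR")
# Subset to the first 2000 features
islam <- exprs(islamEset)[1:2000, ]
design <- model.matrix(~ pData(islamEset)[, 1])
zeroWeights <- zeroWeightsLS(counts = islam, design = design, maxit = 200)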

statOmics/zingeR documentation built on May 20, 2019, 6:48 p.m.