estimateNBParam: Estimate simulation parameters


Description

This function estimates and returns parameters needed for power simulations assuming a negative binomial read count distribution.
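As a minimal sketch of the distributional assumption (base R only, not part of the package): a negative binomial count with mean mu and dispersion phi has variance mu + phi * mu^2, and the size parameter used by rnbinom is 1/phi. The values of mu and phi below are arbitrary illustrations.

```r
set.seed(1)

mu   <- 50       # illustrative mean count
phi  <- 0.4      # illustrative dispersion
size <- 1 / phi  # size parameter of the negative binomial

# draw counts using the mean/size parameterisation of rnbinom
x <- rnbinom(1e5, mu = mu, size = size)

# theoretical vs. empirical variance
theo.var <- mu + phi * mu^2
emp.var  <- var(x)

# probability of observing a zero count
p0.theo   <- dnbinom(0, mu = mu, size = size)
p0.closed <- (size / (size + mu))^size   # closed form of the same quantity
p0.emp    <- mean(x == 0)
```

With a large number of draws, emp.var should be close to theo.var and p0.emp close to p0.theo.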

Usage

estimateNBParam(countData, cData=NULL, design=NULL,
                RNAseq=c("bulk", "singlecell"),
                estFramework=c('edgeR', 'DESeq2', 'MatchMoments'),
                sigma=1.96)

Arguments

countData

A count matrix. Rows correspond to genes, columns to samples.

cData

A data.frame with at least one column. Rows of cData must correspond to columns of countData. Default is NULL, i.e. the dispersion estimation is 'blind' to sample information; see varianceStabilizingTransformation.

design

A formula expressing how the counts for each gene depend on the variables in cData. Designs with multiple covariates and/or interactions are supported. Default is NULL, i.e. no design information is considered.

RNAseq

A character value: "bulk" or "singlecell".

estFramework

A character value: "edgeR", "DESeq2" or "MatchMoments". "edgeR" and "DESeq2" use edgeR-style and DESeq2-style mean and dispersion estimation, respectively; for details, see estimateDisp and estimateDispersions. "MatchMoments" uses moment matching to estimate mean, dispersion and dropout.

sigma

The width of the variability band around the mean-dispersion loess fit, which defines the prediction interval for read count simulation. Default is 1.96, i.e. a 95% interval. For more information, see loess.sd.
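To illustrate how these arguments fit together, here is a small sketch using base R only. The sample names, the group column and the formula ~ group are made up for illustration. The call to estimateNBParam is left commented out because it requires the powsim package and a realistically sized count matrix.

```r
set.seed(42)

# toy count matrix: 6 genes (rows) x 4 samples (columns)
cnts <- matrix(rnbinom(6 * 4, mu = 20, size = 2), ncol = 4,
               dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4)))

# hypothetical sample annotation: one row per column of the count matrix
cData <- data.frame(group = factor(c("A", "A", "B", "B")),
                    row.names = colnames(cnts))

# hypothetical design using the annotation column
design <- ~ group

# rows of cData must line up with the columns of countData
stopifnot(identical(rownames(cData), colnames(cnts)))
# every variable in the design must be a column of cData
stopifnot(all(all.vars(design) %in% colnames(cData)))

# the default sigma of 1.96 is the two-sided 95% standard normal quantile
sigma <- qnorm(0.975)

# estparam <- estimateNBParam(cnts, cData = cData, design = design,
#                             RNAseq = "bulk", estFramework = "edgeR",
#                             sigma = 1.96)
```

Checking the row/column correspondence before estimation is a cheap way to catch sample annotations that are in a different order than the count matrix.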

Value

A list with the following elements:

seqDepth

Library size, i.e. total number of reads per library

means

Mean normalized read counts per gene.

dispersion

Dispersion estimate per gene.

common.dispersion

The common dispersion estimate over all genes.

size

Size parameter of the negative binomial distribution, i.e. 1/dispersion.

p0

Probability that the count will be zero per gene.

meansizefit

A loess fit relating log2 mean to log2 size for use in simulating new data (loess.sd).

meandispfit

A fit relating log2 mean to log2 dispersion used for visualizing mean-variance dependency (loess.sd).

p0.cut

The knee point of meanp0fit. Log2 mean values above that value have virtually no dropouts.

grand.dropout

The percentage of empty (zero) entries in the count matrix.

sf

The estimated library size factor per sample.

totalS,totalG

Number of samples and genes provided.

estS,estG

Number of samples and genes for which parameters can be estimated.

RNAseq

The type of RNAseq: bulk or single cell.

estFramework

The estimation framework for NB parameters.

sigma

The width of the variability band.
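Several of the returned quantities can be approximated by hand with simple moment calculations. The sketch below uses base R only and is not the package's implementation. It assumes plain library-size scaling for the size factors and a single true mean and dispersion for all genes, both chosen purely for illustration.

```r
set.seed(123)

ngenes    <- 2000
nsamples  <- 50
true.mean <- 100   # illustrative mean shared by all genes
true.disp <- 0.2   # illustrative dispersion shared by all genes

cnts <- matrix(rnbinom(ngenes * nsamples, mu = true.mean, size = 1 / true.disp),
               ncol = nsamples)

# library size: total number of reads per sample
seqDepth <- colSums(cnts)

# assumption: size factors as library size relative to the average library
sf <- seqDepth / mean(seqDepth)
norm.cnts <- sweep(cnts, 2, sf, "/")

# per-gene mean and variance of the normalized counts
means <- rowMeans(norm.cnts)
vars  <- apply(norm.cnts, 1, var)

# moment estimate from var = mean + dispersion * mean^2;
# this can be zero or negative for genes with little variability
dispersion <- (vars - means) / means^2
size <- 1 / dispersion

# per-gene fraction of zero counts and overall fraction of zeros
p0 <- rowMeans(cnts == 0)
grand.dropout <- mean(cnts == 0)
```

Here median(dispersion) should land near true.disp. Note that grand.dropout is computed as a fraction between 0 and 1 in this sketch.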

Author(s)

Beate

Examples

## Not run: 
## simulating single cell RNA-seq experiment
ngenes <- 10000
ncells <- 100
true.means <- 2^runif(ngenes, 3, 6)
true.dispersions <- 3 + 100/true.means
sf.values <- 2^rnorm(ncells, sd=0.5)
sf.means <- outer(true.means, sf.values, '*')
cnts <- matrix(rnbinom(ngenes*ncells,
                       mu=sf.means, size=1/true.dispersions),
               ncol=ncells)
## estimating negative binomial parameters
estparam <- estimateNBParam(cnts, RNAseq = 'singlecell',
                            estFramework = 'MatchMoments', sigma=1.96)
plotNBParam(estparam)

## simulating bulk RNA-seq experiment
ngenes <- 10000
nsamples <- 10
true.means <- 2^rnorm(ngenes, mean=8, sd=2)
true.dispersions <- rgamma(ngenes, 2, 6)
sf.values <- rnorm(nsamples, mean=1, sd=0.1)
sf.means <- outer(true.means, sf.values, '*')
cnts <- matrix(rnbinom(ngenes*nsamples,
                       mu=sf.means, size=1/true.dispersions),
               ncol=nsamples)
## estimating negative binomial parameters
estparam <- estimateNBParam(cnts, RNAseq = 'bulk',
                            estFramework = 'MatchMoments', sigma=1.96)
plotNBParam(estparam)

## End(Not run)

bvieth/powsim documentation built on May 13, 2019, 9:04 a.m.