simuComp: Simulate data, run the four methods, and compare their p-values.

Description

Simulate a count table that follows a negative binomial or Poisson distribution, then apply limma voom, limma trans, intSEQ, and the quasi-likelihood method to the simulated data.

Usage

simuComp1(mle, nsamp = 3, lfc, null = FALSE, lambda1 = 2 * nsamp, lambda2 = 0.05,
          normalize = TRUE, sdd = 0.3631473, constadj,
          w1 = max(1 - 2 * nsamp / 100, 0), w2 = max(1 - 2 * nsamp / 1000, 0))

simuComp(intres, nsamp = 3, nullgroup = NULL, ntime = 100, null = FALSE,
         lambda1 = 2 * nsamp, lambda2 = 0.05, normalize = FALSE,
         levels = c(1e-6, 1e-5, 1e-4, 1e-3, 0.01, 0.05, 0.1), fdrlevel = 0.1,
         small = NULL, medium = NULL, large = NULL, sdd = 0.3631473,
         constadj = FALSE, w1 = max(1 - 2 * nsamp / 100, 0),
         w2 = max(1 - 2 * nsamp / 1000, 0))
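Several defaults in the usage above depend on nsamp. The sketch below simply evaluates those default expressions for a few group sizes to show how they behave; the helper name default_weights is hypothetical and not part of the package.

```r
# Evaluate the nsamp-dependent default expressions from the usage block.
# default_weights() is an illustrative helper, not a package function.
default_weights <- function(nsamp) {
  c(nsamp   = nsamp,
    lambda1 = 2 * nsamp,
    w1      = max(1 - 2 * nsamp / 100, 0),
    w2      = max(1 - 2 * nsamp / 1000, 0))
}

# One row per group size; both weights shrink toward 0 as nsamp grows.
defaults <- t(sapply(c(3, 10, 50, 500), default_weights))
print(defaults)
```

With the default nsamp = 3 this gives lambda1 = 6, w1 = 0.94, and w2 = 0.994; w1 reaches 0 at nsamp = 50 and w2 at nsamp = 500.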

Arguments

mle

A two-column matrix giving the means (first column) and dispersions (second column) for the baseline group.

intres

An object returned by intSEQ.

nullgroup

A logical vector whose length equals the number of columns of count.data, indicating which samples belong to the baseline group.

ntime

The number of simulations to be conducted.

fdrlevel

The false discovery rate (FDR) threshold.

nsamp

The number of samples per group.

lfc

The log2 fold change used to determine the mean of the second group.

null

Logical; whether the second group is simulated to be equal to the first (the null condition).

lambda1

The first parameter for the prior variance.

lambda2

The second parameter for the prior variance.

normalize

Whether to normalize the data.

levels

The p-value thresholds. Should be a numeric vector.

small

A logical vector with one entry per gene, indicating which genes have a small between-group difference. If not specified, genes with a fold change (larger divided by smaller) below 1.5 are selected.

medium

A logical vector with one entry per gene, indicating which genes have a medium between-group difference. If not specified, genes with a fold change (larger divided by smaller) between 1.5 and 2 are selected.

large

A logical vector with one entry per gene, indicating which genes have a large between-group difference. If not specified, genes with a fold change (larger divided by smaller) above 2 are selected.

sdd

A numeric scalar. The normalizing factor is assumed to be distributed as exp(Norm(0, sdd)). The default value is estimated from the Montgomery data.

constadj

Logical; whether to multiply by a constant to counter numerical underflow in the joint distribution.

w1

See Details in intSEQ.

w2

See Details in intSEQ.
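As a rough illustration of the default rule described for small, medium, and large, the sketch below classifies genes by fold change using the 1.5 and 2 cut-offs. The lfc values are made up, the variable names are hypothetical, and how the package treats values exactly at a boundary is an assumption here.

```r
# Hypothetical per-gene log2 fold changes (illustrative values only).
lfc <- c(0.2, -0.5, 0.8, -0.9, 1.5, -2.0)

# Fold change as "larger divided by smaller", so it is always >= 1.
fc <- 2^abs(lfc)

# Default categories from the argument descriptions; the treatment of
# values exactly equal to 1.5 or 2 is an assumption of this sketch.
small  <- fc < 1.5
medium <- fc >= 1.5 & fc < 2
large  <- fc >= 2

# One logical column per category, one row per gene.
gene_class <- data.frame(small, medium, large)
print(gene_class)
```

Each gene falls into exactly one category, so these vectors could be passed as small, medium, and large in place of the NULL defaults.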

Details

Using the result of intSEQ, simulate two groups of synthetic RNA-seq counts whose means and dispersions equal the values estimated from the count data passed to intSEQ. Then compare the performance of four methods: limma voom, limma trans, intSEQ, and the quasi-likelihood method in edgeR. If null is TRUE, the ability of each method to control the false positive rate is examined. Otherwise, the four methods are compared by their power at the given p-value levels or FDR threshold.
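The simulation step can be pictured with base R alone. This is a minimal sketch, not the package's implementation: the baseline parameters are random stand-ins for estimates from intSEQ, sdd is treated as a standard deviation, and the negative binomial is parameterized as variance = mu + dispersion * mu^2. All of these are assumptions.

```r
set.seed(1)
ngene <- 200
nsamp <- 3

# Stand-in for `mle`: column 1 = baseline mean, column 2 = dispersion.
mle <- cbind(mean = rgamma(ngene, shape = 2, rate = 0.02),
             disp = runif(ngene, 0.05, 0.5))
lfc <- rnorm(ngene, mean = 0, sd = 1)   # hypothetical log2 fold changes
sdd <- 0.3631473

# One normalizing factor per sample, drawn as exp(N(0, sdd));
# reading sdd as a standard deviation is an assumption.
nf <- exp(rnorm(2 * nsamp, mean = 0, sd = sdd))

# Group 1 uses the baseline means; group 2 shifts them by 2^lfc.
mu <- cbind(matrix(mle[, 1], ngene, nsamp),
            matrix(mle[, 1] * 2^lfc, ngene, nsamp))
mu <- sweep(mu, 2, nf, `*`)

# rnbinom() takes size = 1 / dispersion; size recycles down each column,
# so every gene (row) keeps its own dispersion.
counts <- matrix(rnbinom(length(mu), mu = mu, size = 1 / mle[, 2]),
                 nrow = ngene)
dim(counts)
```

Setting lfc to zero for every gene would correspond to the null = TRUE case, where both groups share the same means.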

Value

null

Logical, indicating whether the simulation was conducted under the null condition.

When null is FALSE:

dissmall

The discovery rate table for small difference genes.

dismedium

The discovery rate table for medium difference genes.

dislarge

The discovery rate table for large difference genes.

disall

The discovery rate table for all genes.

disc

The discovery rate at the given FDR threshold.

levels

The p-value thresholds used, as a numeric vector.

plist

A list storing the p-values from the simulation.

fdr

A scalar giving the FDR threshold.

diff

A data frame of logical indicators assigning each gene to one of three categories: small, medium, and large.
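To make the discovery-rate entries concrete, the sketch below computes the share of genes called significant at each p-value threshold and at an FDR threshold. The p-values are synthetic, and the use of Benjamini-Hochberg adjustment via p.adjust is an assumption; the package may compute these quantities differently.

```r
set.seed(2)
levels   <- c(1e-6, 1e-5, 1e-4, 1e-3, 0.01, 0.05, 0.1)
fdrlevel <- 0.1

# Synthetic p-values for one method: 50 genes with signal, 150 null genes.
p <- c(rbeta(50, 0.1, 5), runif(150))

# Discovery rate at each threshold: fraction of genes with p <= threshold.
disc_by_level <- sapply(levels, function(a) mean(p <= a))
names(disc_by_level) <- levels

# Discovery rate at the FDR threshold, using BH-adjusted p-values
# (the choice of BH is an assumption of this sketch).
disc_fdr <- mean(p.adjust(p, method = "BH") <= fdrlevel)

print(disc_by_level)
print(disc_fdr)
```

Restricting p to the genes flagged in one column of diff would give the analogue of dissmall, dismedium, or dislarge.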

When null is TRUE:

table

The false positive rate (FPR) table at each threshold in levels for the four methods.

levels

The p-value thresholds used, as a numeric vector.

plist

A list storing the p-values from the simulation.
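An FPR table of this kind can be sketched from null p-values as follows. The layout assumed for plist (one genes-by-simulations matrix per method) and the method names are hypothetical; the actual structure returned by simuComp may differ.

```r
set.seed(3)
levels <- c(0.01, 0.05, 0.1)
ntime  <- 20
ngene  <- 500

# Hypothetical plist: for each method, a genes x simulations matrix of
# p-values generated with no true differences between the groups.
plist <- list(
  methodA = matrix(runif(ngene * ntime), ngene, ntime),          # calibrated
  methodB = matrix(rbeta(ngene * ntime, 1, 1.2), ngene, ntime)   # too liberal
)

# FPR per method and threshold: fraction of null p-values at or below it,
# pooled over all genes and simulations.
fpr_table <- t(sapply(plist, function(pm) {
  sapply(levels, function(a) mean(pm <= a))
}))
colnames(fpr_table) <- levels
print(fpr_table)
```

A well-calibrated method should show FPR values close to the thresholds themselves, while values clearly above them indicate inflated false positives.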

Author(s)

Yilun Zhang, David Rocke

References

our paper

Ritchie, M.E., Phipson, B., Wu, D., Hu, Y., Law, C.W., Shi, W., and Smyth, G.K. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research 43(7), e47.

Lund, S.P., Nettleton, D., McCarthy, D.J., et al. (2012). Detecting differential expression in RNA-sequence data using quasi-likelihood with shrunken dispersion estimates. Stat Appl Genet Mol Biol 11(5), 8.

See Also

edgeR.QL.running, limma.trans.running, intSEQ

Examples

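# Load the example data and keep the first 10 columns of the mont&pick counts
data(count.data)
data(condition)
count <- count.data[, 1:10]

# Group labels: five samples in group 0, five in group 1
cond <- rep(0:1, each = 5)

# Fit intSEQ, then run two simulation rounds (a small ntime keeps this quick)
res <- intSEQ(count, cond)
simu.res <- simuComp(res, ntime = 2)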

lunge111/intSEQ documentation built on May 20, 2019, 9:38 a.m.