SimSeq: Nonparametric Simulation of RNA-Seq Data

RNA sequencing analysis methods are often derived from hypothetical parametric models for read counts that are unlikely to hold exactly in practice. Methods are often tested by analyzing data that have been simulated according to the assumed model. This testing strategy can result in an overly optimistic view of the performance of an RNA-seq analysis method. We develop a data-based simulation algorithm for RNA-seq data. The vector of read counts simulated for a given experimental unit has a joint distribution that closely matches the distribution of a source RNA-seq dataset provided by the user. Users control the proportion of genes simulated to be differentially expressed (DE) and can provide a vector of weights to control the distribution of effect sizes. The algorithm requires a matrix of RNA-seq read counts with large sample sizes in at least two treatment groups. Many available datasets meet this requirement.
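
The idea of data-based simulation can be illustrated with a small sketch. The Python code below is not the package's implementation (SimSeq itself is written in R); it is a simplified illustration under stated assumptions. The function name `simulate_counts` and all of its parameters are hypothetical, the source data are assumed to have exactly two treatment groups labelled 0 and 1, and library-size normalization and other refinements a real tool would need are ignored. Whole sample columns are resampled without replacement, so each simulated experimental unit keeps the joint distribution of counts observed in a real unit. Genes chosen to be DE take their second-group counts from the other source group, and genes not chosen take both groups from the same source group.

```python
import random


def simulate_counts(counts, groups, n_per_group, prop_de, weights=None, seed=0):
    """Simplified data-based simulation sketch (hypothetical helper).

    counts      : list of gene rows, each a list of per-sample read counts
    groups      : list of 0/1 treatment labels, one per sample column
    n_per_group : number of simulated samples in each simulated group
    prop_de     : proportion of genes simulated as differentially expressed
    weights     : optional non-negative per-gene weights for choosing DE genes
    """
    rng = random.Random(seed)
    n_genes = len(counts)
    cols0 = [j for j, g in enumerate(groups) if g == 0]
    cols1 = [j for j, g in enumerate(groups) if g == 1]
    if len(cols0) < 2 * n_per_group or len(cols1) < n_per_group:
        raise ValueError("source dataset has too few samples per group")

    # Draw whole columns without replacement so that the counts of one
    # simulated unit always come from one real unit.
    picked0 = rng.sample(cols0, 2 * n_per_group)
    sim_a = picked0[:n_per_group]            # simulated group A (source group 0)
    null_b = picked0[n_per_group:]           # group B columns for non-DE genes
    de_b = rng.sample(cols1, n_per_group)    # group B columns for DE genes

    # Choose DE genes by weighted sampling without replacement.
    n_de = round(prop_de * n_genes)
    w = list(weights) if weights is not None else [1.0] * n_genes
    pool = list(range(n_genes))
    de_genes = set()
    for _ in range(n_de):
        pos = rng.choices(range(len(pool)), weights=[w[g] for g in pool])[0]
        de_genes.add(pool.pop(pos))

    # Assemble the simulated matrix gene by gene.
    sim = []
    for i, row in enumerate(counts):
        b_cols = de_b if i in de_genes else null_b
        sim.append([row[j] for j in sim_a] + [row[j] for j in b_cols])
    return sim, sorted(de_genes)


if __name__ == "__main__":
    # Toy source data: 4 genes, 6 samples in group 0 and 3 in group 1.
    toy_groups = [0] * 6 + [1] * 3
    toy_counts = [[10 * i + j for j in range(9)] for i in range(4)]
    sim, de = simulate_counts(toy_counts, toy_groups, n_per_group=3, prop_de=0.5)
    print(de)
    for row in sim:
        print(row)
```

Because this sketch draws non-DE genes for both simulated groups from a single source group, that group must supply at least twice as many samples as are simulated per group, which is one reason such an approach needs a source dataset with large sample sizes.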

Author: Samuel Benidt
Date of publication: 2015-11-23 12:33:58
Maintainer: Samuel Benidt <sgbenidt@gmail.com>
License: GPL (>= 2)
Version: 1.4.0

Man pages

CalcPvalWilcox: Calculate P-values of Differential Expression
kidney: Kidney Renal Clear Cell Carcinoma [KIRC] RNA-Seq data
SimData: SimData
SimSeq-package: Nonparametric Simulation of RNA-Seq Data
SortData: SortData

Files in this package

SimSeq
SimSeq/inst
SimSeq/inst/CITATION
SimSeq/NAMESPACE
SimSeq/demo
SimSeq/demo/SimSeq.R
SimSeq/demo/00Index
SimSeq/NEWS
SimSeq/data
SimSeq/data/datalist
SimSeq/data/kidney.rda
SimSeq/R
SimSeq/R/CalcPvalWilcox.R
SimSeq/R/SortData.R
SimSeq/R/SimData.R
SimSeq/MD5
SimSeq/DESCRIPTION
SimSeq/man
SimSeq/man/SimData.Rd
SimSeq/man/SimSeq-package.Rd
SimSeq/man/SortData.Rd
SimSeq/man/CalcPvalWilcox.Rd
SimSeq/man/kidney.Rd