simMAEcheck: Model checking for One Sample Problems.
In casper: Characterization of Alternative Splicing based on Paired-End Reads


Simulates RNA-seq data under the same experimental setting as the observed data, and compares the observed number of reads per gene with the simulations.

simMAEcheck(nsim, islandid, burnin=1000, pc, distr, readLength.pilot, eset.pilot, usePilot=FALSE, retTxsError=FALSE, genomeDB, mc.cores=1, mc.cores.int=1, verbose=FALSE)

`nsim`	Number of RNA-seq datasets to generate (often as few as `nsim=10` suffice)
`islandid`	When specified, simulations are run only for gene islands with identifiers in `islandid`. When not specified, genome-wide simulations are performed.
`burnin`	Number of MCMC burn-in samples (passed on to `calcExp`)
`pc`	Observed path counts in pilot data. When not specified, these are simulated from `eset.pilot`
`distr`	Estimated read start and insert size distributions in pilot data
`readLength.pilot`	Read length in pilot data
`eset.pilot`	ExpressionSet with pilot data expression in log2-RPKM, used to simulate `pc` when not specified by the user. See details
`usePilot`	By default `casper` assumes that the pilot data is from a related experiment rather than the current tissue of interest (`usePilot=FALSE`). Hence, the pilot data is used to simulate new RNA-seq data but not to estimate its expression. However, in some cases we may be interested in re-sequencing the pilot sample at greater depth, in which case one would want to combine the pilot data with the new data to obtain more precise estimates. This can be achieved by setting `usePilot=TRUE`
`retTxsError`	If `retTxsError=TRUE`, `simMAE` returns posterior expected MAE for each individual isoform. Otherwise the output is a `data.frame` with overall MAE across all isoforms. `retTxsError=TRUE` is not available when `eset.pilot` is specified instead of `pc`
`genomeDB`	`annotatedGenome` object, as returned by `procGenome`
`mc.cores`	Number of cores to use in the expression estimation step, passed on to `calcExp`
`mc.cores.int`	Number of cores to simulate RNA-seq datasets in parallel
`verbose`	Set `verbose=TRUE` to print progress information

simMAEcheck simulates nsim datasets under the same experimental setting as in the observed data. For more details, please check the documentation for simMAE, which is the basis of this function.

The output is a list with 2 entries. The first entry is a data.frame with overall MAE across all isoforms in the simulations (see simMAE for details). The second entry contains the expected number of genes for which the number of reads in the data lies in the range of the posterior predictive simulations (under the hypothesis that they have the same distribution) and the actual number of genes for which the condition is satisfied.
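The comparison reported in the second entry can be illustrated with a minimal base R sketch. It does not call `casper` and is not the package's implementation: the gene counts are simulated from a Poisson model chosen purely for illustration, all variable names are hypothetical, and the expected value uses the standard result that, for exchangeable continuous draws, an observation falls within the range of `nsim` simulations with probability `(nsim - 1) / (nsim + 1)`. For discrete read counts, ties make this an approximation.

```r
# Illustrative sketch only: toy data, not casper output.
set.seed(1)
n_genes <- 500   # hypothetical number of genes
nsim    <- 10    # number of simulated datasets

# Assumed toy model: gene-specific mean read counts
mu <- rgamma(n_genes, shape = 2, rate = 0.01)

# "Observed" reads per gene and nsim simulated datasets
# drawn from the same distribution (rows = genes, columns = simulations)
observed  <- rpois(n_genes, mu)
simulated <- matrix(rpois(n_genes * nsim, rep(mu, nsim)),
                    nrow = n_genes, ncol = nsim)

# For each gene, check whether the observed count lies within
# the range spanned by its simulated counts
sim_min  <- apply(simulated, 1, min)
sim_max  <- apply(simulated, 1, max)
in_range <- observed >= sim_min & observed <= sim_max

# Actual number of genes within range vs. the number expected
# under the hypothesis of a common distribution
actual   <- sum(in_range)
expected <- n_genes * (nsim - 1) / (nsim + 1)

c(expected = expected, actual = actual)
```

An actual count well below the expected count would suggest that the simulations do not reproduce the per-gene read counts seen in the data.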

Stephan-Otto Attolini C., Pena V., Rossell D. Bayesian designs for personalized alternative splicing RNA-seq studies (2014)

Li, W. and Freudenberg, J. and Miramontes, P. Diminishing return for increased Mappability with longer sequencing reads: implications of the k-mer distributions in the human genome. BMC Bioinformatics, 15, 2 (2014)

wrapKnown, simReads, calcExp