simulate_experiment_empirical: Simulate RNA-seq experiment based on abundances from a data...

Description Usage Arguments Value Examples

View source: R/simulate_experiment_empirical.R

Description

Create fasta files representing reads from a two-group experiment, where abundances and differential expression are estimated from a real data set

Usage

1
2
3
4
5
6
7
8
9
simulate_experiment_empirical(
  bg = NULL,
  fpkmMat = NULL,
  mean_rps = 5e+06,
  grouplabels = NULL,
  decut = 1.5,
  outdir = ".",
  ...
)

Arguments

bg

Ballgown object containing estimated transcript abundances in FPKM. Reads will be simulated for the same number of replicates that are in bg. Must provide exactly one of bg and fpkmMat.

fpkmMat

transcript-by-sample matrix containing abundances (in FPKM) estimated from a real data set. MUST have row names identifying transcripts. The number of columns is the number of samples that will be simulated.

mean_rps

Number of reads per sample to use in converting FPKM measurements to counts. Should be somewhat close to the number of reads per sample in the experiment that generated the estimated FPKMs. Defaults to 5 million (5e6).

grouplabels

vector indicating the group labels for each replicate in the experiment. Must be convertible to a factor with exactly 2 levels.

decut

A transcript will be recorded as truly differentially expressed if its fold change between the two groups is more extreme than decut, in either direction.

outdir

character, path to folder where simulated reads should be written, without a slash at the end of the folder name. By default, reads written to the working directory.

...

Additional arguments to pass to simulate_experiment_countmat

Value

No return, but reads are written to outdir.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
## Not run: 

  library(ballgown) 
  data(bg)
  bg = subset(bg, "chr=='22'")
  
  # load gtf file:
  gtfpath = system.file('extdata', 'bg.gtf.gz', package='polyester')
  gtf = subset(gffRead(gtfpath), seqname=='22')
 
  # load/download chromosome sequence (just for this example)
  system('wget https://www.dropbox.com/s/04i6msi9vu2snif/chr22seq.rda')
  load('chr22seq.rda')
  names(chr22seq) = '22'

  # simulate reads based on this experiment's FPKMs
  simulate_experiment_empirical(bg, grouplabels=pData(bg)$group, gtf=gtf, 
     seqpath=chr22seq, mean_rps=5000, outdir='simulated_reads_3', seed=1247)

## End(Not run)

alyssafrazee/polyester documentation built on Sept. 17, 2021, 8:54 a.m.