Create FASTA files containing RNA-seq reads simulated from provided transcripts, with optional differential expression between two groups.
fasta |
path to FASTA file containing transcripts from which to simulate reads. See details. |
gtf |
path to GTF file containing transcript structures from which reads should be simulated. See details. |
seqpath |
path to folder containing one FASTA file ( |
num_reps |
How many biological replicates should be in each group? If
|
fraglen |
Mean RNA fragment length. Sequences will be read off the end(s) of these fragments. |
fragsd |
Standard deviation of fragment lengths. |
readlen |
Read length. |
error_rate |
Sequencing error rate. Must be between 0 and 1. A uniform error model is assumed. |
paired |
If |
reads_per_transcript |
baseline mean number of reads to simulate
from each transcript. Can be an integer, in which case this many reads
are simulated from each transcript, or an integer vector whose length
matches the number of transcripts in |
fold_changes |
Vector of multiplicative fold changes between groups,
one entry per transcript in |
size |
the negative binomial |
outdir |
character, path to folder where simulated reads should be written, with *no* slash at the end. By default, reads are written to the current working directory. |
write_info |
If |
transcriptid |
optional vector of transcript IDs to be written into
|
seed |
Optional seed to set before simulating reads, for reproducibility. |
... |
additional arguments to pass to |
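As a rough illustration of how reads_per_transcript, fold_changes, size, and num_reps fit together, the sketch below draws per-replicate read counts from a negative binomial distribution in base R. It is not polyester's internal code. The number of transcripts, the choice of which group the fold change scales, and the single nb_size value are all assumptions made for this example.

```r
# Illustrative sketch only: base R, not polyester's internal code.
set.seed(1)

n_tx <- 5                            # hypothetical number of transcripts
num_reps <- 3                        # replicates per group
reads_per_transcript <- 300          # baseline mean reads per transcript
fold_changes <- c(0.5, 1, 1, 2, 1)   # one entry per transcript

# Assumption: group 1 keeps the baseline mean and group 2 is scaled by the
# fold change. Which group is scaled is a choice made for this sketch.
mean_group1 <- rep(reads_per_transcript, n_tx)
mean_group2 <- reads_per_transcript * fold_changes

# Assumption: one hypothetical size value shared by all transcripts.
# In R's parameterization the variance is mu + mu^2 / size.
nb_size <- 100

draw_counts <- function(means, reps, size) {
  # Rows are transcripts, columns are replicates.
  matrix(rnbinom(length(means) * reps, mu = rep(means, times = reps), size = size),
         nrow = length(means), ncol = reps)
}

counts <- cbind(draw_counts(mean_group1, num_reps, nb_size),
                draw_counts(mean_group2, num_reps, nb_size))
colnames(counts) <- paste0("group", rep(1:2, each = num_reps),
                           "_rep", rep(seq_len(num_reps), 2))
counts
```

Smaller values of nb_size give more replicate-to-replicate variation around the same mean, which is one way to picture the role of the size argument.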
Reads can be simulated either from a FASTA file of transcripts (provided with the fasta argument) or from a GTF file plus DNA sequences (provided with the gtf and seqpath arguments).
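The two input modes can be pictured with a small check like the one below. The helper choose_input_mode is hypothetical and is not part of polyester; whether the package itself rejects other argument combinations in this way is an assumption, and the file paths are placeholders.

```r
# Hypothetical helper for illustration; it is not part of polyester.
choose_input_mode <- function(fasta = NULL, gtf = NULL, seqpath = NULL) {
  # Mode 1: a FASTA file of transcripts on its own.
  if (!is.null(fasta) && is.null(gtf) && is.null(seqpath)) {
    return("fasta")
  }
  # Mode 2: a GTF file together with a folder of DNA sequences.
  if (is.null(fasta) && !is.null(gtf) && !is.null(seqpath)) {
    return("gtf")
  }
  stop("supply either 'fasta' alone, or both 'gtf' and 'seqpath'")
}

choose_input_mode(fasta = "transcripts.fa")                     # placeholder path
choose_input_mode(gtf = "genes.gtf", seqpath = "chromosomes")   # placeholder paths
```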
Simulating from a GTF file and DNA sequences may be a bit slower: it took
about 6 minutes to parse the GTF/sequence files for chromosomes 1-22, X,
and Y in hg19.
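To make the fraglen, fragsd, readlen, and error_rate arguments more concrete, here is a minimal base R sketch of drawing one fragment from a transcript, reading one end of it, and applying a uniform error model. It is a conceptual picture, not polyester's implementation. The random transcript, the normal fragment length distribution, the clamping of fragment lengths, and the uniform choice of fragment start are all assumptions of this example, and only a single-end read is shown.

```r
# Illustrative sketch only: base R, not polyester's internal code.
set.seed(2)
bases <- c("A", "C", "G", "T")

# Hypothetical 400-base transcript.
transcript <- sample(bases, 400, replace = TRUE)

fraglen <- 250      # mean fragment length
fragsd <- 25        # standard deviation of fragment lengths
readlen <- 100      # read length
error_rate <- 0.005 # per-base sequencing error probability

# Assumption: fragment lengths are normal, rounded, and clamped so that a
# fragment is at least one read long and no longer than the transcript.
frag_length <- round(rnorm(1, mean = fraglen, sd = fragsd))
frag_length <- min(max(frag_length, readlen), length(transcript))

# Assumption: the fragment start is uniform over all valid positions.
start <- sample.int(length(transcript) - frag_length + 1, 1)
fragment <- transcript[start:(start + frag_length - 1)]

# Read readlen bases off one end of the fragment.
read <- fragment[seq_len(readlen)]

# Uniform error model: every base is miscalled independently with
# probability error_rate and replaced by one of the three other bases.
add_errors <- function(read, error_rate) {
  hit <- which(runif(length(read)) < error_rate)
  for (i in hit) {
    read[i] <- sample(setdiff(bases, read[i]), 1)
  }
  read
}

noisy_read <- add_errors(read, error_rate)
paste(noisy_read, collapse = "")
```

With error_rate set to 0 the read comes back unchanged, and with error_rate set to 1 every base is replaced, which matches the requirement that the rate lie between 0 and 1.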
No return value; simulated reads and a simulation info file are written to outdir.
## simulate a few reads from chromosome 22
fastapath = system.file("extdata", "chr22.fa", package="polyester")
numtx = count_transcripts(fastapath)
set.seed(4)
fold_changes = sample(c(0.5, 1, 2), size=numtx,
    prob=c(0.05, 0.9, 0.05), replace=TRUE)
library(Biostrings)
# remove quotes from transcript IDs:
tNames = gsub("'", "", names(readDNAStringSet(fastapath)))
simulate_experiment(fastapath, reads_per_transcript=10,
    fold_changes=fold_changes, outdir='simulated_reads',
    transcriptid=tNames, seed=12)