View source: R/generate_fragments.R
Convert each sequence in a DNAStringSet to a "fragment" (subsequence)
generate_fragments(
  tObj,
  fraglen = 250,
  fragsd = 25,
  readlen = 100,
  distr = "normal",
  custdens = NULL,
  bias = "none",
  frag_GC_bias = "none"
)
tObj | DNAStringSet of sequences from which fragments should be extracted.
fraglen | Mean fragment length, if drawing fragment lengths from a normal distribution.
fragsd | Standard deviation of fragment lengths, if drawing lengths from a normal distribution. Note:
readlen | Read length. Default 100. Used only to label read positions.
distr | One of 'normal', 'empirical', or 'custom'. If 'normal', draw fragment lengths from a normal distribution with mean
custdens | If
bias | One of 'none', 'rnaf', or 'cdnaf' (default 'none'). 'none' represents uniform fragment selection (every possible fragment in a transcript has equal probability of being in the experiment); 'rnaf' represents positional bias that arises in protocols using RNA fragmentation, and 'cdnaf' represents positional bias arising in protocols that use cDNA fragmentation (Li and Jiang 2012). Using the 'rnaf' model, coverage is higher in the middle of the transcript and lower at both ends, and in the 'cdnaf' model, coverage increases toward the 3' end of the transcript. The probability models used come from Supplementary Figure S3 of Li and Jiang (2012).
frag_GC_bias | See explanation in
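To make the defaults concrete, the following base-R sketch mimics the behaviour described for distr = "normal" with bias = "none": a length is drawn from a normal distribution, and a start position is chosen uniformly among all positions where a fragment of that length fits. The helper name sketch_fragment and the clamping of the drawn length to the sequence length are assumptions made for illustration; this is not the package's actual implementation.

```r
# Hypothetical helper: illustrates the idea only, not the package internals.
sketch_fragment <- function(seq, fraglen = 250, fragsd = 25) {
  n <- nchar(seq)
  # Draw a fragment length from a normal distribution, rounded to whole bases
  len <- round(rnorm(1, mean = fraglen, sd = fragsd))
  # Assumption: clamp the length to the range [1, n]
  len <- max(1, min(len, n))
  # Uniform selection: every valid start position is equally likely
  start <- sample.int(n - len + 1, 1)
  substr(seq, start, start + len - 1)
}

set.seed(174)
# Toy 500-base "transcript" made of random nucleotides
transcript <- paste(sample(c("A", "C", "G", "T"), 500, replace = TRUE),
                    collapse = "")
frag <- sketch_fragment(transcript, fraglen = 50, fragsd = 5)
nchar(frag)
```

Because the length is clamped, a sequence shorter than the drawn length is returned whole in this sketch; the real function may treat that case differently.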
The empirical fragment length distribution was estimated using 7 randomly selected RNA-seq samples from the GEUVADIS dataset ('t Hoen et al. 2013), one sample from each laboratory that performed sequencing for that dataset. We used Picard's "CollectInsertSizeMetrics" (http://broadinstitute.github.io/picard/), version 1.121, to estimate the insert size distribution based on the read alignments.
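Drawing from an empirical distribution amounts to sampling lengths with probability proportional to how often each length was observed. A minimal sketch, assuming a hypothetical toy histogram of insert sizes (the values below are invented for illustration and are not the GEUVADIS estimates):

```r
# Hypothetical toy histogram: insert sizes and how often each was observed
insert_sizes <- c(150, 175, 200, 225, 250)
counts <- c(5, 20, 50, 20, 5)

set.seed(1)
# Sample fragment lengths with probability proportional to the observed counts
lens <- sample(insert_sizes, size = 1000, replace = TRUE,
               prob = counts / sum(counts))
table(lens)
```

A real insert-size histogram would have one entry per observed length rather than five bins, but the sampling step would look the same.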
DNAStringSet consisting of one randomly selected subsequence per element of tObj.
't Hoen PA, et al (2013): Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories. Nature Biotechnology 31(11): 1015-1022.
Li W and Jiang T (2012): Transcriptome assembly and isoform expression level estimation from biased RNA-Seq reads. Bioinformatics 28(22): 2914-2921.
library(Biostrings)
library(polyester)  # provides generate_fragments()
data(srPhiX174)
## get fragments with lengths drawn from normal distribution
set.seed(174)
srPhiX174_fragments = generate_fragments(srPhiX174, fraglen=15, fragsd=3,
    readlen=4)
srPhiX174_fragments
srPhiX174
## get fragments with lengths drawn from an empirical distribution
empirical_frags = generate_fragments(srPhiX174, distr='empirical')
empirical_frags
## get fragments with lengths from a normal distribution, but include
## positional bias from cDNA fragmentation:
biased_frags = generate_fragments(srPhiX174, bias='cdnaf')
biased_frags