Build fragment types from exons

Description

This function constructs a DataFrame of fragment features used for bias modeling, with one row for every potential fragment type that could arise from a transcript. The output is passed to fitBiasModels. The function is also called inside estimateAbundance to model the bias affecting different fragments across the isoforms of a gene.
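
As an illustration of what "one row for every potential fragment type" can mean, the following base R sketch enumerates every start/end pair whose length lies between minsize and maxsize within a transcript of a given length. The helper enumerate_fragments and its arguments are hypothetical and not part of the package; the real function works on exon ranges and a genome, and its internals may differ.

```r
# Hypothetical helper (not part of the package): list every candidate
# fragment in transcript coordinates, one row per (start, end) pair.
enumerate_fragments <- function(tx_length, minsize, maxsize) {
  lens <- minsize:maxsize
  lens <- lens[lens <= tx_length]  # fragments cannot exceed the transcript
  if (length(lens) == 0) {
    return(data.frame(start = integer(0), end = integer(0),
                      fraglen = integer(0)))
  }
  do.call(rbind, lapply(lens, function(len) {
    start <- seq_len(tx_length - len + 1)
    data.frame(start = start, end = start + len - 1L, fraglen = len)
  }))
}

# Toy transcript of 10 bp with fragment lengths 3 to 4
frags <- enumerate_fragments(tx_length = 10, minsize = 3, maxsize = 4)
head(frags)
```

The number of rows grows with both the transcript length and the width of the minsize to maxsize interval, which is one reason to keep that interval no wider than needed.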

Usage

buildFragtypes(exons, genome, readlength, minsize, maxsize, gc = TRUE,
  gc.str = TRUE, vlmm = TRUE)

Arguments

exons

a GRanges object with the exons for a single transcript

genome

a BSgenome object

readlength

the length of the reads. This does not have to be exact (+/- 1 bp is acceptable)

minsize

the minimum fragment length to model. The interval between minsize and maxsize should contain at least the central 95 percent of the fragment length distribution across samples

maxsize

the maximum fragment length to model

gc

logical, whether to calculate the fragment GC content

gc.str

logical, whether to look for presence of stretches of very high GC within fragments

vlmm

logical, whether to calculate the Cufflinks Variable Length Markov Model (VLMM) for read start bias
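
One way to pick minsize and maxsize so that they cover the central 95 percent of the fragment length distribution is to take the 2.5 and 97.5 percent quantiles of observed fragment lengths pooled across samples. The sketch below uses simulated lengths as a stand-in; the vector fraglens and the simulation parameters are assumptions for illustration only, and the vignette remains the reference for how to obtain real fragment lengths.

```r
set.seed(1)

# Simulated fragment lengths standing in for lengths observed in
# paired-end alignments, pooled across samples (illustrative values only)
fraglens <- round(rnorm(10000, mean = 150, sd = 13))

# Central 95 percent interval, rounded outward so coverage is not lost
bounds <- quantile(fraglens, probs = c(0.025, 0.975))
minsize <- floor(bounds[[1]])
maxsize <- ceiling(bounds[[2]])

# Fraction of fragments that fall inside the chosen interval
coverage <- mean(fraglens >= minsize & fraglens <= maxsize)
c(minsize = minsize, maxsize = maxsize, coverage = coverage)
```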

Value

a DataFrame with bias features (columns) for all potential fragments (rows)
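
To make the idea of a per-fragment bias feature concrete, here is a base R sketch of two GC-related quantities of the kind that the gc and gc.str arguments refer to: overall GC content, and whether any window inside the fragment is very GC-rich. The function names, the window width of 20 and the threshold of 0.8 are assumptions chosen for illustration; they are not taken from the package, whose column names and definitions may differ.

```r
# Hypothetical helper: fraction of G and C bases in a sequence string
gc_content <- function(seq) {
  bases <- strsplit(toupper(seq), "")[[1]]
  mean(bases %in% c("G", "C"))
}

# Hypothetical helper: TRUE if any window of `width` bases has a GC
# fraction of at least `threshold` (both defaults are illustrative)
has_gc_stretch <- function(seq, width = 20, threshold = 0.8) {
  bases <- strsplit(toupper(seq), "")[[1]]
  n <- length(bases)
  if (n < width) return(FALSE)
  # Sliding-window GC counts from differences of cumulative sums
  cs <- c(0, cumsum(bases %in% c("G", "C")))
  window_gc <- (cs[(width + 1):(n + 1)] - cs[1:(n - width + 1)]) / width
  any(window_gc >= threshold)
}

# Toy fragment: AT-rich flanks around a 20 bp GC-only core
frag <- paste0(strrep("AT", 10), strrep("GC", 10), strrep("AT", 10))
gc_content(frag)
has_gc_stretch(frag)
```

A fragment can have moderate overall GC content and still contain a GC-rich stretch, as the toy fragment does, which is why the two quantities are worth tracking separately.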

Examples

library(alpine) # assumed: the package that provides buildFragtypes and preprocessedData
library(GenomicRanges)
library(BSgenome.Hsapiens.NCBI.GRCh38)
data(preprocessedData)
readlength <- 100
minsize <- 125 # see the vignette for how to choose
maxsize <- 175 # see the vignette for how to choose
fragtypes <- buildFragtypes(ebt.fit[["ENST00000624447"]],
                            Hsapiens, readlength,
                            minsize, maxsize)