fitAlpineBiasModel: Fit fragment bias model with alpine
In csoneson/jcc: Calculate JCC Scores

fitAlpineBiasModel

R Documentation

Description

This function is a wrapper around several functions from the alpine package. Given a GTF file and a BAM file with reads aligned to the genome, it identifies single-isoform genes whose lengths and expression levels fall within the given ranges, and uses their observed read coverage to fit a fragment bias model.
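
The gene selection step described above can be sketched in base R. This is only an illustration on a made-up transcript table: the column names, the values, and the inclusive range bounds are assumptions, and the actual function derives this information from the GTF and BAM files.

```r
# Toy transcript table; all values and column names are hypothetical.
tx <- data.frame(
  tx_id   = c("tx1", "tx2", "tx3", "tx4", "tx5"),
  gene_id = c("g1", "g1", "g2", "g3", "g4"),
  length  = c(1200, 900, 3000, 400, 8000),
  count   = c(800, 300, 2500, 700, 1500),
  stringsAsFactors = FALSE
)

# Genes with exactly one annotated transcript (single-isoform genes).
txPerGene <- table(tx$gene_id)
singleIsoGenes <- names(txPerGene)[txPerGene == 1]

# Bounds mirroring the defaults in the Usage section.
minLength <- 600
maxLength <- 7000
minCount <- 500
maxCount <- 10000

# Keep single-isoform transcripts within both ranges
# (inclusive bounds are an assumption of this sketch).
keep <- tx$gene_id %in% singleIsoGenes &
  tx$length >= minLength & tx$length <= maxLength &
  tx$count >= minCount & tx$count <= maxCount
selected <- tx[keep, ]
```

In this toy table only `tx3` survives: `g1` has two isoforms, `tx4` is too short, and `tx5` is too long.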

Usage

fitAlpineBiasModel(gtf, bam, organism, genome, genomeVersion, version,
  minLength = 600, maxLength = 7000, minCount = 500, maxCount = 10000,
  subsample = TRUE, nbrSubsample = 200, seed = 1, minSize = NULL,
  maxSize = NULL, verbose = FALSE)

Arguments

`gtf`	Path to a GTF file with genomic features, preferably in Ensembl format.
`bam`	Path to a BAM file with read alignments to the genome.
`organism`	The organism (e.g., 'Homo_sapiens'). This argument will be passed to `ensembldb::ensDbFromGtf`.
`genome`	A `BSgenome` object.
`genomeVersion`	Genome version (e.g., 'GRCh38'). This argument will be passed to `ensembldb::ensDbFromGtf`.
`version`	The version of the reference annotation (e.g., 90). This argument will be passed to `ensembldb::ensDbFromGtf`.
`minLength`, `maxLength`	Minimum and maximum length of single-isoform genes used to fit fragment bias model.
`minCount`, `maxCount`	Minimum and maximum read coverage of single-isoform genes used to fit fragment bias model.
`subsample`	Whether to subsample the set of single-isoform genes satisfying the `minLength`, `maxLength`, `minCount` and `maxCount` criteria before fitting the fragment bias model.
`nbrSubsample`	If `subsample` is `TRUE`, the number of genes to subsample.
`seed`	If `subsample` is `TRUE`, the random seed to use to ensure reproducibility.
`minSize`, `maxSize`	Smallest and largest fragment size to consider. Either or both can be `NULL`, in which case the missing bound is estimated as the 2.5th (`minSize`) or 97.5th (`maxSize`) percentile of the fragment sizes estimated from the provided data.
`verbose`	Logical, whether to print progress messages.
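
The `NULL` fallback for `minSize`/`maxSize` and the role of `seed` in subsampling can be illustrated with a small base R sketch. The fragment sizes and gene identifiers below are simulated, and the code is an assumption about the general approach rather than the function's actual implementation.

```r
# Simulated fragment sizes; purely illustrative data.
set.seed(1)
fragSizes <- round(rnorm(1000, mean = 200, sd = 30))

minSize <- NULL
maxSize <- 220

# A NULL bound falls back to the corresponding percentile of the data;
# a bound supplied by the user is left untouched.
if (is.null(minSize)) minSize <- quantile(fragSizes, 0.025, names = FALSE)
if (is.null(maxSize)) maxSize <- quantile(fragSizes, 0.975, names = FALSE)

# Reproducible subsampling of candidate genes (hypothetical IDs).
candidates <- paste0("gene", 1:500)
nbrSubsample <- 200

set.seed(1)
subset1 <- sample(candidates, min(nbrSubsample, length(candidates)))
set.seed(1)
subset2 <- sample(candidates, min(nbrSubsample, length(candidates)))

# Setting the same seed before sampling yields the same subset.
identical(subset1, subset2)
```

Capping the sample size at the number of candidates is a choice made in this sketch so that `sample()` does not fail when fewer genes than `nbrSubsample` pass the filters.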

Value

A list with three elements:

`biasModel`	The fitted fragment bias model.
`exonsByTx`	A `GRangesList` object with exons grouped by transcript.
`transcripts`	A `GRanges` object with all the reference transcripts.
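
As a minimal sketch of how the returned list might be handled, the snippet below checks for the three documented elements and extracts them by name. The `biasMod` object here is a placeholder with empty elements, used only so that the snippet runs without the input data; a real result holds a fitted model, a `GRangesList` and a `GRanges` object.

```r
# Placeholder standing in for the return value of fitAlpineBiasModel().
biasMod <- list(biasModel = list(), exonsByTx = list(), transcripts = list())

# Verify that the three documented elements are present.
expected <- c("biasModel", "exonsByTx", "transcripts")
stopifnot(is.list(biasMod), all(expected %in% names(biasMod)))

# Extract the individual components by name.
fit <- biasMod$biasModel
exons <- biasMod$exonsByTx
txs <- biasMod$transcripts
```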

Author(s)

Charlotte Soneson, Michael I Love

References

Soneson C, Love MI, Patro R, Hussain S, Malhotra D, Robinson MD: A junction coverage compatibility score to quantify the reliability of transcript abundance estimates and annotation catalogs. bioRxiv doi:10.1101/378539 (2018).

Love MI, Hogenesch JB, Irizarry RA: Modeling of RNA-seq fragment sequence bias reduces systematic errors in transcript abundance estimation. Nature Biotechnology 34(12):1287-1291 (2016).

Examples

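## Not run: 
library(jcc)
## Assumption: the BSgenome package below provides the `Hsapiens` object
## passed as `genome`; substitute the BSgenome matching your annotation.
library(BSgenome.Hsapiens.NCBI.GRCh38)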
gtf <- system.file("extdata/Homo_sapiens.GRCh38.90.chr22.gtf.gz",
                   package = "jcc")
bam <- system.file("extdata/reads.chr22.bam", package = "jcc")
biasMod <- fitAlpineBiasModel(gtf = gtf, bam = bam,
                              organism = "Homo_sapiens",
                              genome = Hsapiens, genomeVersion = "GRCh38",
                              version = 90, minLength = 230,
                              maxLength = 7000, minCount = 10,
                              maxCount = 10000, subsample = TRUE,
                              nbrSubsample = 30, seed = 1, minSize = NULL,
                              maxSize = 220, verbose = TRUE)

## End(Not run)

csoneson/jcc documentation built on Dec. 25, 2024, 2:32 a.m.