bambu: long read isoform reconstruction and quantification



Description

This function takes BAM files of genomic alignments and performs isoform reconstruction as well as gene and transcript expression quantification. It can also save read class files for the alignments, extend the provided annotations, and quantify against the extended annotations. When multiple samples are provided, extended annotations are combined across samples so that results are comparable.

Usage

bambu(
    reads = NULL,
    rcFile = NULL,
    rcOutDir = NULL,
    annotations = NULL,
    genome = NULL,
    stranded = FALSE,
    ncore = 1,
    yieldSize = NULL,
    opt.discovery = NULL,
    opt.em = NULL,
    discovery = TRUE,
    verbose = FALSE
)

Arguments

reads

A string or a vector of strings specifying the paths of BAM files of genomic alignments, or a BamFile or BamFileList object (see Rsamtools).

rcFile

A string or a vector of strings specifying the read class files saved during a previous run of bambu.

rcOutDir

A string specifying the path where read class files will be saved.

annotations

A TxDb object or a GRangesList object obtained by prepareAnnotations.

genome

A FASTA file or a BSgenome object.

stranded

A logical value indicating strandedness, defaults to FALSE.

ncore

An integer specifying the number of cores used for parallel processing, defaults to 1.

yieldSize

The number of records read from a BAM file at a time; see yieldSize in Rsamtools.

opt.discovery

A list of parameters controlling the isoform reconstruction process:

  • prefix specifying the prefix for new gene IDs (genePrefix.number), defaults to empty

  • remove.subsetTx indicating whether to remove read classes that are a subset of known transcripts, defaults to TRUE

  • min.readCount specifying the minimum read count for a read class to be considered valid in a sample, defaults to 2

  • min.readFractionByGene specifying the minimum relative read count per gene; highly expressed genes can have many transcripts with a high read count but low relative abundance, which this threshold filters out, defaults to 0.05

  • min.sampleNumber specifying the minimum number of samples in which the minimum read count must be reached, defaults to 1

  • min.exonDistance specifying the minimum distance to a known transcript for a read class to be considered new, defaults to 35

  • min.exonOverlap specifying the minimum number of bases shared with the annotation for a read class to be assigned the same gene ID, defaults to 10 base pairs
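As a minimal sketch, the options listed above can be collected in a named list. The element names are taken from this page; the prefix string and the raised min.readCount are illustrative values only, and whether bambu accepts a partial list is not stated here, so every element is spelled out.

```r
# Discovery options, named as in the list above.
# Values that differ from the documented defaults are illustrative assumptions.
opt.discovery <- list(
    prefix = "novel",               # hypothetical prefix for new gene IDs
    remove.subsetTx = TRUE,         # documented default
    min.readCount = 5,              # stricter than the documented default of 2
    min.readFractionByGene = 0.05,  # documented default
    min.sampleNumber = 1,           # documented default
    min.exonDistance = 35,          # documented default
    min.exonOverlap = 10            # documented default
)

# Inspect the list before handing it to bambu
str(opt.discovery)
```

The list would then be supplied through the opt.discovery argument shown under Usage, together with reads, annotations and genome.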

opt.em

A list of parameters controlling the quantification (estimation) algorithm:

  • maxiter specifying the maximum number of iterations, defaults to 10000.

  • bias specifying whether to correct for bias, defaults to FALSE.

  • conv specifying the convergence threshold, defaults to 0.0001.
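One way to change a single quantification option while keeping the others explicit is to start from the documented defaults and override selected entries. This is a sketch using modifyList from the utils package; the name em.defaults and the overridden values are assumptions made for illustration, not part of bambu.

```r
# Defaults as documented above (em.defaults is a hypothetical helper name)
em.defaults <- list(maxiter = 10000, bias = FALSE, conv = 0.0001)

# Override only the entries that should differ; modifyList() keeps the rest
opt.em <- utils::modifyList(em.defaults, list(bias = TRUE, maxiter = 5000))

# Inspect the merged options
str(opt.em)
```

The merged list can be passed through the opt.em argument shown under Usage.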

discovery

A logical variable indicating whether annotations are to be extended for quantification.

verbose

A logical variable indicating whether processing messages will be printed.

Details

Main function

Value

A list of two SummarizedExperiment objects, one for transcript expression and one for gene expression.
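The names and order of the returned elements are not stated on this page, so the sketch below inspects a result without assuming them. describe_result is a hypothetical helper, not a bambu function; it relies only on the fact that a SummarizedExperiment reports features as rows and samples as columns, and it is demonstrated here on stand-in matrices rather than real output.

```r
# Hypothetical helper (not part of bambu): tabulate what a result contains.
describe_result <- function(res) {
    # Wrap a single object so that lists and single objects are handled alike
    if (!is.list(res)) res <- list(res)
    labels <- if (is.null(names(res))) as.character(seq_along(res)) else names(res)
    data.frame(
        element = labels,
        class = vapply(res, function(x) class(x)[1], character(1)),
        n_features = vapply(res, NROW, integer(1)),  # rows: transcripts or genes
        n_samples = vapply(res, NCOL, integer(1)),   # columns: samples
        row.names = NULL
    )
}

# Stand-in matrices: 3 features and 1 feature, each across 2 samples
describe_result(list(matrix(0, 3, 2), matrix(0, 1, 2)))
```

Applied to the object returned by bambu, the table should show which element holds transcript-level and which holds gene-level expression by its feature count.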

Examples

## =====================
test.bam <- system.file("extdata",
    "SGNex_A549_directRNA_replicate5_run1_chr9_1_1000000.bam",
    package = "bambu")
fa.file <- system.file("extdata", 
    "Homo_sapiens.GRCh38.dna_sm.primary_assembly_chr9_1_1000000.fa", 
    package = "bambu")
gr <- readRDS(system.file("extdata", 
    "annotationGranges_txdbGrch38_91_chr9_1_1000000.rds",
    package = "bambu"))
se <- bambu(reads = test.bam, annotations = gr, 
    genome = fa.file,  discovery = FALSE)

bambu documentation built on Nov. 12, 2020, 2:01 a.m.