Bambu

bambu: reference-guided transcript discovery and quantification for long read RNA-Seq data

bambu is an R package for multi-sample transcript discovery and quantification using long read RNA-Seq data. You can use bambu after read alignment to obtain expression estimates for known and novel transcripts and genes. The output from bambu can be used directly for visualisation and downstream analysis such as differential gene expression or transcript usage.

Installation

You can install bambu from GitHub:

if (!requireNamespace("devtools", quietly = TRUE))
    install.packages("devtools")
devtools::install_github("GoekeLab/bambu")
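
After installation, a quick check that the package can be loaded may save time before a long run. The sketch below uses only base R; the helper name `is_installed` is an illustrative assumption and not part of bambu.

```r
# Hypothetical helper (not part of bambu): TRUE if a package can be loaded.
is_installed <- function(pkg) {
    isTRUE(requireNamespace(pkg, quietly = TRUE))
}

if (is_installed("bambu")) {
    # Report the installed version before starting an analysis.
    message("bambu version: ", as.character(utils::packageVersion("bambu")))
} else {
    message("bambu is not installed; see the installation steps above.")
}
```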

General Usage

The default mode to run bambu uses a set of aligned reads (bam files), reference genome annotations (gtf file, TxDb object, or bambuAnnotations object), and a reference genome sequence (fasta file or BSgenome). bambu returns a SummarizedExperiment object with the genomic coordinates of annotated and new transcripts and transcript expression estimates.

We highly recommend using the same annotations that were used for genome alignment. If you have a gtf file and a fasta file, you can run bambu with the following options:

test.bam <- system.file("extdata", "SGNex_A549_directRNA_replicate5_run1_chr9_1_1000000.bam", package = "bambu")

fa.file <- system.file("extdata", "Homo_sapiens.GRCh38.dna_sm.primary_assembly_chr9_1_1000000.fa", package = "bambu")

gtf.file <- system.file("extdata", "Homo_sapiens.GRCh38.91_chr9_1_1000000.gtf", package = "bambu")

bambuAnnotations <- prepareAnnotations(gtf.file)

se <- bambu(reads = test.bam, annotations = bambuAnnotations, genome = fa.file)
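
Because the result is a SummarizedExperiment object, the standard accessors from the SummarizedExperiment package should apply to it. The following is a minimal sketch of a first look at the result; `describe_se` is a hypothetical helper and not a bambu function, and the assay names it reports depend on the installed bambu version.

```r
# Hypothetical helper (not part of bambu): summarise a SummarizedExperiment.
describe_se <- function(se) {
    list(
        assays     = SummarizedExperiment::assayNames(se),  # available matrices
        n_features = nrow(se),                              # transcripts (rows)
        n_samples  = ncol(se)                               # samples (columns)
    )
}

# Only runs when `se` from the call above exists in the session.
if (requireNamespace("SummarizedExperiment", quietly = TRUE) && exists("se")) {
    str(describe_se(se))
    # Genomic coordinates of annotated and new transcripts.
    print(SummarizedExperiment::rowRanges(se))
}
```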

Quantification of annotated transcripts and genes only (no transcript/gene discovery)

bambu(reads = test.bam, annotations = bambuAnnotations, genome = fa.file, discovery = FALSE)

Large sample number / limited memory

For larger sample numbers we recommend writing the processed data to a file:

bambu(reads = test.bam, rcOutDir = "./bambu/", annotations = bambuAnnotations, genome = fa.file)
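
For a multi-sample run, the BAM paths can be collected into a character vector first. This is a sketch under assumptions: the directory `./alignments` and the helper `find_bam_files` are illustrative and not part of bambu, and it assumes that `reads` accepts a vector of BAM file paths.

```r
# Hypothetical helper (not part of bambu): list BAM files in a directory,
# skipping index files such as *.bam.bai.
find_bam_files <- function(dir) {
    sort(list.files(dir, pattern = "\\.bam$", full.names = TRUE))
}

bam.dir <- "./alignments"   # assumed location of the aligned reads
rc.dir  <- "./bambu/"       # directory passed to rcOutDir

bam.files <- find_bam_files(bam.dir)

if (length(bam.files) > 0 && requireNamespace("bambu", quietly = TRUE)) {
    # Create the output directory up front so processed data can be written.
    dir.create(rc.dir, showWarnings = FALSE, recursive = TRUE)
    se <- bambu::bambu(reads = bam.files, rcOutDir = rc.dir,
                       annotations = bambuAnnotations, genome = fa.file)
}
```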

Use precalculated annotation objects

You can also use precalculated annotations.

If you plan to run bambu repeatedly, we recommend saving the bambuAnnotations object and reusing it.

The bambuAnnotations object can be calculated from a .gtf file:

annotations <- prepareAnnotations(gtf.file)

or from a TxDb object:

annotations <- prepareAnnotations(txdb)
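
One way to follow the recommendation to save the annotations object is base R's `saveRDS()` and `readRDS()`. The sketch below caches the object on first use; the helper `load_or_build` and the file name `bambuAnnotations.rds` are assumptions chosen for illustration and are not part of bambu.

```r
# Hypothetical helper (not part of bambu): load a cached object if present,
# otherwise build it once and save it for later runs.
load_or_build <- function(cache.file, build) {
    if (file.exists(cache.file)) {
        return(readRDS(cache.file))
    }
    obj <- build()
    saveRDS(obj, cache.file)
    obj
}

# Only runs when bambu is installed and gtf.file is defined as shown earlier.
if (requireNamespace("bambu", quietly = TRUE) && exists("gtf.file")) {
    annotations <- load_or_build(
        "bambuAnnotations.rds",   # assumed cache location
        function() bambu::prepareAnnotations(gtf.file)
    )
}
```

Delete the cache file whenever the underlying gtf file changes, since the helper does not detect stale caches.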

Advanced Options

More stringent filtering thresholds imposed on potential novel transcripts

bambu(reads, annotations, genome, opt.discovery = list(min.readCount = 5))
bambu(reads, annotations, genome, opt.discovery = list(min.sampleNumber = 5))
bambu(reads, annotations, genome, opt.discovery = list(min.readFractionByGene = 0.1))

Quantification without bias correction

By default, bias correction is applied to the expression estimates. You can disable it to run the quantification without bias correction:

bambu(reads, annotations, genome, opt.em = list(bias = FALSE))

Parallel computation

bambu allows parallel computation.

bambu(reads, annotations, genome, ncore = 8)
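
A fixed value such as 8 may exceed what the machine provides. As a hedged sketch, the core count can be capped using `parallel::detectCores()` from base R; the helper `choose_ncore` and its policy of leaving one core free are assumptions, not bambu behaviour.

```r
# Hypothetical helper (not part of bambu): cap the requested core count at the
# number of detected cores minus one, and never go below 1.
choose_ncore <- function(requested, available = parallel::detectCores()) {
    if (is.na(available)) available <- 1L   # detectCores() can return NA
    as.integer(max(1L, min(requested, available - 1L)))
}

ncore <- choose_ncore(8)
```

The resulting value can then be passed on, for example `bambu(reads, annotations, genome, ncore = ncore)`.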

See the manual for details on customizing other options.

Complementary functions

Transcript expression to gene expression

transcriptToGeneExpression(se)

Visualization

You can visualize the novel genes/transcripts using the plotBambu function:

plotBambu(se, type = "annotation", gene_id)
plotBambu(se, type = "annotation", transcript_id)

plotBambu(se, type = "heatmap") # heatmap
plotBambu(se, type = "pca") # PCA visualization

plotBambu(se, type = "heatmap", group.var) # heatmap
plotBambu(se, type = "pca", group.var) # PCA visualization

Write bambu outputs to files

writeBambuOutput(se, path = "./bambu/")

Release History

bambu version 0.3.0

Release date: 28th July 2020

bambu version 0.2.0

Release date: 18th June 2020

bambu version 0.1.0

Release date: 29th May 2020

Citation

A manuscript describing bambu is currently in preparation. If you use bambu for your research, please cite using the following doi: 10.5281/zenodo.3900025.

Contributors

This package is developed and maintained by Ying Chen, Yuk Kei Wan, and Jonathan Goeke at the Genome Institute of Singapore. If you want to contribute, please open an issue. Thank you.

bambu documentation built on Nov. 12, 2020, 2:01 a.m.