Overview

This package provides an interface to estimate transcript abundances for any sample aligned with Rail-RNA. The method uses non-negative least squares (NNLS) to infer the number of reads that originated from each transcript in the coding portion of the GENCODE v25 transcriptome. The model does not require raw aligned BAM files; it needs only compressed coverage statistics (genome coverage at base-pair resolution and coverage across annotated junctions), stored primarily in bigWig format.
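To make the NNLS idea concrete, here is a minimal sketch in base R. It is not the package's implementation: the design matrix, the feature and transcript names (exon1, txA, and so on), and the counts are all made up for illustration, and the non-negativity constraint is enforced with a general-purpose bounded optimizer (optim with method "L-BFGS-B") rather than a dedicated NNLS solver.

```r
# Toy design matrix (made-up values): rows are features, columns are transcripts.
# Entry [i, j] is the assumed probability that a read from transcript j
# is counted in feature i.
X <- matrix(
  c(0.5, 0.5, 0.0,   # hypothetical transcript txA
    0.6, 0.0, 0.4),  # hypothetical transcript txB
  nrow = 3,
  dimnames = list(c("exon1", "exon2", "junction1"), c("txA", "txB"))
)

# Simulated feature counts generated from 100 txA reads and 50 txB reads.
true_reads <- c(txA = 100, txB = 50)
y <- as.vector(X %*% true_reads)

# Residual sum of squares and its gradient with respect to b.
rss <- function(b) sum((y - X %*% b)^2)
rss_grad <- function(b) as.vector(-2 * t(X) %*% (y - X %*% b))

# Bound-constrained minimization: lower = 0 enforces non-negativity.
fit <- optim(
  par = c(0, 0), fn = rss, gr = rss_grad,
  method = "L-BFGS-B", lower = 0
)
est_reads <- setNames(fit$par, colnames(X))
print(round(est_reads, 2))
```

Because the simulated counts are noise-free, the estimates should land close to the simulated read numbers. With real coverage data the fit is only approximate, and the lower bound keeps any estimate from going negative.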

The more than 70,000 samples compiled by recount2 already have transcript expression pre-computed by this method, and these estimates are directly accessible. To replicate the abundance estimation for any SRA project in recount2, the user only needs to supply the SRA project ID. Otherwise, users can supply the necessary information, as outlined below, to carry out transcript abundance estimation with this package.

Accessing quantified estimates for samples in recount2

To access the quantified transcript abundances for projects currently curated in recount2, use:

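library(recountNNLS)

## Specify an SRA project and download its pre-computed transcript-level data
project = 'DRP000366'
path = getRseTx(project, download_path=file.path(getwd(), 'rse_tx.RData'))

## Restore the saved object(s) into the current workspace
load(path)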
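Since load() restores objects under whatever names they were saved with, it can help to check what was actually loaded. The sketch below uses only base R and a stand-in file, so it runs without a download. The object name placeholder_rse is hypothetical and is not the name used in the recount2 files.

```r
# Stand-in for the downloaded file: save a placeholder object to a temp path.
# 'placeholder_rse' is a hypothetical name, not the one used by recount2.
path <- tempfile(fileext = ".RData")
placeholder_rse <- list(note = "stand-in for the real object")
save(placeholder_rse, file = path)

# Load into a separate environment so nothing in the workspace is overwritten.
env <- new.env()
loaded_names <- load(path, envir = env)
print(loaded_names)

# Retrieve the first restored object by name.
obj <- get(loaded_names[1], envir = env)
```

The same pattern applies to the real file: load() returns the names of the restored objects, which can then be fetched with get().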

Quantifying samples on recount2

To re-run the NNLS model on samples in recount2, use:

library(recountNNLS)

## Specify an SRA project and download the relevant path data from recount2
project = 'SRP063581'
pheno = processPheno(project)

## Main NNLS workhorse function to create an RSE of transcript abundance
rse_tx = recountNNLS(pheno)

Data not yet part of recount2

Please see the package vignette: vignette('recountNNLS', package='recountNNLS').

Deliverable

The output of recountNNLS() is a RangedSummarizedExperiment where:



JMF47/recountNNLS documentation built on May 28, 2019, 12:42 p.m.