README.md

Stay proActiv!

proActiv: Estimation of Promoter Activity from RNA-Seq data

proActiv is an R package that estimates promoter activity from RNA-Seq data. proActiv uses aligned reads and genome annotations as input, and provides absolute and relative promoter activity as output. The package can be used to identify active promoters and alternative promoters. Details of the method are described in Demircioglu et al.

HTML documentation of proActiv, including a complete step-by-step workflow and a function manual, is available at https://goekelab.github.io/proActiv/.

Additional data on differential promoters in tissues and cancers from TCGA, ICGC, GTEx, and PCAWG is available at https://jglab.org/data-and-software/.

Installation

proActiv can be installed from GitHub with:

library("devtools")
devtools::install_github("GoekeLab/proActiv")

Quick Start

proActiv estimates promoter activity from RNA-Seq data. Promoter activity is defined as the total amount of transcription initiated at each promoter. As input, proActiv takes either BAM files or junction files (TopHat2 or STAR), together with a promoter annotation object for the relevant genome. An optional argument, condition, describes the experimental condition corresponding to each input file. Here we demonstrate proActiv with STAR junction files (human genome GRCh38, GENCODE v34) as input. These files are taken from the SGNEx project, restricted to the chr1:10,000,000-30,000,000 region, and can be found in 'extdata/vignette':

library(proActiv)

## List of STAR junction files as input
files <- list.files(system.file('extdata/vignette/junctions', 
                                package = 'proActiv'), full.names = TRUE)
## Vector describing experimental condition
condition <- rep(c('A549','HepG2'), each=3)
## Promoter annotation for human genome GENCODE v34
promoterAnnotation <- promoterAnnotation.gencode.v34.subset

result <- proActiv(files = files, 
                   promoterAnnotation = promoterAnnotation,
                   condition = condition)

result is a SummarizedExperiment object. Its contents can be retrieved with the standard SummarizedExperiment accessors, such as assays(), rowData(), and colData().
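
As a rough sketch of how the result might be inspected, the following repeats the Quick Start call and then uses generic accessors from the SummarizedExperiment package. Which assays and metadata columns are present depends on the installed proActiv version, so the sketch lists their names instead of assuming them.

```r
library(proActiv)
library(SummarizedExperiment)

## Recreate the Quick Start result from the bundled STAR junction files
files <- list.files(system.file('extdata/vignette/junctions',
                                package = 'proActiv'), full.names = TRUE)
condition <- rep(c('A549', 'HepG2'), each = 3)
result <- proActiv(files = files,
                   promoterAnnotation = promoterAnnotation.gencode.v34.subset,
                   condition = condition)

## Names of the matrices stored in the object (not assumed here)
assayNames(result)

## Per-promoter (row) and per-sample (column) metadata
head(rowData(result))
colData(result)

## Extract the first assay as a promoters-by-samples matrix
firstAssay <- assay(result, 1)
dim(firstAssay)
```

Listing assayNames(result) first avoids hard-coding an assay name that may differ between package versions.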

proActiv can also be run with BAM files as input, but an additional parameter genome must be supplied:

## From BAM files - genome parameter must be provided
files <- list.files(system.file('extdata/testdata/bam', package = 'proActiv'), full.names = TRUE)
result <- proActiv(files = files, 
                   promoterAnnotation = promoterAnnotation.gencode.v34.subset,
                   genome = 'hg38')

Creating a Promoter Annotation object

To quantify promoter activity, proActiv uses a set of promoters based on genome annotations. The preparePromoterAnnotation function creates a promoter annotation object for any genome from a GTF file or a TxDb object. Users can either pass the file path of a GTF/GFF or TxDb file, or supply a TxDb object directly. proActiv includes a pre-calculated promoter annotation for the human genome (GENCODE v34). Due to size constraints, however, this annotation is restricted to the chr1:10,000,000-30,000,000 region. Users can build a full annotation by downloading the GTF file from the GENCODE page and following the steps below.

Here, we demonstrate creating the subsetted promoter annotation for the Human genome (GENCODE v34) with both GTF and TxDb:

## From GTF file path
gtf.file <- system.file('extdata/vignette/annotation/gencode.v34.annotation.subset.gtf.gz', 
                        package = 'proActiv')
promoterAnnotation.gencode.v34.subset <- preparePromoterAnnotation(file = gtf.file,
                                                                   species = 'Homo_sapiens')
## From TxDb object (loadDb() is provided by the AnnotationDbi package)
library(AnnotationDbi)
txdb.file <- system.file('extdata/vignette/annotation/gencode.v34.annotation.subset.sqlite', 
                         package = 'proActiv')
txdb <- loadDb(txdb.file)
promoterAnnotation.gencode.v34.subset <- preparePromoterAnnotation(txdb = txdb, 
                                                                   species = 'Homo_sapiens')

The PromoterAnnotation object is an S4 object with 3 slots, which can be listed with slotNames().
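
A minimal sketch for exploring the annotation object, using only generic S4 tools from the methods package and the bundled promoterAnnotation.gencode.v34.subset data. It makes no assumption about the slot names and simply reports whatever the installed version defines.

```r
library(proActiv)
library(methods)

annotation <- promoterAnnotation.gencode.v34.subset

## List the slots defined on the object
slotNames(annotation)

## Report the class of the object stored in each slot
for (s in slotNames(annotation)) {
  cat(s, ":", class(slot(annotation, s))[1], "\n")
}
```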

Complete Analysis Workflow: Analyzing Alternative Promoters

Most human genes have multiple promoters that control the expression of distinct isoforms. The use of these alternative promoters enables isoform expression to be regulated pre-transcriptionally. Alternative promoters have been found to play an important role in a wide range of cell types and diseases. proActiv includes a workflow to identify and visualize alternative promoter usage between conditions. This workflow is described in detail in the HTML documentation at https://goekelab.github.io/proActiv/.
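
To make the idea of alternative promoter usage concrete, here is a toy sketch in base R. It is not proActiv's implementation. It assumes that relative promoter activity is a promoter's absolute activity divided by the summed activity of all promoters of the same gene, and the data frame, its column names, and all values are hypothetical.

```r
## Hypothetical absolute promoter activities (made-up values) for a single
## gene with two promoters, measured in two conditions
activity <- data.frame(
  geneId     = c("geneA", "geneA"),
  promoterId = c("P1", "P2"),
  A549       = c(8, 2),
  HepG2      = c(3, 9),
  stringsAsFactors = FALSE
)
conditions <- c("A549", "HepG2")

## Assumed definition: relative activity = promoter activity / gene total
relative <- activity
for (cond in conditions) {
  geneTotal <- ave(activity[[cond]], activity$geneId, FUN = sum)
  relative[[cond]] <- activity[[cond]] / geneTotal
}

## Promoter with the highest relative activity in each condition
major <- sapply(conditions, function(cond) {
  relative$promoterId[which.max(relative[[cond]])]
})

print(relative)
print(major)
```

In this made-up example the promoter with the highest relative activity differs between the two conditions, which is the kind of switch the alternative promoter workflow is meant to detect.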

Release History

Release 0.99.0

Release date: 21st August 2020

Changes in version 0.99.0:

Initial Release 0.1.0

Release date: 19th May 2020

This release corresponds to the proActiv version used by Demircioglu et al.

Limitations

proActiv does not provide promoter activity estimates for promoters that cannot be uniquely identified from splice junctions (promoters of single-exon transcripts and promoters that overlap with internal exons).

Reference

If you use proActiv, please cite:

Demircioğlu, Deniz, et al. “A Pan-cancer Transcriptome Analysis Reveals Pervasive Regulation through Alternative Promoters.” Cell 178.6 (2019): 1465-1477.

Contributors

proActiv is developed and maintained by Deniz Demircioglu, Joseph Lee, and Jonathan Göke.

proActiv documentation built on Nov. 8, 2020, 8:14 p.m.