encodeTP53: mapped RNA-seq data from ENCODE

Description Format References

Description

The data contains gene expression and transcript annotations in the region of the human TP53 gene (region (chr17:7,560,001-7,610,000 from the Human February 2009 (GRCh37/hg19) genome assembly), which is part of the long RNA-seq data generated by ENCODE/Cold Spring Harbor Lab, containing 2 cell types (GM12878 and K562) with 2 replicates each.

The alignment files were pulled from UCSC (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCshlLongRnaSeq/). And subsequently reads were counted in each non-overlapping 25bp window for the region (chr17:7,560,001-7,610,000). The example code to generate this count GRanges is available in the vignette.

The regional annotation of TP53 RNAs isoforms were derived from the ENCODE Gene Annotations (GENCODE), sub-setted to only isoforms of TP53 gene. http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeGencodeV4/wgEncodeGencodeManualV4.gtf.gz).

This dataset is used in the package vignette to illustrate a use case of transcript detection.

Format

Containing two GRanges objects, one for the sample count and one for the regional annotation of gene TP53

References

http://dx.doi.org/10.1371%2Fjournal.pbio.1001046 The ENCODE Project Consortium (2011) A User's Guide to the Encyclopedia of DNA Elements (ENCODE). PLoS Biol 9(4): e1001046. doi:10.1371/journal.pbio.1001046


biomvRCNS documentation built on Nov. 8, 2020, 6:49 p.m.