BiocStyle::markdown()
suppressPackageStartupMessages({ library(rtracklayer) library(BiocStyle) library(erma) library(GenomicFiles) library(ggplot2) library(grid) library(png) library(DT) })
The epigenomics road map describes locations of epigenetic marks in DNA from a variety of cell types. Of interest are locations of histone modifications, sites of DNA methylation, and regions of accessible chromatin.
This package presents a selection of elements of the road map including metadata and outputs of the ChromImpute procedure applied to ENCODE cell lines by Ernst and Kellis.
I have retrieved a Google Docs spreadsheet with comprehensive information. The mapmeta() function provides access to a local DataFrame image of the file as retrieved in mid April 2015. We provide a dynamic view of a selection of columns. Use the search box to filter records shown, for example \code{LUNG}.
library(DT) library(erma) meta = mapmeta() kpc = c("Comments", "Epigenome.ID..EID.", "Epigenome.Mnemonic", "Quality.Rating", "Standardized.Epigenome.name", "ANATOMY", "TYPE") datatable(as.data.frame(meta[,kpc]))
The chromatin states and standard colorings
used are enumerated in states_25
:
data(states_25) datatable(states_25)
The emission parameters of the 25 state model are depicted in the supplementary Figure 33 of Ernst and Kellis:
library(png) im = readPNG(system.file("pngs/emparms.png", package="erma")) grid.raster(im)
I have retrieved a modest number of roadmap bed files with ChromImpute
mnemonic labeling of chromatin by states. These can be
managed with an ErmaSet instance,
a trivial extension of r Biocpkg("GenomicFiles")
class.
The cellTypes
method yields a character vector. The colData
component has full metadata on the cell lines available.
ermaset = makeErmaSet() ermaset cellTypes(ermaset)[1:5] datatable(as.data.frame(colData(ermaset)[,kpc]))
We form a GRanges representing 50kb upstream of IL33.
uil33 = flank(resize(range(genemodel("IL33")), 1), width=50000) uil33
Bind this to the ErmaSet instance.
rowRanges(ermaset) = uil33 ermaset
Now query the files for cell-specific states in this interval.
library(BiocParallel) register(MulticoreParam(workers=2)) suppressWarnings({ csstates = lapply(reduceByFile(ermaset, MAP=function(range, file) { imp = import(file, which=range, genome=genome(range)[1]) seqlevels(imp) = seqlevels(range) imp$rgb = erma:::rgbByState(imp$name) imp }), "[[", 1) }) tys = cellTypes(ermaset) # need to label with cell types csstates = lapply(1:length(csstates), function(x) { csstates[[x]]$celltype = tys[x] csstates[[x]] }) csstates[1:2]
This sort of code underlies
the csProfile
utility to visualize variation in state assignments
in promoter regions for various genes.
csProfile(ermaset[,1:5], symbol="CD28", useShiny=FALSE)
Set useShiny
to TRUE
to permit interactive selection of
region to visualize.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.