heatmapData: Based on a list of GRanges, determine various kinds of counts...

View source: R/heatmapData.R

Description

Based on a list of GRanges, determine various kinds of counts before displaying a heatmap.

Usage

heatmapData(grl, refgr=grl[[1]], useScore=rep(FALSE, length(grl)), type,
            Nnorm=TRUE, Snorm=TRUE, txdb=NULL, nbins=5)

Arguments

grl

list; a list of GRanges or paths to BAM files

refgr

GRanges; the reference set of genomic regions

useScore

logical; an optional logical vector of the same length as grl

type

character; a vector of the same length as grl, each element being one of 'mcols', 'gr' or 'cov'

Nnorm

logical; whether to perform library size normalization; only applied to the items of grl whose type is 'cov'

Snorm

logical; whether to perform normalization based on the refgr widths; only applied to the items of grl whose type is 'cov'

nbins

numeric; the number of bins into which each range in refgr is divided

txdb

an object of class TxDb

Details

The function determines various kinds of counts for each object in grl within each range of refgr. It is typically used to prepare the input for the heatmapPlot method.

The kind of count is determined by the corresponding element of type. If type is mcols, the counts are expected to be pre-calculated and available in the mcols of the corresponding grl GRanges. If type is gr, the corresponding grl GRanges (gr) is considered and the counts are the number of occurrences of gr ranges within each range of refgr; if nbins is greater than 1 and type is gr, the counts are determined for each bin of each range of refgr. A score (the lower, the more significant) can be provided in the first column of the mcols of gr; the minimum score over the gr ranges associated with each refgr range is determined and stored in the corresponding column of the scoreMat output matrix.
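The counting and minimum-score rules for type gr can be illustrated with a small base R sketch. This is not the package's implementation: ranges are represented as plain start/end vectors on a single chromosome rather than GRanges, binning is ignored, and the names ref, gr, counts and minScore, together with all coordinates and scores, are hypothetical.

```r
# Illustrative only: plain start/end vectors stand in for GRanges
# (single chromosome, hypothetical coordinates and scores).
ref <- data.frame(start = c(100, 500, 900), end = c(300, 700, 1100))
gr  <- data.frame(start = c(150, 250, 650, 2000),
                  end   = c(200, 350, 660, 2100),
                  score = c(0.05, 0.01, 0.20, 0.001))

# ovl[i, j] is TRUE when gr range j overlaps ref range i
ovl <- outer(ref$start, gr$end, "<=") & outer(ref$end, gr$start, ">=")

# number of gr ranges overlapping each ref range
counts <- rowSums(ovl)

# minimum score among the overlapping gr ranges, 0 when there are none
minScore <- apply(ovl, 1, function(hit) {
  if (any(hit)) min(gr$score[hit]) else 0
})
```

Here the first reference range overlaps two gr ranges and takes the smaller of their scores, while the third overlaps none and therefore gets a count and a score of 0.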

If type is cov, the corresponding item of grl has to be a path to a BAM file, and the counts are the coverage within each range of refgr; if nbins is greater than 1 and type is cov, the counts are determined for each bin of each range of refgr. If Nnorm is TRUE and type is cov, the counts are divided by the number of mapped reads in the BAM file, expressed in millions. If Snorm is TRUE and type is cov, the counts are divided by the range width in bp.
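The two normalizations amount to simple arithmetic, sketched below in base R. The numbers and the variable names (rawCounts, rangeWidth, mappedReads) are hypothetical, and the order in which the package applies the two steps is an assumption; the result is the same either way because both are divisions.

```r
# Illustrative only: hypothetical coverage values for three refgr ranges
rawCounts   <- c(1200, 300, 4500)   # coverage value for each range
rangeWidth  <- c(2000, 1000, 3000)  # width of each range in bp
mappedReads <- 20e6                 # mapped reads in the BAM file (assumed)

# Nnorm: divide by the number of mapped reads expressed in millions
nnormCounts <- rawCounts / (mappedReads / 1e6)

# Snorm: divide by the range width in bp
normCounts <- nnormCounts / rangeWidth
```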

If a TxDb is provided, the presence of an intron or exon is registered for each range of refgr; introns are assigned the value 0.6 and exons 0.4, which the heatmapPlot function renders as red and pink, respectively. If nbins is greater than 1 and a TxDb is provided, the presence of an intron or exon is registered for each bin of each range of refgr.

Each BAM file must be accompanied by its index (.bai) file. Please refer to the samtools documentation on how to create one.

Value

A list of two items, matList and scoreMat, is returned.

matList: if a TxDb is not provided, matList is a list with one item per element of grl; each item is a matrix with one row per range in refgr and nbins columns. If a TxDb is provided, matList has length(grl) + 2 items; the two extra items contain the counts for introns and exons.

scoreMat: if useScore is all FALSE, scoreMat is NULL; otherwise it is a matrix with one row per range in refgr and one column per item of grl. The entry in row Nr and column Nc is the minimum score in mcols(grl[[Nc]])[,1] over the ranges overlapping refgr[Nr], or 0 if there are none.
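As a rough guide to the dimensions described above, the following base R mock builds a list with the same shape for the case without a TxDb. It does not call heatmapData; the object mock and the sizes nRanges and nSets are hypothetical, and the zeros are placeholders.

```r
# Illustrative mock of the documented return structure; the values are
# placeholders, not output of heatmapData.
nRanges <- 5   # number of ranges in refgr (hypothetical)
nbins   <- 5   # default value of nbins
nSets   <- 2   # length of grl (hypothetical)

mock <- list(
  matList  = replicate(nSets, matrix(0, nrow = nRanges, ncol = nbins),
                       simplify = FALSE),
  scoreMat = matrix(0, nrow = nRanges, ncol = nSets)
)

length(mock$matList)     # one matrix per item of grl
dim(mock$matList[[1]])   # ranges in refgr x nbins
dim(mock$scoreMat)       # ranges in refgr x length of grl
```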

References

http://genomics.iit.it/groups/computational-epigenomics.html

Examples

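library(compEpiTools)
library(TxDb.Mmusculus.UCSC.mm9.knownGene)
txdb <- TxDb.Mmusculus.UCSC.mm9.knownGene
# keep only the first sequence of the TxDb active
isActiveSeq(txdb) <- c(TRUE, rep(FALSE, length(isActiveSeq(txdb)) - 1))
TSSpos <- TSS(txdb)
# mock ChIP-seq ranges located upstream of the first five TSSs
gr <- TSSpos[1:5]
start(gr) <- start(gr) - 1000
end(gr) <- end(gr) - 600
# reference regions: the mock ranges extended by 1 kb on each side
extgr <- GRanges(seqnames(gr), ranges=IRanges(start(gr) - 1000, end(gr) + 1000))
# count the mock ranges in each bin of every reference region
data <- heatmapData(grl=list(ChIPseq= gr), refgr=extgr, type='gr')
# restore the original sequence levels of the TxDb
restoreSeqlevels(txdb)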

compEpiTools documentation built on Nov. 8, 2020, 5:32 p.m.