combinatorial_binding_matrix: Generate the combinatorial peak association matrix with...


View source: R/p04_combinatorial_binding_matrix.R

Description

This method reads the narrowPeak files for the given TFs and creates a merged list of peak regions (the master list). All original peaks are then searched for overlap against this master list, and a combinatorial dataframe showing the presence of each peak in the master peak list is returned. If a region has more than one overlapping peak, the best peak is selected using the peakPval column. This avoids exponential growth in the number of rows as the number of samples increases. Additionally, it extracts the sequence around the summit position.
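The merge, overlap and best-peak steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data, the column names `sample`, `start`, `end` and `regionId`, and the per-sample choice of the best peak are hypothetical, and peakPval is assumed to be a -log10 p-value (larger is more significant), as in the narrowPeak format.

```r
# Toy peaks from two samples on one chromosome (hypothetical values).
peaks <- data.frame(
  sample = c("tfA", "tfA", "tfA", "tfB"),
  peakId = c("a1", "a2", "a3", "b1"),
  start = c(100, 350, 900, 300),
  end = c(400, 500, 1200, 450),
  peakPval = c(12.3, 5.1, 6.7, 9.0),
  stringsAsFactors = FALSE
)

# Step 1: merge overlapping peaks into master regions.
# After sorting by start, a new region begins whenever a peak starts
# beyond the running maximum end of all previous peaks.
peaks <- peaks[order(peaks$start), ]
runningEnd <- cummax(peaks$end)
newRegion <- c(TRUE, peaks$start[-1] > head(runningEnd, -1))
peaks$regionId <- cumsum(newRegion)

master <- data.frame(
  regionId = sort(unique(peaks$regionId)),
  start = as.numeric(tapply(peaks$start, peaks$regionId, min)),
  end = as.numeric(tapply(peaks$end, peaks$regionId, max))
)

# Step 2: keep one peak per sample per region, the one with the highest
# peakPval (assumption: larger value means a more significant peak).
best <- do.call(rbind, lapply(
  split(peaks, list(peaks$sample, peaks$regionId), drop = TRUE),
  function(df) df[which.max(df$peakPval), ]
))

# Step 3: one row per master region, one peakId column per sample.
# NA marks a region with no peak from that sample.
for (s in unique(best$sample)) {
  sel <- best$sample == s
  idx <- match(master$regionId, best$regionId[sel])
  master[[paste0("peakId.", s)]] <- best$peakId[sel][idx]
}

print(master)
```

Keeping a single peak per sample per region means the result has exactly one row per master region. Without that step, a region overlapped by several peaks in several samples would need one row per combination of peaks.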

Usage

combinatorial_binding_matrix(
  sampleInfo,
  peakRegions = NULL,
  peakFormat = "narrowPeak",
  summitRegion = 0,
  peakCols = c("peakId", "peakEnrichment", "peakPval"),
  genome = NULL,
  summitSeqLen = 200
)

Arguments

sampleInfo

Sample information dataframe

peakRegions

Optional GRanges object containing the master peak regions. If not provided, peaks from the narrowPeak files are merged to create a new master peakset.

peakFormat

Format of the peak file. One of "narrowPeak", "broadPeak", "bed"

summitRegion

Region width around the peak summit to use while merging the peaks from multiple samples. As the number of peaksets increases, merging the peaks creates broader consensus peaks. Using a small region around the peak summit limits this consensus peak width. If 0 (default), the whole peak region is used. If summitRegion > 0, a region of 2 x summitRegion around the peak summit is used to create the consensus peakset.
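As a rough illustration of the summitRegion rule, the following base R sketch computes the interval that would enter the merge for each peak. It assumes that peakSummit is stored as an offset from peakStart, as in the narrowPeak format; the package may handle the summit position differently, and the toy values are hypothetical.

```r
# Toy peaks; peakSummit is assumed to be an offset from peakStart.
peaks <- data.frame(
  peakChr = c("chr1", "chr1"),
  peakStart = c(100, 900),
  peakEnd = c(400, 1300),
  peakSummit = c(150, 210)
)

summitRegion <- 50

# Absolute summit position on the chromosome.
summitPos <- peaks$peakStart + peaks$peakSummit

if (summitRegion > 0) {
  # Use summitRegion bases on each side of the summit.
  regionStart <- pmax(summitPos - summitRegion, 0)
  regionEnd <- summitPos + summitRegion
} else {
  # summitRegion = 0: use the whole peak region.
  regionStart <- peaks$peakStart
  regionEnd <- peaks$peakEnd
}

# Width of the interval used for merging.
regionWidth <- regionEnd - regionStart
print(data.frame(regionStart, regionEnd, regionWidth))
```

With summitRegion = 50, both toy peaks contribute a 100 bp interval to the merge, although the original peaks are 300 bp and 400 bp wide.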

peakCols

Columns to extract from the peak file. Column names should be from this list: c("peakChr", "peakStart", "peakEnd", "peakId", "peakScore", "peakStrand", "peakEnrichment", "peakPval", "peakQval", "peakSummit"). Default: c("peakId", "peakEnrichment", "peakPval")
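The ten names listed above line up with the ten fields of a narrowPeak record. The sketch below uses that correspondence to read a toy narrowPeak table with base R and keep only the requested columns. The mapping of names to fields is an assumption based on their order, and the records are hypothetical; the package's own reader may differ.

```r
# Column names from the peakCols documentation, assumed to follow
# narrowPeak field order (chrom, start, end, name, score, strand,
# signalValue, pValue, qValue, summit offset).
narrowPeakCols <- c(
  "peakChr", "peakStart", "peakEnd", "peakId", "peakScore",
  "peakStrand", "peakEnrichment", "peakPval", "peakQval", "peakSummit"
)

# Two toy tab-separated narrowPeak records (hypothetical values).
npText <- paste(
  "chr1\t100\t400\tpeak_1\t250\t.\t8.5\t12.3\t10.1\t150",
  "chr1\t900\t1300\tpeak_2\t120\t.\t4.2\t6.7\t5.0\t210",
  sep = "\n"
)

peaks <- read.table(
  text = npText, sep = "\t", header = FALSE,
  col.names = narrowPeakCols, stringsAsFactors = FALSE
)

# Keep only the columns named in a peakCols-style vector.
peakCols <- c("peakId", "peakEnrichment", "peakPval")
selected <- peaks[, peakCols]
print(selected)
```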

genome

Optional BSgenome object for extracting the summit sequence

summitSeqLen

Length of sequence to extract at summit position. Default: 200

Value

A dataframe of master peak regions (masterDf) generated by merging the peak regions from all samples. For each sample, its association with the regions in masterDf is reported.

Examples


lakhanp1/chipmine documentation built on March 6, 2021, 9:06 a.m.