combinatorial_binding_matrix: Generate the combinatorial peak association matrix with...
In lakhanp1/chipmine: Analyze ChIP-seq Data


View source: R/p04_combinatorial_binding_matrix.R

This method reads all the narrowPeak files for the given TFs and creates a merged (master) peak region list. All the original peaks are then searched for overlap against this master list, and a combinatorial dataframe showing the presence of each peak in the master peak list is returned. If more than one peak overlaps a region, the best peak is selected using the peakPval column. This is done to avoid exponential growth in the number of rows as the number of samples increases. Additionally, it extracts the sequence around the summit position.
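The merge-then-select logic described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy peak values, the single-chromosome simplification, the `overlap.<sample>` and `peakId.<sample>` column names, and the reading of `peakPval` as a -log10 p-value (larger is better) are all assumptions made for the example.

```r
# Toy peaks from two hypothetical samples on one chromosome (made-up values).
peaks <- data.frame(
  sample   = c("tfA", "tfA", "tfA", "tfB"),
  peakId   = c("a1", "a2", "a3", "b1"),
  start    = c(100, 180, 500, 150),
  end      = c(200, 260, 600, 240),
  peakPval = c(12.5, 30.1, 8.0, 20.0),  # assumed -log10(p-value): larger is better
  stringsAsFactors = FALSE
)

# Step 1: merge overlapping intervals into master regions.
# A new region starts when a peak begins after the running maximum end
# of all peaks seen so far (peaks sorted by start).
peaks <- peaks[order(peaks$start), ]
runningEnd <- cummax(peaks$end)
newRegion <- c(TRUE, peaks$start[-1] > runningEnd[-nrow(peaks)])
peaks$region <- cumsum(newRegion)

master <- data.frame(
  region = sort(unique(peaks$region)),
  start  = as.numeric(tapply(peaks$start, peaks$region, min)),
  end    = as.numeric(tapply(peaks$end, peaks$region, max))
)

# Step 2: within each sample and region, keep only the best peak,
# so every master region contributes exactly one row.
peaks <- peaks[order(peaks$sample, peaks$region, -peaks$peakPval), ]
best <- peaks[!duplicated(peaks[, c("sample", "region")]), ]

# Step 3: one row per master region, one set of columns per sample.
mat <- master
for (s in unique(best$sample)) {
  sBest <- best[best$sample == s, ]
  idx <- match(master$region, sBest$region)
  mat[[paste0("overlap.", s)]] <- !is.na(idx)
  mat[[paste0("peakId.", s)]] <- sBest$peakId[idx]
}

print(mat)
```

Here sample `tfA` has two peaks (`a1`, `a2`) in the first merged region, and only `a2` is kept because it has the larger `peakPval`. Without that selection, each extra overlapping peak would multiply the rows for that region across samples.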

combinatorial_binding_matrix(
  sampleInfo,
  peakRegions = NULL,
  peakFormat = "narrowPeak",
  summitRegion = 0,
  peakCols = c("peakId", "peakEnrichment", "peakPval"),
  genome = NULL,
  summitSeqLen = 200
)

`sampleInfo`	Sample information dataframe
`peakRegions`	Optional GRanges object containing the master peak regions. If not provided, peaks from the narrowPeak files are merged to create a new master peakset.
`peakFormat`	Format of the peak file. One of `"narrowPeak", "broadPeak", "bed"`
`summitRegion`	Region width around the peak summit to use while merging the peaks from multiple samples. As the number of peaksets increases, merging the peaks creates broader consensus peaksets. Using a small region around the peak summit limits this consensus peak width. If 0 (default), the whole peak region is used. If `summitRegion > 0`, a region of `2 x summitRegion` around the peak summit is used to create the consensus peakset.
`peakCols`	Columns to extract from the peak file. Column names should be from this list: `c("peakChr", "peakStart", "peakEnd", "peakId", "peakScore", "peakStrand", "peakEnrichment", "peakPval", "peakQval", "peakSummit")`. Default: `c("peakId", "peakEnrichment", "peakPval")`
`genome`	Optional BSgenome object for extracting the summit sequence
`summitSeqLen`	Length of sequence to extract at summit position. Default: 200
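
The effect of `summitRegion` on merging can be illustrated with a small base R sketch. It assumes that `peakSummit` is an absolute coordinate and that the window spans `summitRegion` bases on each side of the summit. Both are assumptions for this example, as are the toy peaks and the helper `count_merged()`, which is hypothetical and not part of chipmine.

```r
summitRegion <- 50

# Two hypothetical peaks whose full regions overlap (made-up values).
peaks <- data.frame(
  peakId     = c("p1", "p2"),
  peakStart  = c(1000, 1400),
  peakEnd    = c(1500, 2000),
  peakSummit = c(1200, 1450),  # assumed to be absolute positions
  stringsAsFactors = FALSE
)

# Window of 2 x summitRegion centred on each summit.
windows <- data.frame(
  peakId = peaks$peakId,
  start  = peaks$peakSummit - summitRegion,
  end    = peaks$peakSummit + summitRegion,
  stringsAsFactors = FALSE
)
windows$width <- windows$end - windows$start

# Hypothetical helper: number of regions left after merging overlapping intervals.
count_merged <- function(start, end) {
  o <- order(start)
  start <- start[o]
  end <- end[o]
  if (length(start) == 1) return(1L)
  runningEnd <- cummax(end)
  1L + sum(start[-1] > runningEnd[-length(end)])
}

nWhole  <- count_merged(peaks$peakStart, peaks$peakEnd)  # whole peaks merge into 1 region
nSummit <- count_merged(windows$start, windows$end)      # summit windows stay as 2 regions
```

With whole peaks, the two intervals collapse into a single 1000 bp consensus region. With summit windows, they remain two separate 100 bp regions, which is the narrowing the argument description refers to.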

A dataframe built on a master list of peak regions (masterDf), generated by merging the peak regions from all samples. For each sample, its association with the regions in masterDf is reported.