callPeaks: callPeaks

View source: R/callPeaks.R

callPeaks    R Documentation

callPeaks

Description

This is an R-based implementation of the general MACS2 strategy (Zhang et al., Genome Biology 2008), which deviates from the original in some respects. The function is still under development, especially with respect to single-end reads, where some optimization might still be needed. For paired-end reads, the results are nearly identical to those of MACS2, with two main differences: 1) the p-values are more conservative (and arguably better calibrated), and 2) because the implementation does not rely on sliding windows, the peaks are narrower with default settings.

Usage

callPeaks(
  bam,
  ctrl = NULL,
  paired = FALSE,
  type = c("narrow", "broad"),
  nullH = c("local", "global.nb", "global.bin"),
  blacklist = NULL,
  binSize = 10L,
  fragLength = 200L,
  minPeakCount = 5L,
  minFoldChange = 1.3,
  pthres = 10^-3,
  maxSize = NULL,
  bgWindow = c(1, 5, 10) * 1000,
  pseudoCount = 1L,
  useStrand = TRUE,
  outFormat = c("custom", "narrowPeak"),
  verbose = TRUE,
  ...
)

Arguments

bam

A signal BAM file

ctrl

An optional (but highly recommended) control BAM file

paired

Logical; whether the reads are paired-end

type

The type of peaks to identify ('narrow' or 'broad').

blacklist

An optional 'GRanges' of regions to be excluded (or the path to such a file). Since the blacklisted regions are removed from both the signal and control peaks, this also has an important impact on the empirical FDR (when 'ctrl' is given).

binSize

Bin size used to estimate the peak shift

fragLength

Fragment length. Ignored if 'paired=TRUE'. This is only used for the initial identification of candidate regions, and sizes are adjusted afterwards, so it doesn't need to be very precise.

minPeakCount

The minimum summit count for a region to be considered. Decreasing this can substantially increase the running time.

minFoldChange

The minimum fold-change for a region to be considered. Decreasing this can substantially increase the running time.

pthres

The p-value threshold to use

bgWindow

The windows to consider (in addition to the peak itself) for local background.

outFormat

The output format ('custom' or 'narrowPeak')

verbose

Logical; whether to output progress messages

...

Passed to bamChrChunkApply
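
As an illustration of the argument descriptions above, the following minimal sketch evaluates two of the defaults shown under Usage and builds a small 'GRanges' that could be supplied as 'blacklist'. The regions and chromosome names are made up for the example; in practice the sequence names would have to match those used in the BAM files. The 'GRanges' part only runs if the 'GenomicRanges' and 'IRanges' packages are installed.

```r
# Defaults exactly as written in the Usage section
bgWindow <- c(1, 5, 10) * 1000   # three window sizes: 1000, 5000 and 10000
pthres <- 10^-3                  # i.e. 0.001

# Hypothetical blacklist; coordinates and chromosome names are invented
blacklist <- NULL
if (requireNamespace("GenomicRanges", quietly = TRUE) &&
    requireNamespace("IRanges", quietly = TRUE)) {
  blacklist <- GenomicRanges::GRanges(
    seqnames = c("chr1", "chr2"),
    ranges = IRanges::IRanges(start = c(10000, 500000),
                              end = c(20000, 510000))
  )
}
```

The resulting object (or, according to the description of 'blacklist', the path to a file containing such regions) can then be passed as 'blacklist=blacklist'.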

Details

'callPeaks' takes about twice as long to run as MACS2, and uses more memory. If dealing with very large files (or a system with very little memory), consider increasing the number of processing chunks, for instance with 'nChunks=10'.

The function uses bamChrChunkApply to obtain the coverages, and accepts any argument of that function. This means that the 'mapqFilter' and BAM 'flgs' arguments can be used to restrict which reads are used.
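
A minimal sketch of such a call is given below. It assumes that the 'epiwraps' package is installed and uses hypothetical file names ('signal.bam', 'control.bam'); the 'mapqFilter' value of 20 is an arbitrary illustration rather than a recommendation. The call is only attempted when the package and both files are available.

```r
# Hypothetical file paths; replace with real BAM files
signal_bam <- "signal.bam"
control_bam <- "control.bam"

# Arguments collected in a list so they can be inspected before the call
peak_args <- list(
  bam = signal_bam,
  ctrl = control_bam,
  paired = TRUE,
  nChunks = 10,      # forwarded through '...' to bamChrChunkApply
  mapqFilter = 20    # forwarded through '...'; illustrative value only
)

peaks <- NULL
if (requireNamespace("epiwraps", quietly = TRUE) &&
    all(file.exists(c(signal_bam, control_bam)))) {
  peaks <- do.call(epiwraps::callPeaks, peak_args)
}
```

Increasing 'nChunks' is the adjustment suggested above for large files or limited memory.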

Value

A 'GRanges' object containing the called peaks.


ETHZ-INS/epiwraps documentation built on Oct. 27, 2024, 8:02 p.m.