gcapcPeaks: GC Effects Aware Peak Calling

Description

This function calls ChIP-seq peaks while accounting for potential GC effects. Enrichment scores are calculated on sliding windows within prefiltered large regions, with GC effects taken into account. Permutation analysis is then used to determine significant binding peaks.

Usage

gcapcPeaks(coverage, gcbias, bdwidth, flank = NULL, prefilter = 4L,
  permute = 5L, pv = 0.05, plot = FALSE, genome = "hg19",
  gctype = c("ladder", "tricube"))

Arguments

coverage

A list object returned by function read5endCoverage.

gcbias

A list object returned by function gcEffects.

bdwidth

A non-negative integer vector with two elements specifying the ChIP-seq binding width and the peak detection half window size. Usually generated by the function bindWidth. A poor estimate of bdwidth makes the downstream analysis meaningless. The values must be the same as those used when calculating gcbias.

flank

A non-negative integer specifying the flanking width of ChIP-seq binding. This parameter allows reads to appear in the flanking regions with a probability that decreases as the distance from the binding region increases. It helps define the effective GC content calculation. Default is NULL, which means this parameter is calculated from bdwidth. However, if a custom value is provided, this parameter is not recalculated; instead, the second element of bdwidth is recalculated based on flank. The value must be the same as the one used when calculating gcbias.

prefilter

A non-negative integer specifying the minimum number of reads required for a potential binding region. Regions whose total number of reads from the forward and reverse strands is greater than or equal to prefilter are selected for downstream analysis. Default is 4.

permute

A non-negative integer specifying the number of permutations to perform. Default is 5. When a whole large genome is used, such as the human genome, 5 permutations could be enough.

pv

A numeric value specifying the p-value cutoff for significant binding peaks. Default is 0.05.

plot

A logical vector which, when TRUE, returns density plots of real and permutation enrichment scores. Default is FALSE.

genome

A BSgenome object containing the sequences of the reference genome that was used to align the reads, or the name of this reference genome specified in a way that is accepted by the getBSgenome function defined in the BSgenome software package. In that case the corresponding BSgenome data package needs to be already installed (see ?getBSgenome in the BSgenome package for the details). The value needs to be the same as it is when calculating gcbias.

gctype

A character vector specifying the method used to calculate effective GC content. The default, ladder, is based on a uniform fragment distribution. A smoother method based on the tricube assumption is also available. However, tricube should not be used if the estimated peak half size is 3 or more times larger than the estimated binding width. The value must be the same as the one used when calculating gcbias.
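
The prefilter and gctype rules described above can be illustrated in plain R. This is only a sketch of the stated rules, not the package's internal code. The helper names passes_prefilter and tricube_allowed and the toy read counts are hypothetical and are not part of gcapc.

```r
# Hypothetical helper (not part of gcapc): TRUE for regions whose
# forward plus reverse read count is at least `prefilter`.
passes_prefilter <- function(forward, reverse, prefilter = 4L) {
  (forward + reverse) >= prefilter
}

# Hypothetical helper (not part of gcapc): bdwidth holds the binding
# width and the peak detection half window size, in that order.
# Returns FALSE when the half size is 3 or more times the binding width,
# which is the case where the documentation advises against tricube.
tricube_allowed <- function(bdwidth) {
  stopifnot(length(bdwidth) == 2L, all(bdwidth >= 0))
  bdwidth[2] < 3 * bdwidth[1]
}

# Toy read counts for four candidate regions (made-up numbers).
forward <- c(1L, 3L, 0L, 2L)
reverse <- c(2L, 1L, 1L, 5L)
keep <- passes_prefilter(forward, reverse)  # region totals are 3, 4, 1, 7

ok_narrow <- tricube_allowed(c(100L, 200L))  # half size is 2x binding width
ok_wide   <- tricube_allowed(c(50L, 150L))   # half size is 3x binding width
```

With the default prefilter of 4, only the second and fourth toy regions are kept, and tricube is ruled out for the second bdwidth pair.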

Value

A GRanges object of peaks with metadata columns:

es

Estimated enrichment score.

pv

p-value.
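
Because the result is a GRanges object with es and pv metadata columns, it can be subset and sorted like any other GRanges. The sketch below uses a small hand-built object with made-up coordinates and scores as a stand-in for real gcapcPeaks output, and assumes the GenomicRanges package is installed.

```r
library(GenomicRanges)

# Toy stand-in for a gcapcPeaks() result; all values are invented.
peaks <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(100, 500, 900), width = 200),
  es       = c(12.4, 3.1, 8.7),    # enrichment scores
  pv       = c(0.001, 0.04, 0.01)  # p-values
)

# Keep peaks under a stricter p-value cutoff than the one used for calling.
strict <- peaks[peaks$pv < 0.02]

# Rank peaks from highest to lowest enrichment score.
ranked <- peaks[order(peaks$es, decreasing = TRUE)]
```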

Examples

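# Example BAM file shipped with the gcapc package
bam <- system.file("extdata", "chipseq.bam", package="gcapc")
# Read coverage, as required by the coverage argument
cov <- read5endCoverage(bam)
# Estimate binding width and peak detection half window size
bdw <- bindWidth(cov)
# Estimate GC effects from the coverage and binding width
gcb <- gcEffects(cov, bdw, sampling = c(0.15,1))
# Call peaks using the same coverage, GC effects and binding width
gcapcPeaks(cov, gcb, bdw)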
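
Several arguments (bdwidth, flank, genome, gctype) must match the values used when calculating gcbias. One way to keep them in sync is to define them once and pass the same variables to both calls, as in the sketch below. It assumes that gcEffects accepts genome and gctype arguments, which is inferred from the argument descriptions above and not shown on this page, and that the BSgenome data package for hg19 is installed.

```r
library(gcapc)

bam <- system.file("extdata", "chipseq.bam", package = "gcapc")
cov <- read5endCoverage(bam)
bdw <- bindWidth(cov)

# Settings shared by GC effects estimation and peak calling.
genome <- "hg19"
gctype <- "ladder"

# Assumption: gcEffects() takes `genome` and `gctype` like gcapcPeaks() does.
gcb <- gcEffects(cov, bdw, sampling = c(0.15, 1),
                 genome = genome, gctype = gctype)

# Reuse the same bdw, genome and gctype so peak calling matches gcb.
peaks <- gcapcPeaks(cov, gcb, bdw, permute = 5L, pv = 0.05,
                    genome = genome, gctype = gctype)
```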

gcapc documentation built on Nov. 8, 2020, 8:14 p.m.