refinePeaks: Refine Peaks with GC Effects
In gcapc: GC Aware Peak Caller



This function refines the ranks (i.e. significance or p-values) of pre-determined peaks by accounting for potential GC effects. These peaks can be obtained from other peak callers, e.g. MACS or SPP.

refinePeaks(coverage, gcbias, bdwidth, peaks, flank = NULL, permute = 5L,
  genome = "hg19", gctype = c("ladder", "tricube"))

`coverage`	A list object returned by function `read5endCoverage`.
`gcbias`	A list object returned by function `gcEffects`.
`bdwidth`	A non-negative integer vector with two elements specifying the ChIP-seq binding width and the peak detection half window size. Usually generated by the function `bindWidth`. A poor estimate of `bdwidth` makes the downstream analysis meaningless. The values need to be the same as those used when calculating `gcbias`.
`peaks`	A GRanges object specifying the peaks to be refined. A flexible set of peaks is preferred to reduce potential false negatives, meaning that both significant (e.g. p<=0.05) and non-significant (e.g. p>0.05) peaks should be included. If the total number of peaks is not too large, a reasonable set includes all peaks with a p-value/FDR less than 0.99 from other peak callers.
`flank`	A non-negative integer specifying the flanking width of ChIP-seq binding. This parameter allows reads to appear in the flanking regions with a probability that decreases as the distance from the binding region increases. It helps to define the effective GC content calculation. Default is NULL, which means this parameter will be calculated from `bdwidth`. However, if a custom value is provided, this parameter is not recalculated; instead, the second element of `bdwidth` is recalculated based on `flank`. The value needs to be the same as the one used when calculating `gcbias`.
`permute`	A non-negative integer specifying the number of permutations to be performed. Default is 5. When a whole large genome is used, such as the human genome, 5 permutations could be enough.
`genome`	A BSgenome object containing the sequences of the reference genome that was used to align the reads, or the name of this reference genome specified in a way that is accepted by the `getBSgenome` function defined in the BSgenome software package. In that case the corresponding BSgenome data package needs to be already installed (see `?getBSgenome` in the BSgenome package for the details). The value needs to be the same as the one used when calculating `gcbias`.
`gctype`	A character vector specifying the method used to calculate effective GC content. The default, `ladder`, is based on a uniform fragment distribution. A smoother method based on the tricube assumption is also available. However, tricube should not be used if the estimated peak half size is at least 3 times the estimated binding width. The value needs to be the same as the one used when calculating `gcbias`.
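
Two of the argument notes above can be illustrated with a short base R sketch: keeping a permissive candidate set (p-value below 0.99) from another peak caller, and checking whether `tricube` is advisable for a given `bdwidth`. The data frame, its column names, and the helpers `select_candidates` and `tricube_allowed` are hypothetical and not part of gcapc; the numbers are made up, and the 3x rule is one reading of the `gctype` note.

```r
# Hypothetical table of peaks exported from another caller (made-up values).
other_peaks <- data.frame(
  chr    = c("chr1", "chr1", "chr2", "chr2"),
  start  = c(1000L, 5000L, 2000L, 9000L),
  end    = c(1200L, 5300L, 2250L, 9150L),
  pvalue = c(1e-8, 0.03, 0.40, 0.995),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep a permissive set, significant or not,
# dropping only peaks at or above the cutoff.
select_candidates <- function(df, cutoff = 0.99) {
  df[df$pvalue < cutoff, , drop = FALSE]
}

# Hypothetical helper: TRUE when the half window size (2nd element) is
# less than 3 times the binding width (1st element).
tricube_allowed <- function(bdwidth) {
  stopifnot(length(bdwidth) == 2L, all(bdwidth >= 0))
  bdwidth[2] < 3 * bdwidth[1]
}

candidates <- select_candidates(other_peaks)
ok_tricube <- tricube_allowed(c(150L, 300L))  # illustrative widths only

# Convert to a GRanges object only if GenomicRanges is installed.
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  peaks <- GenomicRanges::makeGRangesFromDataFrame(
    candidates, keep.extra.columns = TRUE
  )
}
```

Whatever `gctype` is chosen, it still has to match the value used when `gcbias` was calculated, so a check like this belongs before the call to `gcEffects` rather than before `refinePeaks`.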

A GRanges object identical to `peaks`, with two additional metadata columns:

`newes`	Refined enrichment scores.
`newpv`	Refined p-values.
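
Since the purpose of refinement is to change peak ranks, a natural follow-up is to reorder peaks by `newpv`. The sketch below uses a plain data frame as a stand-in for the metadata columns of the returned object; the `peak` and `pv` columns and all values are hypothetical.

```r
# Stand-in for the metadata of a refined peak set (made-up values).
refined <- data.frame(
  peak  = c("p1", "p2", "p3", "p4"),
  pv    = c(1e-6, 1e-4, 0.02, 0.30),   # hypothetical original p-values
  newes = c(3.1, 8.4, 0.9, 5.2),       # refined enrichment scores
  newpv = c(1e-3, 1e-9, 0.50, 1e-5),   # refined p-values
  stringsAsFactors = FALSE
)

# Rank before and after refinement (1 = most significant).
refined$old_rank <- rank(refined$pv, ties.method = "min")
refined$new_rank <- rank(refined$newpv, ties.method = "min")

# Reorder peaks by refined significance.
reordered <- refined[order(refined$newpv), ]
print(reordered[, c("peak", "old_rank", "new_rank")])
```

With an actual GRanges result `res`, the same ordering would presumably be written as `res[order(res$newpv)]`, since GRanges metadata columns can be accessed with `$`.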

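library(gcapc)
# Example BAM file shipped with the package
bam <- system.file("extdata", "chipseq.bam", package="gcapc")
cov <- read5endCoverage(bam)                       # read coverage
bdw <- bindWidth(cov)                              # binding width and half window size
gcb <- gcEffects(cov, bdw, sampling = c(0.15,1))   # GC effects estimation
peaks <- gcapcPeaks(cov, gcb, bdw)                 # peaks to be refined
refinePeaks(cov, gcb, bdw, peaks)                  # adds newes and newpv columns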