refinePeaks: Refine Peaks with GC Effects


View source: R/refinePeaks.r

Description

This function refines the ranks (i.e. significance or p-values) of predetermined peaks by accounting for potential GC effects. The peaks can be obtained from other peak callers, e.g. MACS or SPP.

Usage

refinePeaks(coverage, gcbias, bdwidth, peaks, flank = NULL, permute = 5L,
  genome = "hg19", gctype = c("ladder", "tricube"))

Arguments

coverage

A list object returned by function read5endCoverage.

gcbias

A list object returned by function gcEffects.

bdwidth

A non-negative integer vector with two elements specifying the ChIP-seq binding width and the peak detection half window size. Usually generated by function bindWidth. A poor estimate of bdwidth makes the downstream analysis meaningless. The values must be the same as those used when calculating gcbias.

peaks

A GRanges object specifying the peaks to be refined. A permissive set of peaks is preferred to reduce potential false negatives, meaning that both significant (e.g. p<=0.05) and non-significant (e.g. p>0.05) peaks should be included. If the total number of peaks is not too large, a reasonable set includes all peaks that other peak callers report with a p-value/FDR less than 0.99.
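
As a rough sketch of the selection described above, assume the external peak caller's output has been read into a data frame with hypothetical columns chr, start, end and pvalue. The values below are made up for illustration, and the conversion to GRanges runs only if the GenomicRanges package is installed.

```r
# Hypothetical peak table from an external caller; values are illustrative only.
called <- data.frame(
  chr    = c("chr1", "chr1", "chr2", "chr2"),
  start  = c(1000L, 5000L, 2000L, 9000L),
  end    = c(1400L, 5300L, 2600L, 9200L),
  pvalue = c(0.001, 0.20, 0.995, 0.60)
)

# Keep a permissive set: drop only peaks with p-value >= 0.99.
keep <- called[called$pvalue < 0.99, ]

# Convert to GRanges for refinePeaks(); skipped if GenomicRanges is unavailable.
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  peaks <- GenomicRanges::makeGRangesFromDataFrame(
    keep,
    seqnames.field = "chr",
    start.field = "start",
    end.field = "end",
    keep.extra.columns = TRUE  # carry the p-value along as a metadata column
  )
}
```

Both the significant peak (p = 0.001) and the non-significant ones (p = 0.20, 0.60) are retained; only the peak at p = 0.995 is dropped.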

flank

A non-negative integer specifying the flanking width of ChIP-seq binding. This parameter allows reads to appear in the flanking regions with a probability that decreases as the distance from the binding region increases. It helps define the effective GC content calculation. Default is NULL, which means this parameter is calculated from bdwidth. If a custom value is provided, it is not recalculated; instead, the second element of bdwidth is recalculated based on flank. The value must be the same as that used when calculating gcbias.

permute

A non-negative integer specifying the number of permutations to perform. Default is 5. When a whole large genome is used, such as the human genome, 5 permutations could be enough.

genome

A BSgenome object containing the sequences of the reference genome that was used to align the reads, or the name of this reference genome specified in a way that is accepted by the getBSgenome function defined in the BSgenome software package. In that case the corresponding BSgenome data package needs to be already installed (see ?getBSgenome in the BSgenome package for the details). The value needs to be the same as it is when calculating gcbias.

gctype

A character vector specifying the method used to calculate effective GC content. The default, ladder, is based on a uniform fragment distribution. A smoother method based on the tricube assumption is also allowed. However, tricube should not be used if the estimated peak half size is 3 or more times larger than the estimated binding width. The value must be the same as that used when calculating gcbias.
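
The tricube restriction above can be checked before calling the function. The helper below is a minimal sketch, not part of gcapc: chooseGctype is a hypothetical name, and it assumes that "3 or more times larger" means bdwidth[2] >= 3 * bdwidth[1], with bdwidth[1] the binding width and bdwidth[2] the peak detection half window size.

```r
# Hypothetical helper (not part of gcapc): pick a gctype given bdwidth.
# bdwidth[1] = binding width, bdwidth[2] = peak detection half window size.
# Assumption: tricube is ruled out when bdwidth[2] >= 3 * bdwidth[1].
chooseGctype <- function(bdwidth, prefer = c("ladder", "tricube")) {
  prefer <- match.arg(prefer)
  stopifnot(is.numeric(bdwidth), length(bdwidth) == 2L, all(bdwidth >= 0))
  if (prefer == "tricube" && bdwidth[2] >= 3 * bdwidth[1]) {
    return("ladder")  # fall back when the half window is too wide for tricube
  }
  prefer
}

chooseGctype(c(100, 350), prefer = "tricube")  # falls back to "ladder"
chooseGctype(c(100, 200), prefer = "tricube")  # keeps "tricube"
```

Whatever value is chosen should then be passed unchanged to every function that accepts gctype, since it must match the value used when calculating gcbias.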

Value

A GRanges object the same as peaks, with two additional metadata columns:

newes

Refined enrichment scores.

newpv

Refined p-values.
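
To see how refinement changes peak ranks, the metadata columns can be compared before and after. The sketch below uses a plain data frame as a stand-in for the metadata of the returned GRanges; the column name pv for the original p-values is an assumption, and all values are made up for illustration.

```r
# Stand-in for the metadata columns of the refined peaks; values are made up.
# "pv" is a hypothetical column holding the original caller's p-values.
res <- data.frame(
  pv    = c(0.001, 0.010, 0.040, 0.200),
  newes = c(3.1, 8.4, 1.2, 6.0),
  newpv = c(0.020, 0.0005, 0.300, 0.004)
)

# Rank peaks by significance before and after refinement (1 = most significant).
res$rank_before <- rank(res$pv, ties.method = "first")
res$rank_after  <- rank(res$newpv, ties.method = "first")

# Show peaks ordered by refined p-value.
res[order(res$newpv), ]
```

With a real result, the same columns could be obtained from the GRanges metadata, for example by converting the returned object to a data frame first.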

Examples

bam <- system.file("extdata", "chipseq.bam", package="gcapc")
cov <- read5endCoverage(bam)
bdw <- bindWidth(cov)
gcb <- gcEffects(cov, bdw, sampling = c(0.15,1))
peaks <- gcapcPeaks(cov, gcb, bdw)
refinePeaks(cov, gcb, bdw, peaks)
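
Because genome and gctype must match the values used when calculating gcbias, one option is to keep them in a single list and pass them explicitly to each step. This is only a sketch: the list name shared is hypothetical, and it assumes that gcEffects and gcapcPeaks accept genome and gctype arguments in the same way refinePeaks does. The pipeline runs only if gcapc and the hg19 BSgenome data package are installed.

```r
# Settings that must agree across the pipeline ("shared" is a hypothetical name).
shared <- list(genome = "hg19", gctype = "ladder")

# Runs only if gcapc and the hg19 BSgenome data package are installed.
if (requireNamespace("gcapc", quietly = TRUE) &&
    requireNamespace("BSgenome.Hsapiens.UCSC.hg19", quietly = TRUE)) {
  library(gcapc)
  bam <- system.file("extdata", "chipseq.bam", package = "gcapc")
  cov <- read5endCoverage(bam)
  bdw <- bindWidth(cov)
  # Assumption: gcEffects() and gcapcPeaks() take genome and gctype arguments.
  gcb <- gcEffects(cov, bdw, sampling = c(0.15, 1),
                   genome = shared$genome, gctype = shared$gctype)
  peaks <- gcapcPeaks(cov, gcb, bdw,
                      genome = shared$genome, gctype = shared$gctype)
  refined <- refinePeaks(cov, gcb, bdw, peaks,
                         genome = shared$genome, gctype = shared$gctype)
}
```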

gcapc documentation built on Nov. 8, 2020, 8:14 p.m.