randomizedBackground: Randomized Background Estimation
In WaveSeqR: A Wavelet-based Method for ChIP-Seq peak calling


This function implements a randomized algorithm that estimates the empirical cumulative distribution function (ecdf) of the number of reads in a peak. It samples a large number of peaks from the peak length distribution, places each one at a random location on the genome, and counts the reads that fall within each 'random peak'. The read counts of these random peaks are then used to build the ecdf and to assign an empirical p-value to each peak.
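The procedure described above can be illustrated with a minimal sketch in base R. This is not the WaveSeqR implementation: the toy genome, the read positions, the peak lengths and read counts, and all variable names are made up for illustration, and the p-value is assumed to be the fraction of random peaks with at least as many reads as the observed peak.

```r
set.seed(1)

# Toy inputs (all hypothetical, for illustration only)
genome_len <- 1e6                                  # length of one toy chromosome, in bp
read_pos   <- sort(sample.int(genome_len, 20000))  # sorted positions of background reads
peak_len   <- c(200, 400, 400, 600, 800)           # observed peak lengths, in bp
peak_reads <- c(3, 12, 25, 18, 60)                 # observed read counts per peak

samplesize <- 10000                                # number of random peaks to draw

# 1. Sample peak lengths from the observed peak length distribution.
rand_len <- sample(peak_len, samplesize, replace = TRUE)

# 2. Place each sampled peak at a random start so that it fits on the genome.
rand_start <- floor(runif(samplesize, min = 1, max = genome_len - rand_len + 2))
rand_end   <- rand_start + rand_len - 1

# 3. Count the reads inside each random peak.
#    findInterval(x, v) gives the number of elements of sorted v that are <= x.
rand_reads <- findInterval(rand_end, read_pos) -
  findInterval(rand_start - 1, read_pos)

# 4. Build the ecdf of random-peak read counts and derive empirical p-values.
#    Assumption: p-value = P(random count >= observed count).
bg_ecdf <- ecdf(rand_reads)
p_value <- 1 - bg_ecdf(peak_reads - 1)

print(data.frame(length = peak_len, reads = peak_reads, p_value = p_value))
```

In this sketch the read counts from random peaks of all lengths are pooled into a single ecdf, as the description suggests. A real analysis would also have to handle multiple chromosomes.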

randomizedBackground(peaks, chip, samplesize = 1e+06, winsize = 200)

`peaks`	File containing peak locations. Must be in bedGraph format.
`chip`	Padded bedGraph file for the ChIP data being analyzed.
`samplesize`	Number of peaks to sample when building the ecdf with the randomized algorithm (default = 1e+06).
`winsize`	Window size used for obtaining peaks (default = 200 bp).

For a detailed description of the bedGraph file format see http://genome.ucsc.edu/goldenPath/help/bedgraph.html.
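As a quick orientation, a bedGraph data line has four columns: chromosome, start, end, and value, with zero-based, half-open coordinates. The sketch below parses three made-up records with base R. The column names and values are illustrative assumptions, and it is also an assumption that a 'padded' file includes zero-count windows such as the middle record.

```r
# Three made-up bedGraph records: chrom, start, end, value (tab-separated)
bg_text <- "chr1\t0\t200\t3
chr1\t200\t400\t0
chr1\t400\t600\t7"

# Parse the records; the column names are chosen here for readability only.
bg <- read.table(
  text = bg_text,
  sep = "\t",
  col.names = c("chrom", "start", "end", "value"),
  stringsAsFactors = FALSE
)

# Coordinates are zero-based and half-open, so the width is simply end - start.
bg$width <- bg$end - bg$start

print(bg)
```

Files downloaded from a genome browser may begin with `track` or `browser` header lines, which would need to be skipped before parsing in this way.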

Returns a matrix of peak locations and their corresponding p-values, sorted by chromosome, with the following columns:

`chromosome`	Chromosome name
`start`	Start position of peak (1-based indexing)
`end`	End position of peak
`reads`	Number of reads in the peak
`p-value`	P-value for the peak obtained from the estimated ecdf

Padded bedGraph files built with small window sizes are large, so reading the data may take a long time. Larger values of `samplesize` also increase run time.

Author(s): Apratim Mitra

See Also: sample, ecdf

WaveSeqR documentation built on May 2, 2019, 5:19 p.m.