View source: R/randomizedBackground.R
This function implements a randomized algorithm to generate the empirical cumulative distribution function (ecdf) of the number of reads contained in a peak. It works by sampling a large number of peaks from the peak length distribution, randomizing their locations on the genome, and counting the number of reads within each of these 'random peaks'. The read counts from the random peaks are then used to build the ecdf and to assign an empirical p-value to each observed peak.
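The procedure described above can be sketched in a few lines of base R. This is a minimal illustration on simulated data, not the package's implementation: the toy coverage vector, the `count_reads` helper, the reduced sample size, and the choice of an upper-tail p-value, P(random reads >= observed reads), are all assumptions made for the example.

```r
# Minimal sketch of a randomized-background p-value calculation.
# All object names and the toy data are hypothetical.
set.seed(1)

genome_length <- 10000
# Toy per-base read counts on a single "chromosome"
toy_coverage <- rpois(genome_length, lambda = 0.05)
# Prefix sums, so any interval can be counted in constant time
cum_reads <- c(0, cumsum(toy_coverage))

# Reads in [start, end] (1-based, inclusive); vectorized over start/end
count_reads <- function(start, end) cum_reads[end + 1] - cum_reads[start]

# Observed peaks (toy values) and their read counts
peaks <- data.frame(start = c(100, 2500, 7000), end = c(299, 2899, 7199))
peaks$reads <- count_reads(peaks$start, peaks$end)
peak_lengths <- peaks$end - peaks$start + 1

# 1. Sample peak lengths from the observed length distribution
n_random <- 5000
rand_len <- sample(peak_lengths, n_random, replace = TRUE)

# 2. Place each sampled peak at a uniformly random valid start position
rand_start <- vapply(
  rand_len,
  function(len) sample.int(genome_length - len + 1, 1),
  integer(1)
)

# 3. Count reads inside each random peak and build the ecdf
rand_reads <- count_reads(rand_start, rand_start + rand_len - 1)
background_ecdf <- ecdf(rand_reads)

# 4. Empirical upper-tail p-value: P(random reads >= observed reads).
#    Read counts are integers, so P(X >= r) = 1 - P(X <= r - 1).
peaks$p.value <- 1 - background_ecdf(peaks$reads - 1)
print(peaks)
```

A larger number of random peaks gives a finer-grained ecdf (the smallest non-zero p-value is roughly one over the sample size), at the cost of more counting work.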
randomizedBackground(peaks, chip, samplesize = 1e+06, winsize = 200)
peaks: File containing peak locations. Must be in bedGraph format.
chip: Padded bedGraph file for the ChIP data being analyzed.
samplesize: Number of peaks to sample for building the ecdf with the randomized algorithm (default = 1e+06).
winsize: Window size used for obtaining peaks (default = 200 bp).
For a detailed description of the bedGraph file format see http://genome.ucsc.edu/goldenPath/help/bedgraph.html.
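As a quick orientation, the snippet below parses a small bedGraph fragment with base R. It is a sketch on made-up data: the track line, the coordinates, and the column names given to the data frame are assumptions for illustration. bedGraph data lines have four fields (chromosome, start, end, value), and the coordinates are zero-based and half-open, so the start must be shifted by one to obtain 1-based positions.

```r
# Toy bedGraph content (hypothetical values); fields are tab-separated
bedgraph_text <- c(
  "track type=bedGraph name=\"toy ChIP\"",
  "chr1\t0\t200\t3",
  "chr1\t200\t400\t0",
  "chr1\t400\t600\t7",
  "chr2\t0\t200\t1"
)

# Drop track, browser, and comment lines before parsing
data_lines <- bedgraph_text[!grepl("^(track|browser|#)", bedgraph_text)]

bg <- read.table(
  text = data_lines,
  sep = "\t",
  col.names = c("chrom", "start", "end", "value"),
  stringsAsFactors = FALSE
)

# bedGraph starts are zero-based and ends are exclusive, so the
# equivalent 1-based inclusive interval is [start + 1, end]
bg$start1 <- bg$start + 1L
print(bg)
```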
Returns a matrix of peak locations and their corresponding p-values, sorted by chromosome, with the following columns:
chromosome: Chromosome name
start: Start position of peak (1-based indexing)
end: End position of peak
reads: Number of reads in the peak
p-value: P-value for the peak obtained from the estimated ecdf
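The following sketch shows one way such a result might be post-processed. It builds a mock matrix by hand rather than calling the function, and it assumes that the column names match the labels listed above and that the values arrive as character strings (an R matrix that holds chromosome names cannot also hold numeric values). Both points are assumptions, so check the actual return value with `str()` and `colnames()` first. The 0.05 cutoff is an arbitrary choice for the example.

```r
# Mock result with the documented columns (hypothetical values).
# A matrix mixing names and numbers is stored as character.
res <- cbind(
  chromosome = c("chr1", "chr1", "chr2"),
  start      = c("101", "2501", "51"),
  end        = c("300", "2900", "450"),
  reads      = c("42", "7", "19"),
  "p-value"  = c("0.0004", "0.31", "0.012")
)

# Convert to a data frame with properly typed columns
peaks_df <- data.frame(
  chromosome = res[, "chromosome"],
  start      = as.integer(res[, "start"]),
  end        = as.integer(res[, "end"]),
  reads      = as.integer(res[, "reads"]),
  p_value    = as.numeric(res[, "p-value"]),
  stringsAsFactors = FALSE
)

# Keep peaks below the chosen cutoff, most significant first
significant <- peaks_df[peaks_df$p_value < 0.05, ]
significant <- significant[order(significant$p_value), ]
print(significant)
```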
For small window sizes in the padded bedGraph files, and consequently large file sizes, reading the data may take a long time. Run time also increases with larger values of samplesize.
Apratim Mitra