randomizedBackground: Randomized Background Estimation

View source: R/randomizedBackground.R

Description

This function implements a randomized algorithm that estimates the empirical cumulative distribution function (ecdf) of the number of reads contained in a peak. It samples a large number of peaks from the peak length distribution, places each at a random location on the genome, and counts the reads that fall within each such 'random peak'. The read counts of these random peaks are then used to build the ecdf and to assign an empirical p-value to each observed peak.
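The procedure described above can be sketched in base R on toy data. This is only an illustration, not the package's implementation: the coverage vector, the peak table, and all variable names are hypothetical, and the upper-tail definition of the p-value (the fraction of random peaks with at least as many reads) is an assumption, since this page does not state the exact definition used.

```r
# Toy per-base read coverage for one small "chromosome" (hypothetical data).
set.seed(1)
genome_len <- 10000
coverage <- rpois(genome_len, lambda = 0.5)
coverage[101:300] <- coverage[101:300] + 2L   # one artificially enriched region

# Hypothetical observed peaks: 1-based start, inclusive end.
peaks <- data.frame(start = c(101, 2001, 5001),
                    end   = c(300, 2400, 5200))
peaks$reads <- mapply(function(s, e) sum(coverage[s:e]),
                      peaks$start, peaks$end)

# 1. Sample peak lengths from the observed peak length distribution.
samplesize <- 5000
peak_len <- peaks$end - peaks$start + 1
rand_len <- sample(peak_len, samplesize, replace = TRUE)

# 2. Place each sampled peak at a uniformly random valid start position.
rand_start <- vapply(rand_len,
                     function(l) sample.int(genome_len - l + 1, 1),
                     integer(1))

# 3. Count reads in each random peak via a cumulative sum:
#    sum(coverage[s:(s + l - 1)]) == cs[s + l] - cs[s]
cs <- c(0, cumsum(coverage))
rand_reads <- cs[rand_start + rand_len] - cs[rand_start]

# 4. Build the ecdf and assign an upper-tail empirical p-value:
#    P(random count >= observed count) = 1 - F(observed - 1) for integer counts.
background <- ecdf(rand_reads)
peaks$p.value <- 1 - background(peaks$reads - 1)

print(peaks)
```

With samplesize random peaks, the smallest nonzero p-value this sketch can produce is 1/samplesize, which is one reason a large sample is useful.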

Usage

randomizedBackground(peaks, chip, samplesize = 1e+06, winsize = 200)

Arguments

peaks

File containing peak locations. Must be in bedGraph format.

chip

Padded bedGraph file for the ChIP data being analyzed.

samplesize

Number of peaks to sample for building ecdf with the randomized algorithm.

winsize

Window size used for obtaining peaks (default = 200 bp).

Details

For a detailed description of the bedGraph file format see http://genome.ucsc.edu/goldenPath/help/bedgraph.html.
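As a rough illustration of the format, the sketch below parses a few bedGraph-style lines with base R. The data values and column names are made up for the example. bedGraph coordinates are zero-based and half-open, so the sketch also shows a conversion to a 1-based start; whether randomizedBackground performs this conversion internally is not stated on this page.

```r
# Three hypothetical bedGraph records: chrom, chromStart, chromEnd, dataValue.
bg_text <- "chr1\t0\t200\t3
chr1\t200\t400\t7
chr2\t0\t200\t1"

bg <- read.table(text = bg_text, sep = "\t",
                 col.names = c("chromosome", "start", "end", "value"),
                 stringsAsFactors = FALSE)

# bedGraph starts are 0-based and ends are exclusive, so the interval width
# is simply end - start, and the 1-based inclusive start is start + 1.
bg$width  <- bg$end - bg$start
bg$start1 <- bg$start + 1L

print(bg)
```

For a real file, the same read.table call could take a file path instead of the text argument.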

Value

Returns a matrix of peak locations and their corresponding p-values, sorted by chromosome, with the following columns:

chromosome

Chromosome name

start

Start position of peak (1-based indexing)

end

End position of peak

reads

Number of reads in the peak

p-value

P-value for the peak obtained from the estimated ecdf

Warning

Padded bedGraph files generated with small window sizes are large, so reading the data may take a long time. Larger values of samplesize also increase run time.

Author(s)

Apratim Mitra

See Also

sample, ecdf


WaveSeqR documentation built on May 2, 2019, 5:19 p.m.