Estimate summaries of the distribution of fragment lengths in a short-read experiment. The methods are designed for ChIP-seq experiments and may not work well in data without peaks.
Description
estimate.mean.fraglen implements three methods for estimating mean fragment length. The other functions are related helper functions implementing various methods, but may be useful by themselves for diagnostic purposes. Many of these operations are potentially slow.
sparse.density is intended to be similar to density, but returns the results in a run-length encoded form. This is useful when long stretches of the range of the data have zero density.
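To illustrate why a run-length encoded result helps when most of the range has zero density, here is a minimal sketch in base R. It uses the base function rle rather than the "Rle" class returned by sparse.density, and the vector dens is made-up data, not package output.

```r
# Hypothetical density values: mostly zero, with one short bump
dens <- c(rep(0, 1000), 0.1, 0.4, 0.4, 0.1, rep(0, 2000))

# Base R run-length encoding (an assumption for illustration; this is
# not the "Rle" class that sparse.density returns)
enc <- rle(dens)

n_values <- length(dens)         # number of values in the dense form
n_runs   <- length(enc$lengths)  # number of runs in the encoded form

# The encoding is lossless: inverse.rle() recovers the original vector
stopifnot(identical(inverse.rle(enc), dens))
```

Here the 3004 dense values collapse to 5 runs, which is the kind of saving the run-length form is meant to provide.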
Usage
estimate.mean.fraglen(x, method = c("SISSR", "coverage", "correlation"),
                      ...)
basesCovered(x, shift = seq(5, 300, 5), seqLen = 100, verbose = FALSE)
densityCorr(x, shift = seq(0, 500, 5), center = FALSE,
            width = seqLen * 2L, seqLen = 100L, maxDist = 500L, ...)
sparse.density(x, width = 50, kernel = "epanechnikov",
               from = start(rix)[1] - 10L,
               to = end(rix)[length(rix)] + 10L)

Arguments
x 
For For For 
method 
Character string giving method to be used.

shift 
Integer vector giving amount of shifts to be tried when optimizing. The current algorithm simply evaluates all supplied values and reports the one giving minimum coverage or maximum correlation. 
seqLen 
For the 
verbose 
Logical specifying whether progress information should be printed during execution. 
center 
For the 
width 
Half-bandwidth used in the computation. This must be specified as an integer; data-driven rules are not supported. 
kernel 
A character string giving the density kernel. 
from, to 
Specify the range over which the density is to be computed. 
maxDist 
If distance to nearest neighbor is more than this, the position is discarded. This removes isolated points, which are not very informative. 
... 
Extra arguments, passed on as appropriate to other functions. 
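The effect of a maxDist-style filter can be sketched in base R as follows. This is only an illustration of discarding isolated positions; the names pos and max_dist and the data are hypothetical, and whether the package measures the nearest neighbor within a strand or across strands is not stated here.

```r
# Hypothetical sorted read positions; 5000 is far from every other read
pos <- c(100L, 120L, 130L, 5000L, 9000L, 9010L)
max_dist <- 500L

# Distance from each position to its nearest neighbor.
# diff() gives gaps between consecutive positions, so pos must be sorted.
gaps <- diff(pos)
nearest <- pmin(c(Inf, gaps), c(gaps, Inf))

# Keep only positions that have a neighbor within max_dist
kept <- pos[nearest <= max_dist]
```

With these values the isolated position 5000 is dropped and the remaining five positions are kept.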
Details
For the correlation method, the range over which densities are computed covers only the range of the reads; that is, the beginning and end of chromosomes are excluded.
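The grid search over shift described under Arguments (evaluate every supplied value and report the one giving minimum coverage) can be sketched on toy data. This is a simplified illustration in base R, not the package's implementation: the helper bases_covered, the simulated reads, and the convention of shifting only the negative-strand reads are all assumptions made for the example.

```r
# Toy data (hypothetical): each fragment yields one read per strand
set.seed(1)
frag_len <- 150L
seq_len  <- 36L
frag_start <- sort(sample.int(5000L, 40L))
pos_start  <- frag_start                       # "+" reads begin at the fragment start
neg_start  <- frag_start + frag_len - seq_len  # "-" reads cover the fragment end

# Number of distinct bases covered by reads of length seq_len
bases_covered <- function(starts, seq_len) {
  covered <- unlist(lapply(starts, function(s) seq.int(s, length.out = seq_len)))
  length(unique(covered))
}

# Evaluate the objective at every candidate shift and keep the minimizer
shifts <- seq(0L, 300L, by = 6L)
objective <- vapply(
  shifts,
  function(s) bases_covered(c(pos_start, neg_start - s), seq_len),
  integer(1)
)
best_shift <- shifts[which.min(objective)]
```

In this toy setup the two strands coincide exactly when the shift equals frag_len - seq_len, so the minimum falls at that grid value. Real data are noisier, and the objective is usually worth plotting against shift as a diagnostic.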
Value
estimate.mean.fraglen gives an estimate of the mean fragment length.
basesCovered and densityCorr give a vector of the corresponding objective function evaluated at the supplied values of shift.
sparse.density returns an object of class "Rle".
Author(s)
Deepayan Sarkar, Michael Lawrence
References
R. Jothi, S. Cuddapah, A. Barski, K. Cui, and K. Zhao. Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq data. Nucleic Acids Research, 36:5221–31, 2008.
P. V. Kharchenko, M. Y. Tolstorukov, and P. J. Park. Design and analysis of ChIP experiments for DNA-binding proteins. Nature Biotechnology, 26:1351–1359, 2008.
Examples
data(cstest)
estimate.mean.fraglen(cstest[["ctcf"]], method = "coverage")
