filterDuplReads
filters highly repeated sequences, i.e. reads with the same chromosome, start and
end positions.
As many such sequences are likely due to over-amplification artifacts, this
can be a useful preprocessing step for ultra high-throughput sequencing
data.
A false discovery rate is computed for each number of repeats being
unusually high. The reads with a higher false discovery rate will be
removed. For more information on the false discovery rate calculation
please read the fdrEnrichment
manual.
tabDuplReads
counts the number of reads with no duplications, duplicated once, twice etc.
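The idea of tabulating repeats can be illustrated with a minimal base R sketch. It is not the package's implementation: the `reads` data frame, the `key` helper and the choice to count distinct positions (rather than individual reads) are assumptions made for illustration only.

```r
# Hypothetical toy data: one row per read (a plain data frame, not a RangedData object).
reads <- data.frame(
  space = c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
  start = c(100, 100, 100, 250, 250, 400),
  end   = c(138, 138, 138, 288, 288, 438)
)

# Reads are treated as duplicates when chromosome, start and end all match.
key <- paste(reads$space, reads$start, reads$end, sep = ":")

# Number of copies observed at each distinct position.
copies <- table(key)

# Number of distinct positions observed once, twice, three times, etc.
repeat_table <- table(as.integer(copies))
print(repeat_table)
```

In this toy data one position appears once, one twice and one three times.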
filterDuplReads(x, maxRepeats, fdrOverAmp=0.01, negBinomUse=.999, components=0, mc.cores=1)
tabDuplReads(x, minRepeats=1, mc.cores=1)

x 
Object containing read locations.
Currently methods for 
maxRepeats 
Reads appearing 
fdrOverAmp 
Reads with false discovery rate of being
overamplified greater than 
negBinomUse 
Number of counts that will be used to compute the null distribution. Using 1 - 1/1000 (i.e. 0.999) would mean that 99.9% of the reads will be used. The reads with the highest numbers of repetitions are the ones excluded.
components 
Number of negative binomials that will be used to fit the null distribution. The default value is 1. This value has to be between 0 and 4. If 0 is given, the optimal number of negative binomials is chosen using the Bayesian information criterion (BIC).
mc.cores 
Number of cores to be used in parallel computing
(passed on to 
minRepeats 
The table is only produced for reads with at least

filterDuplReads returns x without highly repetitive sequences, as determined by
maxRepeats or fdrOverAmp.
tabDuplReads returns a table counting the number of sequences
repeating 1 time, 2 times, 3 times etc.
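As a rough illustration of filtering with a fixed repeat threshold, the following base R sketch drops reads at over-represented positions. It is only a sketch under stated assumptions: the `reads` data frame and the `max_repeats` value are hypothetical, and removing every read at a position with `max_repeats` or more copies is an assumed rule, not the documented behaviour of filterDuplReads.

```r
# Hypothetical toy data: one row per read.
reads <- data.frame(
  space = c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
  start = c(100, 100, 100, 250, 250, 400),
  end   = c(138, 138, 138, 288, 288, 438)
)

# Reads sharing chromosome, start and end get the same key.
key <- paste(reads$space, reads$start, reads$end, sep = ":")

# For every read, the number of reads that share its position.
n_copies <- ave(seq_along(key), key, FUN = length)

# Assumed rule: drop all reads at positions seen max_repeats or more times.
max_repeats <- 3
filtered <- reads[n_copies < max_repeats, ]
print(filtered)
```

Here the three reads at chr1:100-138 are dropped and the remaining three reads are kept.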
Methods for filterDuplReads and tabDuplReads
signature(x = "RangedData")
Two reads are duplicated if they have the same space, start and end position.
signature(x = "list")
The method is applied separately to each RangedData element in the list.
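The element-wise behaviour of the list method can be pictured with `lapply`. The sketch below uses plain data frames and a hypothetical helper `count_repeats`; neither is part of the package, and the real method operates on RangedData objects.

```r
# Hypothetical helper: tabulate how many distinct positions occur 1, 2, 3, ... times.
count_repeats <- function(reads) {
  key <- paste(reads$space, reads$start, reads$end, sep = ":")
  table(as.integer(table(key)))
}

# Hypothetical list of samples, each a data frame with one row per read.
samples <- list(
  sample1 = data.frame(space = c("chr1", "chr1", "chr2"),
                       start = c(100, 100, 200),
                       end   = c(138, 138, 238)),
  sample2 = data.frame(space = rep("chr1", 3),
                       start = rep(50, 3),
                       end   = rep(88, 3))
)

# Apply the same tabulation separately to every element of the list.
per_sample <- lapply(samples, count_repeats)
print(per_sample)
```

The result is a list with one table per sample, named after the input elements.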
Evarist Planet, David Rossell, Oscar Flores
fdrEnrichedCounts to compute the posterior probability
that a certain number of repeats is due to overamplification.
# Assumes the htSeqTools package (and the IRanges classes it relies on) is installed
library(htSeqTools)
set.seed(1)
# Simulate 1000 reads of length 39 on two chromosomes
st <- round(rnorm(1000,500,100))
strand <- rep(c('+','-'),each=500)
space <- sample(c('chr1','chr2'),size=length(st),replace=TRUE)
sample1 <- RangedData(IRanges(st,st+38),strand=strand,space=space)
#Add artificial repeats
st <- rep(400,20)
repeats <- RangedData(IRanges(st,st+38),strand='+',space='chr1')
sample1 <- rbind(sample1,repeats)
#Remove highly repeated reads
filterDuplReads(sample1)
