filterDuplReads filters highly repeated sequences, i.e. reads with the same chromosome, start and end position.
As many such sequences are likely due to over-amplification artifacts, this
can be a useful pre-processing step for ultra high-throughput sequencing data.
A false discovery rate is computed for each number of repeats being
unusually high. The reads with a higher false discovery rate will be
removed. For more information on the false discovery rate calculation
please read the
tabDuplReads counts the number of reads with no duplications, duplicated once, twice, etc.
Object containing read locations.
Currently methods for
Reads with false discovery rate of being
over-amplified greater than
Number of counts that will be used to compute the null distribution. Using 1 - 1/1000 means that 99.9% of the reads will be used; the reads with the highest numbers of repetitions are the ones excluded.
Number of negative binomials that will be used to fit the null distribution. The default value is 1. This value has to be between 0 and 4. If 0 is given, the optimal number of negative binomials is chosen using the Bayesian information criterion (BIC).
Number of cores to be used in parallel computing
(passed on to
The table is only produced for reads with at least
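To make the 99.9% figure in the null-distribution argument concrete, here is a minimal base R sketch. It assumes the proportion is applied as a quantile of the per-position repeat counts; that interpretation, the toy data, and the names `reps`, `ppControl`, `cutoff`, `nullReps` and `excluded` are illustrative assumptions, not the package's internals.

```r
# Hypothetical repeat counts: one entry per distinct read position
reps <- c(rep(1, 900), rep(2, 80), rep(3, 15), rep(4, 4), 25)

# Proportion of counts kept for estimating the null distribution
ppControl <- 1 - 1/1000

# Assumed interpretation: the proportion acts as a quantile of the counts
cutoff <- quantile(reps, probs = ppControl, names = FALSE)

# Counts at or below the cutoff would inform the null distribution;
# the most repeated positions are left out
nullReps <- reps[reps <= cutoff]
excluded <- reps[reps > cutoff]

print(length(nullReps))
print(excluded)
```

With this toy input only the single extreme count is left out, so an obvious over-amplification outlier would not distort the fitted null distribution.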
x without highly
repetitive sequences, as determined by
tabDuplReads returns a table counting the number of sequences
repeated once, twice, three times, etc.
signature(x = "RangedData")
Two reads are duplicated if they have the same space, start and end position.
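The definition of a duplicate can be illustrated without the package. The sketch below uses a plain data frame rather than a RangedData object and tabulates how many distinct positions occur once, twice, and so on. The data and the names `reads`, `key`, `timesSeen` and `dupTable` are hypothetical, and whether tabDuplReads tabulates distinct positions or individual reads is an assumption here.

```r
# Hypothetical reads stored in a plain data frame (not a RangedData object)
reads <- data.frame(
  space = c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
  start = c(100, 100, 250, 100, 100, 100),
  end   = c(138, 138, 288, 138, 138, 138)
)

# One key per read; reads sharing a key are duplicates of each other
key <- paste(reads$space, reads$start, reads$end, sep = ":")

# Number of times each distinct position was observed
timesSeen <- table(key)

# Number of distinct positions observed once, twice, three times, ...
dupTable <- table(as.integer(timesSeen))
print(dupTable)
```

Note that the two reads on chr2 are not counted as duplicates of the chr1 reads with the same start and end, because the space differs.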
signature(x = "list")
The method is applied
separately to each
RangedData element in the list.
Evarist Planet, David Rossell, Oscar Flores
fdrEnrichedCounts to compute the posterior probability
that a certain number of repeats is due to over-amplification.
set.seed(1)
st <- round(rnorm(1000,500,100))
strand <- rep(c('+','-'),each=500)
space <- sample(c('chr1','chr2'),size=length(st),replace=TRUE)
sample1 <- RangedData(IRanges(st,st+38),strand=strand,space=space)
#Add artificial repeats
st <- rep(400,20)
repeats <- RangedData(IRanges(st,st+38),strand='+',space='chr1')
sample1 <- rbind(sample1,repeats)
filterDuplReads(sample1)
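For readers without the package at hand, the effect of the filtering step on this kind of simulated data can be approximated in base R. This is only a rough sketch: it replaces the false discovery rate criterion with a fixed, hypothetical cutoff (`maxRepeats`), it drops every read at an over-repeated position, and it uses a plain data frame instead of RangedData. None of these choices is claimed to match what filterDuplReads does internally.

```r
set.seed(1)
st <- round(rnorm(1000, 500, 100))
reads <- data.frame(
  space = sample(c("chr1", "chr2"), size = length(st), replace = TRUE),
  start = st,
  end   = st + 38
)

# Add 20 artificial repeats at a single position
reads <- rbind(reads,
               data.frame(space = "chr1", start = rep(400, 20), end = 438))

# Repeat count for every read, based on space, start and end
key <- paste(reads$space, reads$start, reads$end, sep = ":")
timesSeen <- ave(seq_along(key), key, FUN = length)

# Hypothetical fixed cutoff standing in for the FDR-based decision
maxRepeats <- 15
filtered <- reads[timesSeen <= maxRepeats, ]

print(nrow(reads))
print(nrow(filtered))
```

The artificial repeats at chr1:400 exceed the cutoff and are dropped, while positions repeated only a few times by chance are kept.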