Identify transcription start sites in sequence read count data.
identifyStartSites(x, threshold=1, tau=c(20, 20), neighbor=TRUE,
    fun=subtractExpectation, multicore=TRUE, ...)
x
Object of class
threshold
Numeric with the minimal number of reads for a position to be treated as a potential TSS.
tau
Numeric vector of length two specifying the 'tau' parameter of the exponential function for each side of the segment. For the forward strand ("+"), the first and second value refer to the side towards the 5' and 3' end, respectively. If a single value is provided, it is applied to both sides.
neighbor
Logical indicating whether the background estimates should be iteratively assigned to the predicted TSS during the estimation (default: TRUE).
fun
Function to calculate the expectation for each TSS. For details, see the 'details' section.
multicore
Logical indicating whether to use the parallel package to speed up the computation. Only has an effect if the package is available and loaded. For details, see the 'details' section.
...
Additional arguments passed to the parallel package if used. For details, see the 'details' section.
After normalization of the count data, an iterative algorithm is applied for each segment to identify the TSS.
The expected number of false positive counts is initialized with a default value given by the read frequency in the whole data set. The position with the largest count above this expectation is identified as a TSS if its expected transcription level is at least one read above the expected number of false positive reads. The transcription levels for all TSS are then calculated by adding each count to its nearest neighboring TSS.
Then, the expected number of false positive reads is updated by convolution with exponential kernels. The decay rates tau in 3' direction and towards the 5'-end can be chosen differently to account for the fact that false positive counts are preferably found in 5' direction of a TSS. This procedure is iterated as long as the set of TSS increases.
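The iteration described above can be sketched in a few lines of base R. This is a toy illustration only and not the TSSi implementation: the function name toyIdentify, the argument bg0, the unnormalized kernel form exp(-d/tau), and the sample data are all assumptions made for the example.

```r
## Toy sketch of the iterative idea; NOT the code used by TSSi.
## All names (toyIdentify, bg0) and the kernel form are assumptions.
toyIdentify <- function(pos, counts, tau = c(20, 20), threshold = 1,
                        bg0 = mean(counts)) {
  tau <- rep_len(tau, 2)                 # a single value applies to both sides
  expected <- rep(bg0, length(counts))   # initial false positive expectation
  tss <- integer(0)                      # indices of accepted TSS
  level <- numeric(0)                    # transcription level per TSS
  repeat {
    excess <- counts - expected
    excess[tss] <- -Inf                  # skip positions already accepted
    best <- which.max(excess)
    if (excess[best] < threshold) break  # set of TSS no longer grows
    tss <- sort(c(tss, best))
    ## add every count to its nearest TSS
    nearest <- tss[vapply(pos, function(p) which.min(abs(pos[tss] - p)),
                          integer(1))]
    level <- tapply(counts, factor(nearest, levels = tss), sum)
    ## rebuild the expectation from two-sided exponential kernels
    expected <- rep(bg0, length(counts))
    for (i in seq_along(tss)) {
      d <- pos - pos[tss[i]]
      ## tau[1]: lower positions (5' side on "+"), tau[2]: higher positions
      w <- ifelse(d < 0, exp(d / tau[1]), exp(-d / tau[2]))
      w[d == 0] <- 0                     # a TSS does not explain itself
      expected <- expected + level[i] * w
    }
  }
  list(pos = pos[tss], level = as.numeric(level))
}

## Made-up data: two isolated peaks on an otherwise empty segment
counts <- rep(0, 100)
counts[30] <- 50
counts[70] <- 40
res <- toyIdentify(pos = 1:100, counts = counts)
res
```

Choosing different values for the two elements of tau makes the kernel asymmetric, which is how the sketch mimics the different decay towards the 5' and 3' side.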
In order to distribute the identification step over multiple processor cores, the mclapply function of the parallel package can be used. For this, the parallel package has to be loaded manually before starting the computation; additional parameters are passed via the ... argument, e.g. as normalizeCounts(x, mc.cores=2). The multicore argument can further be used to temporarily disable the parallel estimation by setting it to FALSE. Please note that the identification step is normally very fast, and thus using parallel computation here may have only a minor impact compared to the normalizeCounts method.
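As a rough illustration of what distributing per-segment work with mclapply looks like, the following sketch uses only the parallel package. The segments list and the findPeak helper are hypothetical stand-ins and are not part of TSSi; only mclapply and its mc.cores argument are real.

```r
library(parallel)

## Hypothetical per-segment data and worker function (not part of TSSi)
segments <- list(a = c(0, 5, 1), b = c(2, 0, 9, 3), c = c(7, 7))
findPeak <- function(counts) which.max(counts)

## mclapply relies on forking, which is not available on Windows;
## there, mc.cores has to be 1.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
peaks <- mclapply(segments, findPeak, mc.cores = cores)

## The serial equivalent gives the same result
serial <- lapply(segments, findPeak)
identical(peaks, serial)
```

For work this cheap, the cost of starting worker processes can outweigh any gain, which is consistent with the note above that parallel computation may have little effect on the identification step.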
An object of class TssResult.
Identify TSS:
signature(x="TssData")
identifyStartSites(x, ...)
Maintainer: Julian Gehring <julian.gehring@fdm.uni-freiburg.de>
Classes: TssData, TssNorm, TssResult
Methods: segmentizeCounts, normalizeCounts, identifyStartSites, get-methods, plot-methods, asRangedData-methods
Functions: subtract-functions
Data set: physcoCounts
Package: TSSi-package
## preceding steps
example(normalizeCounts)

## identify TSS
z <- identifyStartSites(yFit)
z