Calculates the shifting score for all consensus clusters (promoters) between two specified (groups of) CAGE datasets. The shifting score is a measure of differential usage of TSSs within a consensus cluster between two samples; it indicates the degree of physical separation of the TSSs used in these samples within a given consensus cluster. In addition to the shifting score, the statistical significance (P-value and FDR) of differential TSS usage is calculated for each consensus cluster using the Kolmogorov-Smirnov test.
scoreShift(object, groupX, groupY, testKS = TRUE, useTpmKS = TRUE,
           useMulticore = F, nrCores = NULL)
## S4 method for signature 'CAGEset'
scoreShift(object, groupX, groupY, testKS = TRUE,
           useTpmKS = TRUE, useMulticore = F, nrCores = NULL)
object |
A |
groupX, groupY |
Character vector of the one or more CAGE dataset labels in the first
( |
testKS |
Logical, should the Kolmogorov-Smirnov test for statistical significance of differential TSS usage be performed, and P-values and FDR returned? See Details. |
useTpmKS |
Logical, should normalized (tpm) values ( |
useMulticore |
Logical, should multicore be used. |
nrCores |
Number of cores to use when |
TSSs within one consensus cluster (promoter) can be used differently in different samples (cell types, tissues, developmental stages), with respect to their position and frequency of usage detected by CAGE. This function calculates shifting scores of all consensus clusters between two specified (groups of) CAGE samples to detect promoters that are used differently in these two samples. Shifting score is a measure of differential TSS usage defined as:
score = max(F1 - F2) / max(F1)
where F1 is a cumulative sum of CAGE signal along consensus cluster in the group of samples
with lower total signal in that consensus cluster, and F2 in the opposite group. Since
cumulative sum can be calculated in both forward (5' -> 3') and reverse (3' -> 5')
direction, shifting score is calculated for both cases and the bigger value is selected as
final shifting score. Value of the shifting score is in the range [-Inf, 1], where a value of 1
means complete physical separation of TSSs used in the two samples for
given consensus cluster. In general, any non-negative value of the shifting score can be
interpreted as the proportion of transcription initiation in the sample with lower expression
that is happening "outside" (either upstream or downstream) of the region used for
transcription initiation in the other sample. Negative values indicate no physical
separation, i.e. the region used for transcription initiation in the sample with
lower expression is completely contained within the region used for transcription
initiation in the other sample.
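The definition above can be illustrated with a minimal base R sketch. It is not the CAGEr implementation: the helper name `shifting_score_sketch` and its inputs (two numeric vectors of CAGE signal per position along the same consensus cluster) are assumptions made for illustration, and the formula is applied literally as written above.

```r
# Hypothetical helper, not part of CAGEr.
# x, y: CAGE signal per position along one consensus cluster in the two groups.
shifting_score_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), sum(x) > 0, sum(y) > 0)
  # F1 is taken from the group with lower total signal in the cluster
  if (sum(x) <= sum(y)) {
    low <- x; high <- y
  } else {
    low <- y; high <- x
  }
  one_direction <- function(a, b) {
    F1 <- cumsum(a)
    F2 <- cumsum(b)
    max(F1 - F2) / max(F1)
  }
  forward <- one_direction(low, high)            # 5' -> 3'
  reverse <- one_direction(rev(low), rev(high))  # 3' -> 5'
  max(forward, reverse)                          # the bigger value is kept
}

# TSSs used in the two groups do not overlap at all
separated <- shifting_score_sketch(c(5, 5, 0, 0), c(0, 0, 20, 20))
# Half of the signal in the lower-expressed group lies upstream of the other group
partial <- shifting_score_sketch(c(2, 2, 0), c(0, 10, 10))
# Lower-expressed group is contained within the region used by the other group
nested <- shifting_score_sketch(c(0, 4, 0), c(10, 10, 10))
```

Under these toy inputs, `separated` equals 1, `partial` equals 0.5 and `nested` is negative, matching the interpretation given in the text.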
In addition to shifting score which indicates only physical separation (upstream or
downstream shift of TSSs), a more general assessment of differential TSS usage can be
obtained by performing a two-sample Kolmogorov-Smirnov test on cumulative sums of CAGE
signal along the consensus cluster. In that case, cumulative sums in both samples are
scaled to range [0,1] and are considered to be empirical cumulative distribution functions
(ECDF) reflecting sampling of TSS positions during transcription initiation.
Kolmogorov-Smirnov test is performed to assess whether the two underlying probability
distributions differ. To obtain P-value (i.e. the level at which the
null-hypothesis can be rejected), sample sizes that generated the ECDFs are required, in
addition to actual K-S statistics calculated from ECDFs. These are derived either from
raw tag counts, i.e. exact number of times each TSS in the cluster was sampled
during sequencing (when useTpmKS = FALSE), or from normalized tpm values (when
useTpmKS = TRUE). P-values obtained from K-S tests are further adjusted for
multiple testing using Benjamini & Hochberg (BH) method and for each P-value a
corresponding false-discovery rate (FDR) is also reported.
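The steps described above (scaling to ECDFs, computing the K-S statistic, deriving a P-value from the sample sizes, and BH adjustment) can be sketched in base R as follows. This is an illustration under stated assumptions, not the code CAGEr runs: the helper `ks_from_cumsums_sketch` is hypothetical, it uses the standard asymptotic Kolmogorov series for the P-value, and the extra P-values in the adjustment step are made-up placeholders standing in for other consensus clusters.

```r
# Hypothetical helper, not part of CAGEr: asymptotic two-sample K-S test
# computed from cumulative sums instead of raw observations.
# cum_x, cum_y: cumulative CAGE signal along the same consensus cluster
# n_x, n_y:     sample sizes behind each ECDF (raw tag counts or tpm totals)
ks_from_cumsums_sketch <- function(cum_x, cum_y, n_x, n_y) {
  stopifnot(length(cum_x) == length(cum_y), max(cum_x) > 0, max(cum_y) > 0)
  # Scale both cumulative sums to [0, 1] so they can be treated as ECDFs
  ecdf_x <- cum_x / max(cum_x)
  ecdf_y <- cum_y / max(cum_y)
  # K-S statistic: largest vertical distance between the two ECDFs
  D <- max(abs(ecdf_x - ecdf_y))
  lambda <- D * sqrt(n_x * n_y / (n_x + n_y))
  if (lambda < 0.2) {
    # The limiting distribution has essentially no mass below 0.2
    p <- 1
  } else {
    # Asymptotic series: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * lambda^2)
    k <- 1:100
    p <- 2 * sum((-1)^(k - 1) * exp(-2 * k^2 * lambda^2))
  }
  list(statistic = D, p.value = min(1, max(0, p)))
}

x <- c(5, 5, 0, 0)    # toy signal, group X
y <- c(0, 0, 20, 20)  # toy signal, group Y
res <- ks_from_cumsums_sketch(cumsum(x), cumsum(y), n_x = sum(x), n_y = sum(y))

# BH adjustment across clusters; 0.04 and 0.5 are placeholder P-values
p_values <- c(res$p.value, 0.04, 0.5)
fdr <- p.adjust(p_values, method = "BH")
```

Because the P-value depends on the sample sizes, the same pair of ECDFs yields a smaller P-value when `n_x` and `n_y` are larger, which is why the choice between raw tag counts and tpm values matters.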
Since calculation of shifting scores and Kolmogorov-Smirnov test require cumulative sums
along consensus clusters, they have to be calculated beforehand by calling the
cumulativeCTSSdistribution function.
The slots shiftingGroupX, shiftingGroupY and consensusClustersShiftingScores of the
provided CAGEset object will be occupied by the information on the groups of CAGE
datasets that have been compared and
shifting scores of all consensus clusters. Consensus clusters (promoters) with shifting
score above and/or FDR below a specified threshold can be extracted by calling the
getShiftingPromoters function.
Vanja Haberle
Other CAGEr promoter shift functions: getShiftingPromoters
scoreShift(exampleCAGEset,
           groupX = c("sample1", "sample2"),
           groupY = "sample3",
           testKS = TRUE, useTpmKS = FALSE)
head(getShiftingPromoters(exampleCAGEset))