View source: R/overlapScores.R
Calculates Overlap Scores Between Two Sets of Topological Domains
overlapScores(a, reference, debug = getOption("TopDom.debug", FALSE))
a, reference - Topological domain (TD) set A and TD reference set R, both in a format as returned by …
debug - If …
The overlap score, overlap(A', r_i), represents how well a consecutive subset A' of topological domains (TDs) in A overlaps with topological domain r_i in reference set R. For each reference TD r_i, the best match A'_max is identified, that is, the subset A' that maximizes overlap(A', r_i). For exact definitions, see page 8 in Shin et al. (2016).
Note that the overlap score is asymmetric, which means that overlapScores(a, b) != overlapScores(b, a).
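The asymmetry can be illustrated with a toy example. The sketch below is not the TopDom implementation: it assumes, purely for illustration, that the score is the length of the intersection divided by the length of the union of A' and r_i; the exact definition is the one given in Shin et al. (2016). The helpers overlap_sketch() and best_overlap_sketch(), the column names from and to, and the coordinates are all hypothetical.

```r
## Hypothetical helper (not part of TopDom): intersection-over-union between
## the span covered by consecutive intervals `set` and one interval `ref`.
## Assumption: this ratio only stands in for the score of Shin et al. (2016).
overlap_sketch <- function(set, ref) {
  span <- c(min(set$from), max(set$to))  # consecutive domains form one span
  inter <- max(0, min(span[2], ref$to) - max(span[1], ref$from))
  union <- (span[2] - span[1]) + (ref$to - ref$from) - inter
  inter / union
}

## Best score over all consecutive subsets of `a` for one reference domain
best_overlap_sketch <- function(a, ref) {
  n <- nrow(a)
  best <- 0
  for (start in seq_len(n)) {
    for (end in start:n) {
      best <- max(best, overlap_sketch(a[start:end, ], ref))
    }
  }
  best
}

a <- data.frame(from = c(0, 10), to = c(10, 20))  # two adjacent toy domains
r <- data.frame(from = 0, to = 20)                # one toy reference domain

## With r as reference: both domains of a together match it exactly
score_a_vs_r <- best_overlap_sketch(a, r[1, ])

## With a as reference: r covers each domain of a, but is twice as long
score_r_vs_a <- vapply(
  seq_len(nrow(a)),
  function(i) best_overlap_sketch(r, a[i, ]),
  numeric(1)
)

print(score_a_vs_r)  # 1
print(score_r_vs_a)  # 0.5 0.5
```

Swapping the roles of the two sets changes both the number of scores (one per reference domain) and their values, which is why the two directions are reported separately.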
Returns a named list of class TopDomOverlapScores, where the names correspond to the chromosomes in domain reference set R. Each of these chromosome elements contains a data.frame with fields:
chromosome - D_{R,c} character strings
best_score - D_{R,c} numerics in [0,1]
best_length - D_{R,c} positive integers
best_set - list of D_{R,c} index vectors
where D_{R,c} is the number of TDs in reference set R on chromosome c. If a TD in reference R is not a "domain", then the corresponding best_score and best_length values are NA_real_ and NA_integer_, respectively, while best_set is an empty list.
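To make the structure concrete, here is a minimal sketch that builds a stand-in for one chromosome element by hand and summarizes it. The object is not produced by overlapScores(); the values are made up, and the variable names scores_chr and overlap_mock are hypothetical. Only the field names and the missing-value conventions are taken from the description above.

```r
## Hand-built stand-in for one chromosome element (made-up values)
scores_chr <- data.frame(
  chromosome  = c("chr19", "chr19", "chr19"),
  best_score  = c(0.92, NA_real_, 0.75),   # NA_real_: reference TD is not a "domain"
  best_length = c(1L, NA_integer_, 2L),    # NA_integer_ for the same row
  stringsAsFactors = FALSE
)

## best_set is a list column: one index vector per reference TD,
## and an empty list for the row without a score
scores_chr$best_set <- list(3L, list(), c(5L, 6L))

## Named list keyed by chromosome, mirroring the documented layout
overlap_mock <- list(chr19 = scores_chr)

## Summaries must skip the rows without a score
df <- overlap_mock[["chr19"]]
is_scored  <- !is.na(df$best_score)
mean_score <- mean(df$best_score[is_scored])
n_merged   <- sum(df$best_length[is_scored] > 1L)  # best matches spanning > 1 TD

print(mean_score)  # 0.835
print(n_merged)    # 1
```

Filtering on is.na(best_score) before summarizing keeps reference regions that are not domains from turning every summary into NA.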
The original TopDom scripts do not provide an implementation for calculating overlap scores. Instead, the implementation of TopDom::overlapScores() is based on the textual description of overlap scores provided in Shin et al. (2016). It is not known if this is the exact same algorithm and implementation as the authors of the TopDom article used.
Henrik Bengtsson - based on the description in Shin et al. (2016).
Shin et al., TopDom: an efficient and deterministic method for identifying topological domains in genomes, Nucleic Acids Research, 44(7): e70, April 2016. doi: 10.1093/nar/gkv1505, PMCID: PMC4838359, PMID: 26704975
TopDom.
library(TopDom)
library(tibble)
path <- system.file("exdata", package = "TopDom", mustWork = TRUE)
## Original count data (on a subset of the bins to speed up example)
chr <- "chr19"
pathname <- file.path(path, sprintf("nij.%s.gz", chr))
data <- readHiC(pathname, chr = chr, binSize = 40e3, bins = 1:500)
print(data)
## Find topological domains using TopDom method for two window sizes
tds_5 <- TopDom(data, window.size = 5L)
tds_6 <- TopDom(data, window.size = 6L)
## Overlap scores (in both directions)
overlap_56 <- overlapScores(tds_6, reference = tds_5)
print(overlap_56)
print(as_tibble(overlap_56))
overlap_65 <- overlapScores(tds_5, reference = tds_6)
print(overlap_65)
print(as_tibble(overlap_65))