TopDom: Identify Topological Domains from a Hi-C Contact Matrix

Description Usage Arguments Value Windows size Author(s) References Examples

View source: R/TopDom.R

Description

Identify Topological Domains from a Hi-C Contact Matrix

Usage

1
2
3
4
5
6
7
8
TopDom(
  data,
  window.size,
  outFile = NULL,
  statFilter = TRUE,
  ...,
  debug = getOption("TopDom.debug", FALSE)
)

Arguments

data

A TopDomData object, or the pathname to a normalized Hi-C contact matrix file as read by readHiC(), that specify N bins.

window.size

The number of bins to extend (as a non-negative integer). Recommended range is in 5, ..., 20.

outFile

(optional) The filename without extension of the three result files optionally produced. See details below.

statFilter

(logical) Specifies whether non-significant topological-domain boundaries should be dropped or not.

...

Additional arguments passed to readHiC().

debug

If TRUE, debug output is produced.

Value

A named list of class TopDom with data.frame elements binSignal, domain, and bed.

If argument outFile is non-NULL, then the three elements (binSignal, domain, and bed) returned are also written to tab-delimited files with file names ‘<outFile>.binSignal’, ‘<outFile>.domain’, and ‘<outFile>.bed’, respectively. None of the files have row names, and all but the BED file have column names.

Windows size

The window.size parameter is by design the only tuning parameter in the TopDom method and affects the amount of smoothing applied when calculating the TopDom bin signals. The binning window extends symmetrically downstream and upstream from the bin such that the bin signal is the average window.size^2 contact frequencies. For details, see Equation (1) and Figure 1 in Shin et al. (2016). Typically, the number of identified TDs decreases while their average lengths increase as this window-size parameter increases (Figure 2). The default is window.size = 5 (bins), which is motivated as: "Considering the previously reported minimum TD size (approx. 200 kb) (Dixon et al., 2012) and our bin size of 40 kb, w[indow.size] = 5 is a reasonable setting" (Shin et al., 2016).

Author(s)

Hanjun Shin, Harris Lazaris, and Gangqing Hu. R package, help, and code refactoring by Henrik Bengtsson.

References

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
path <- system.file("exdata", package = "TopDom", mustWork = TRUE)

## Original count data (on a subset of the bins to speed up example)
chr <- "chr19"
pathname <- file.path(path, sprintf("nij.%s.gz", chr))
data <- readHiC(pathname, chr = chr, binSize = 40e3, bins = 1:500)
print(data)  ## a TopDomData object

## Find topological domains using the TopDom method
fit <- TopDom(data, window.size = 5L)
print(fit)  ## a TopDom object

## Display the largest domain
td <- subset(subset(fit$domain, tag == "domain"), size == max(size))
print(td) ## a data.frame

## Subset TopDomData object
data_s <- subsetByRegion(data, region = td, margin = 0.9999)
print(data_s)  ## a TopDomData object

vp <- grid::viewport(angle = -45, width = 0.7, y = 0.3)
gg <- ggCountHeatmap(data_s)
gg <- gg + ggDomain(td, color = "#cccc00") + ggDomainLabel(td)
print(gg, newpage = TRUE, vp = vp)

gg <- ggCountHeatmap(data_s, colors = list(mid = "white", high = "black"))
gg_td <- ggDomain(td, delta = 0.08)
dx <- attr(gg_td, "gg_params")$dx
gg <- gg + gg_td + ggDomainLabel(td, vjust = 2.5)
print(gg, newpage = TRUE, vp = vp)

## Subset TopDom object
fit_s <- subsetByRegion(fit, region = td, margin = 0.9999)
print(fit_s)  ## a TopDom object
for (kk in seq_len(nrow(fit_s$domain))) {
  gg <- gg + ggDomain(fit_s$domain[kk, ], dx = dx * (4 + kk %% 2), color = "red", size = 1)
}

print(gg, newpage = TRUE, vp = vp)


gg <- ggCountHeatmap(data_s)
gg_td <- ggDomain(td, delta = 0.08)
dx <- attr(gg_td, "gg_params")$dx
gg <- gg + gg_td + ggDomainLabel(td, vjust = 2.5)
fit_s <- subsetByRegion(fit, region = td, margin = 0.9999)
for (kk in seq_len(nrow(fit_s$domain))) {
  gg <- gg + ggDomain(fit_s$domain[kk, ], dx = dx * (4 + kk %% 2), color = "blue", size = 1)
}

print(gg, newpage = TRUE, vp = vp)

TopDom documentation built on May 6, 2021, 9:07 a.m.