HiCcomparator: An S3 class representing an object for Hi-C map comparisons.

View source: R/differential_analysis.R

Description

A HiCcomparator object stores Hi-C contact maps from two experiments and, optionally, TADs, and provides convenient access to contact matrices, A/B compartments and TADs. It is constructed from npz files that contain Hi-C maps as a Python dict of NumPy matrices. A set of TADs may also be supplied (as a list of data frames whose names match those of the Hi-C matrices). Alternatively, TADs can be determined from the given Hi-C contact maps: from the first set only, from the second set only, or from both, in which case the intersecting intervals are taken.

Usage

HiCcomparator(
  path1,
  path2,
  tads = NULL,
  mtx.names = "all",
  which.tads = 4,
  do.pca = FALSE,
  agg.diags = TRUE,
  which.test = c("energy", "KS")[1],
  exclude.outliers = FALSE
)

Arguments

path1

character - path to npz file containing first set of Hi-C maps

path2

character - path to npz file containing second set of Hi-C maps

tads

list (optional), set of TADs as named list of data frames, each with at least start, end columns

mtx.names

character vector with subset of Hi-C maps names to be selected for analysis, by default all matrices are used

which.tads

numeric indicating what to do if no TADs are specified: 1 - determine TADs from first set of Hi-C maps, 2 - determine TADs from second set of Hi-C maps, 3 - determine from both sets and then take their intersection, 4 - do not determine TADs

do.pca

logical whether to perform PCA for given maps and determine A/B compartments

agg.diags

logical whether to perform diagonal pooling (see details), TRUE by default

which.test

character, which test to perform during diagonal pooling (see details): "energy" for the energy statistic based test (default) or "KS" for the Kolmogorov-Smirnov test

exclude.outliers

logical see aggregate_diagonals for details

Details

If agg.diags is TRUE, an attempt is made to pool diagonals with similar X, Y distributions. Pooling increases the number of observations available for model fitting and extends the range of diagonals over which a model can be fit. The procedure is as follows:

  1. set k to 1 and take diagonal k

  2. set l to k + 1 and take diagonal l

  3. test the null hypothesis that the points (X, Y) of diagonal l were sampled from the (X, Y) distribution of diagonal k, where X is the number of contacts in cell i,j of contact map 1 and Y is the number of contacts in cell i,j of contact map 2

    • if the null hypothesis is rejected, i.e. the (X, Y) distributions of diagonals k and l differ, pool diagonals (k, ..., l-1), set k = l and go to step 2

    • otherwise set l = l + 1 and repeat step 3

  4. repeat steps 2-3 until all diagonals are examined
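
The loop above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper names diag_pairs, differs_ks and pool_diagonals are hypothetical, diagonal k is taken to be the k-th off-diagonal, and a Kolmogorov-Smirnov test on the products X * Y stands in for the energy test so that no extra package is needed.

```r
# Paired contact counts (X from map 1, Y from map 2) on the k-th off-diagonal
# of two square matrices of equal size (hypothetical helper).
diag_pairs <- function(m1, m2, k) {
  idx <- which(col(m1) - row(m1) == k)
  cbind(X = m1[idx], Y = m2[idx])
}

# Stand-in test (assumption): TRUE when the products X * Y of the two
# diagonals differ according to a two-sample KS test at level alpha.
differs_ks <- function(a, b, alpha = 0.05) {
  p <- suppressWarnings(
    stats::ks.test(a[, "X"] * a[, "Y"], b[, "X"] * b[, "Y"])$p.value
  )
  p < alpha
}

# Greedy pooling: returns a list of integer vectors, one per pool of diagonals.
pool_diagonals <- function(m1, m2, max.diag, differs = differs_ks) {
  pools <- list()
  k <- 1
  while (k <= max.diag) {
    l <- k + 1
    # advance l while diagonal l is not distinguishable from diagonal k
    while (l <= max.diag &&
           !differs(diag_pairs(m1, m2, k), diag_pairs(m1, m2, l))) {
      l <- l + 1
    }
    pools[[length(pools) + 1]] <- k:(l - 1)  # pool diagonals k, ..., l-1
    k <- l                                   # restart from diagonal l
  }
  pools
}

# Usage on simulated maps whose expected counts decay with genomic distance
set.seed(1)
n <- 60
d <- abs(col(matrix(0, n, n)) - row(matrix(0, n, n)))
sim1 <- matrix(rpois(n * n, as.vector(200 / (d + 1))), n, n)
sim2 <- matrix(rpois(n * n, as.vector(150 / (d + 1))), n, n)
pools <- pool_diagonals(sim1, sim2, max.diag = 10)
```

Passing the test as the differs argument keeps the loop independent of the choice of test, so an energy based test can be substituted without touching the pooling logic.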

To test the hypothesis that the (X, Y) distributions of diagonals k and l are equal, an energy statistic based test (from the energy package) is used. Alternatively, a very crude, approximate way to speed up diagonal aggregation is to calculate the product of X and Y and compare the univariate distributions of the products with the Kolmogorov-Smirnov test instead of comparing the bivariate (X, Y) distributions. This option is selected by setting the which.test parameter to "KS".
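
To make the difference between the two tests concrete, the sketch below applies both to simulated paired counts for two diagonals. The data and the variable names (diag.k, diag.l, ks.p, energy.p) are made up for illustration, and the choice of R = 199 permutation replicates is arbitrary. The energy test runs only if the energy package is installed.

```r
set.seed(42)
# Hypothetical paired counts for two diagonals: X from map 1, Y from map 2
diag.k <- cbind(X = rpois(200, 50), Y = rpois(200, 40))
diag.l <- cbind(X = rpois(180, 20), Y = rpois(180, 15))

# Crude univariate alternative: KS test on the products X * Y
# (warnings about ties in count data are suppressed)
ks.p <- suppressWarnings(
  stats::ks.test(diag.k[, "X"] * diag.k[, "Y"],
                 diag.l[, "X"] * diag.l[, "Y"])$p.value
)

# Bivariate energy test on the pooled sample; sizes gives the size of each group
energy.p <- if (requireNamespace("energy", quietly = TRUE)) {
  energy::eqdist.etest(rbind(diag.k, diag.l),
                       sizes = c(nrow(diag.k), nrow(diag.l)),
                       R = 199)$p.value
} else {
  NA_real_  # energy package not available
}
```

The KS variant collapses each (X, Y) pair into a single number, so it can miss differences that leave the distribution of products unchanged. That loss of information is what it trades for speed.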

Value

S3 object of class HiCcomparator

See Also

read_npz for reading npz files, do_pca for how A/B compartments are determined, map2tads for how TADs are determined, eqdist.etest for the energy statistic based test

Examples

# get path of first sample maps
mtx1.fname <- system.file("extdata", "IMR90-MboI-1_40kb-raw.npz", package = "DIADEM", mustWork = TRUE)
# get path of second sample maps
mtx2.fname <- system.file("extdata", "MSC-HindIII-1_40kb-raw.npz", package = "DIADEM", mustWork = TRUE)
# get sample TADs
tads <- DIADEM::sample_tads[c("IMR90-MboI-1_40kb-raw", "MSC-HindIII-1_40kb-raw")]
# construct HiCcomparator object for chromosomes 18 and 19
# (do.pca = TRUE so that A/B compartments are determined for the plots below)
hic.comparator <- HiCcomparator(mtx1.fname, mtx2.fname, tads, mtx.names = c("18","19"), do.pca = TRUE)
# plot A/B compartments for first and second map in chromosome 19
plot_pc_vector(hic.comparator$pc1.maps1[["19"]]) # first map
plot_pc_vector(hic.comparator$pc1.maps2[["19"]]) # second map

rz6/DIADEM documentation built on Dec. 31, 2019, 3:51 a.m.