View source: R/differential_analysis.R
A HiCcomparator object stores Hi-C contact maps from two experiments and, optionally, TADs, and provides convenient access to contact matrices, A/B compartments, and TADs. It is constructed from npz files, each containing Hi-C maps as a Python dict of numpy matrices. A TAD set may also be supplied, as a list of data frames whose names match the names of the Hi-C matrices. Alternatively, TADs can be determined from the given Hi-C contact maps: from the first set only, from the second set only, or from both, taking the intersecting intervals between them.
path1 | character - path to npz file containing the first set of Hi-C maps
path2 | character - path to npz file containing the second set of Hi-C maps
tads | list (optional) - set of TADs as a named list of data frames, each with at least start and end columns
mtx.names | character vector - subset of Hi-C map names to be selected for analysis; by default all matrices are used
which.tads | numeric - what to do if no TADs are specified: 1 - determine TADs from the first set of Hi-C maps, 2 - determine TADs from the second set of Hi-C maps, 3 - determine TADs from both sets and take their intersection, 4 - do not determine TADs
do.pca | logical - whether to perform PCA on the given maps and determine A/B compartments
agg.diags | logical - whether to perform diagonal pooling (see details), TRUE by default
which.test | character - which test to perform during diagonal pooling (see details): the energy statistic based test (default) or "KS"
exclude.outliers | logical see
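As an illustration of the structure the tads argument is described as expecting, here is a minimal sketch in base R. The coordinates are made up, their units are not stated in this page, and check_tads is a hypothetical helper written for this example, not a DIADEM function.

```r
# Hypothetical TAD set: one data frame per Hi-C map name, each with
# (at least) start and end columns. The values below are invented.
tads <- list(
  "18" = data.frame(start = c(1, 120, 300), end = c(119, 299, 450)),
  "19" = data.frame(start = c(1, 80), end = c(79, 210))
)

# Hypothetical helper (not part of DIADEM): checks that every selected
# map name has a data frame with start and end columns.
check_tads <- function(tads, mtx.names) {
  stopifnot(is.list(tads), !is.null(names(tads)))
  missing.maps <- setdiff(mtx.names, names(tads))
  if (length(missing.maps) > 0) {
    stop("no TADs given for maps: ", paste(missing.maps, collapse = ", "))
  }
  for (nm in mtx.names) {
    df <- tads[[nm]]
    if (!is.data.frame(df) || !all(c("start", "end") %in% colnames(df))) {
      stop("TADs for map ", nm, " must be a data frame with start and end columns")
    }
    if (any(df$start > df$end)) {
      stop("TADs for map ", nm, " contain an interval with start > end")
    }
  }
  invisible(TRUE)
}

check_tads(tads, mtx.names = c("18", "19"))
```

A check of this kind can catch a name mismatch between the TAD list and the selected maps before the object is constructed.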
If agg.diags is TRUE, an attempt is made to pool diagonals with similar X, Y distributions. This increases the number of observations available for model fitting and extends the range of diagonals over which a model can be fit. Here X is the number of contacts in cell i,j of contact map 1 and Y is the number of contacts in cell i,j of contact map 2. The procedure is as follows:
1. Set k to 1 and take diagonal k.
2. Set l to k + 1 and take diagonal l.
3. Test the null hypothesis that the points X,Y of diagonal l were sampled from the X,Y distribution of diagonal k.
4. If the null hypothesis is rejected, i.e. the X,Y distributions of diagonals k and l differ, pool diagonals (k, ..., l-1), set k = l and go to step 2.
5. Otherwise set l = l + 1 and go to step 3.
6. Repeat steps 2-5 until all diagonals have been examined.
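The pooling loop can be sketched in a few lines of base R. This is an illustration of the procedure as described here, not the package's implementation: pool_diagonals, its arguments, the 0.05 threshold, and the toy p-value function are all assumptions made for the example. Each diagonal is represented as a two-column matrix of (X, Y) points.

```r
# Sketch of the diagonal pooling loop (hypothetical helper, not DIADEM code).
# diags:       list of two-column matrices, one per diagonal, columns X and Y
# p_value_fun: function(a, b) returning the p-value of a two-sample test
# alpha:       significance level used to reject equality (assumed value)
pool_diagonals <- function(diags, p_value_fun, alpha = 0.05) {
  n <- length(diags)
  pools <- list()
  k <- 1
  while (k <= n) {
    l <- k + 1
    # advance l while diagonal l is not significantly different from diagonal k
    while (l <= n && p_value_fun(diags[[k]], diags[[l]]) >= alpha) {
      l <- l + 1
    }
    # diagonals k, ..., l - 1 form one pool; restart from diagonal l
    pools[[length(pools) + 1]] <- k:(l - 1)
    k <- l
  }
  pools
}

# Toy stand-in for a real two-sample test: "rejects" when mean X differs a lot.
toy_p_value <- function(a, b) {
  if (abs(mean(a[, 1]) - mean(b[, 1])) < 5) 1 else 0
}

# Five toy diagonals: the first three look alike, the last two look alike.
toy_diags <- lapply(c(10, 10, 10, 50, 50), function(m) cbind(X = rep(m, 4), Y = rep(m, 4)))
pools <- pool_diagonals(toy_diags, toy_p_value)
print(pools)  # two pools: diagonals 1-3 and diagonals 4-5
```

Note that every diagonal in a pool is compared against the first diagonal of that pool (k), not against its immediate neighbour, which matches the steps above.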
To test the equality of the X,Y distributions of diagonals k and l, the energy statistic based test (from package energy) is used. Alternatively, a very crude, approximate way to speed up diagonal aggregation is to compute the product of X and Y and compare the univariate distributions of the products using the Kolmogorov-Smirnov test, instead of comparing the bivariate X,Y distributions. This option is selected by setting the which.test parameter to "KS".
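The two tests can be written as small p-value functions. The sketch below is an assumption about how such wrappers might look, not the package's code: the function names, the number of permutation replicates, and the simulated Poisson counts are invented for illustration. It uses eqdist.etest from the energy package (only if installed) and ks.test from stats.

```r
# Bivariate energy test on the (X, Y) points of two diagonals.
# a, b: two-column matrices of (X, Y); R: number of permutation replicates.
energy_p_value <- function(a, b, R = 199) {
  if (!requireNamespace("energy", quietly = TRUE)) {
    stop("package 'energy' is required for the energy test")
  }
  pooled <- rbind(a, b)
  energy::eqdist.etest(pooled, sizes = c(nrow(a), nrow(b)), R = R)$p.value
}

# Crude alternative: univariate KS test on the products X * Y.
ks_product_p_value <- function(a, b) {
  # contact counts are integers, so ties are expected; silence the tie warning
  suppressWarnings(stats::ks.test(a[, 1] * a[, 2], b[, 1] * b[, 2])$p.value)
}

# Simulated counts for two clearly different diagonals (invented data).
set.seed(1)
diag_a <- cbind(X = rpois(200, 20), Y = rpois(200, 25))
diag_b <- cbind(X = rpois(200, 60), Y = rpois(200, 70))

print(ks_product_p_value(diag_a, diag_b))
if (requireNamespace("energy", quietly = TRUE)) {
  print(energy_p_value(diag_a, diag_b))
}
```

The KS variant is cheaper because it needs no permutations, but collapsing X and Y into a product discards information: two different bivariate distributions can have similar product distributions.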
Returns an S3 object of class HiCcomparator.
See also: read_npz for reading npz files, do_pca for how A/B compartments are determined, map2tads for how TADs are determined, eqdist.etest for the energy statistic based test.
library(DIADEM)
# get path of first sample maps
mtx1.fname <- system.file("extdata", "IMR90-MboI-1_40kb-raw.npz", package = "DIADEM", mustWork = TRUE)
# get path of second sample maps
mtx2.fname <- system.file("extdata", "MSC-HindIII-1_40kb-raw.npz", package = "DIADEM", mustWork = TRUE)
# get sample TADs
tads <- DIADEM::sample_tads[c("IMR90-MboI-1_40kb-raw", "MSC-HindIII-1_40kb-raw")]
# construct HiCcomparator object for chromosomes 18 and 19
hic.comparator <- HiCcomparator(mtx1.fname, mtx2.fname, tads, mtx.names = c("18", "19"))
# plot A/B compartments for first and second map in chromosome 19
plot_pc_vector(hic.comparator$pc1.maps1[["19"]]) # first map
plot_pc_vector(hic.comparator$pc1.maps2[["19"]]) # second map