README.md
In rz6/DIADEM: Hi-C differential analysis via dependency modelling of chromatin interactions with generalized linear models

DIADEM

Authors: Rafal Zaborowski, Bartek Wilczynski

Institution: Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Poland

License: MIT + file LICENSE

For more information please contact r.zaborowski@mimuw.edu.pl or bartek@mimuw.edu.pl

DIADEM is an R package for differential analysis of Hi-C data. It takes advantage of the significant correlations between corresponding main diagonals of different Hi-C data sets (cell lines, experiments, etc.). The number of diagonals considered (the maximum genomic distance between interacting regions) depends on the chromosome and on data quality, but is usually about 5% of the total number of bins in the chromosome's contact map. DIADEM uses a generalized linear model (GLM) to model the relationship between corresponding cells of a pair of Hi-C data sets at a given genomic distance, and then quantifies deviations from the model in a probabilistic way. The only required input is raw Hi-C contact map files in numpy npz format.
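To make the idea concrete, here is a minimal sketch in base R. It is not DIADEM's implementation: it simulates two contact maps that share a per-bin bias, extracts the diagonal at one genomic distance from each, fits a Poisson GLM relating the corresponding cells, and converts deviations from the fit into two-sided tail probabilities. The helpers `get_diagonal` and `simulate_map`, the simulated data, and the choice of a Poisson family with a log link are all assumptions made for illustration; the model DIADEM actually fits is described in the vignette and the manuscript.

```r
set.seed(1)

# toy chromosome: number of bins and the rule-of-thumb number of diagonals (~5%)
n.bins <- 200
n.diagonals <- round(0.05 * n.bins)

# hypothetical helper: cells of m lying k bins above the main diagonal
get_diagonal <- function(m, k) m[col(m) - row(m) == k]

# expected contacts: a shared per-bin bias times a decay with genomic distance
bias <- runif(n.bins, min = 0.5, max = 2)
distance <- abs(outer(seq_len(n.bins), seq_len(n.bins), "-"))
rate <- outer(bias, bias) * 50 / (1 + distance)

# hypothetical helper: draw a symmetric matrix of Poisson counts
simulate_map <- function(rate) {
  m <- matrix(rpois(length(rate), rate), nrow = nrow(rate))
  m[lower.tri(m)] <- t(m)[lower.tri(m)]  # copy the upper triangle to the lower
  m
}
map.a <- simulate_map(rate)
map.b <- simulate_map(1.5 * rate)  # second data set, sequenced more deeply

# model cells of map.b as a function of the corresponding cells of map.a
k <- 3
x <- get_diagonal(map.a, k)
y <- get_diagonal(map.b, k)
fit <- glm(y ~ log1p(x), family = poisson(link = "log"))
mu <- fitted(fit)

# two-sided tail probability of each observed count under its fitted mean
p.lower <- ppois(y, mu)                          # P(Y <= y)
p.upper <- ppois(y - 1, mu, lower.tail = FALSE)  # P(Y >= y)
p.value <- pmin(1, 2 * pmin(p.lower, p.upper))

head(data.frame(x = x, y = y, fitted = round(mu, 2), p.value = signif(p.value, 3)))
```

Cells with small tail probabilities are those where the second map departs most from what the first map predicts at that genomic distance.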

For more details, examples and a quick start, refer to the vignette (invoke browseVignettes(package="DIADEM")). You can also browse the documentation of individual functions or objects within the package using standard R syntax (e.g. help(foo) or ?foo), or consult the reference manual. To produce it, run R CMD Rd2pdf path-to-package-directory from the command line, where path-to-package-directory is the location where the package has been installed, usually something like ~/R/x86_64-pc-linux-gnu-library/3.6.2/DIADEM. This creates the reference manual file DIADEM.pdf in the directory from which you ran the command.
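The reference manual step can also be scripted from within R. The sketch below looks up the installation directory with system.file() instead of hard-coding it, then assembles the R CMD Rd2pdf call. The helper name `rd2pdf_command` is hypothetical, and the command is only printed here, so nothing is built until you pass it to system2() yourself.

```r
# hypothetical helper: arguments for `R CMD Rd2pdf <installed package directory>`
rd2pdf_command <- function(pkg) {
  pkg.dir <- system.file(package = pkg)  # "" if the package is not installed
  if (!nzchar(pkg.dir)) {
    stop("package '", pkg, "' is not installed")
  }
  list(command = file.path(R.home("bin"), "R"),
       args = c("CMD", "Rd2pdf", shQuote(pkg.dir)))
}

# demonstrated on 'stats', which ships with R; use "DIADEM" once it is installed
cmd <- rd2pdf_command("stats")
cat(cmd$command, cmd$args, "\n")

# to actually build the PDF in the current working directory:
# system2(cmd$command, cmd$args)
```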

An in-depth description of our model, together with detailed analysis and motivation, is given in the manuscript available at: https://www.biorxiv.org/content/10.1101/654699v3.

The code is written in R, but data storage is handled with numpy, so the main requirements are (versions on which tests were performed are given in parentheses):

R (3.4.4)
python (2.7.12)
numpy (1.12.1)

Additionally, the following R packages are required:

magrittr
reticulate
Matrix
fields
parallel
igraph
raster
latex2exp
intervals
robustreg
robustbase
MASS
energy
AER
Rdpack

The following additional packages are required to run the examples, produce plots and build the vignette:

devtools
ggplot2
reshape2
bookdown
gridExtra
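The two package lists above can be checked against the current R library before installing DIADEM. The following is a small sketch using only base R; the helper `is_installed` and the variable names are illustrative, and the final install.packages() call is left commented out because some of the packages may not be available from your configured repositories.

```r
required.pkgs <- c("magrittr", "reticulate", "Matrix", "fields", "parallel",
                   "igraph", "raster", "latex2exp", "intervals", "robustreg",
                   "robustbase", "MASS", "energy", "AER", "Rdpack")
optional.pkgs <- c("devtools", "ggplot2", "reshape2", "bookdown", "gridExtra")

# hypothetical helper: TRUE for each package found in the library paths
# (system.file() returns "" for a package that is not installed)
is_installed <- function(pkgs) {
  vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
}

missing.required <- required.pkgs[!is_installed(required.pkgs)]
missing.optional <- optional.pkgs[!is_installed(optional.pkgs)]

if (length(missing.required) > 0) {
  message("Missing required packages: ", paste(missing.required, collapse = ", "))
}
if (length(missing.optional) > 0) {
  message("Missing optional packages: ", paste(missing.optional, collapse = ", "))
}

# uncomment to try installing whatever is missing:
# install.packages(c(missing.required, missing.optional))
```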

NOTE: Some of the above R packages require GSL (GNU Scientific Library). Before installation, make sure that libgsl-dev is installed (sudo apt-get install libgsl-dev on Ubuntu).

Two ways of installation are possible (both require the R package devtools to be installed):

- from the GitHub repository:

devtools::install_github("rz6/DIADEM", build_vignettes = TRUE)

- from source: clone the repository (:warning: NOTE: it must be cloned with the --recursive flag, i.e.: git clone --recursive https://github.com/rz6/DIADEM.git), by default to the directory diadem, cd to the directory containing the cloned repo, open R and run:

devtools::install("diadem", build_vignettes = TRUE)

:warning: This repository contains a submodule, which must be cloned as well for the package to compile. Therefore, this repository MUST be cloned with the --recursive flag.

Import the DIADEM package and list the functions inside it:

library("DIADEM")
getNamespaceExports("DIADEM")
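getNamespaceExports() returns every exported name in no particular order. As a small convenience, the sketch below sorts the exports and keeps only those that are functions. `list_exported_functions` is a hypothetical helper, not part of DIADEM, and it is demonstrated on the stats package (which ships with R) so that it runs even before DIADEM is installed.

```r
# hypothetical helper: sorted names of the functions exported by a package
list_exported_functions <- function(pkg) {
  ns <- asNamespace(pkg)  # loads the namespace if it is not loaded yet
  exports <- sort(getNamespaceExports(ns))
  is.fun <- vapply(exports,
                   function(nm) is.function(getExportedValue(ns, nm)),
                   logical(1))
  exports[is.fun]
}

funs <- list_exported_functions("stats")
head(funs)

# once DIADEM is installed:
# list_exported_functions("DIADEM")
# then use ?name on any entry to read its documentation
```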

A good introduction with some examples and a more precise description can be found in the vignette. To open it, call:

browseVignettes(package="DIADEM")

If the above does not render the vignette, you can find it in your package installation directory (typically something like ~/R/x86_64-pc-linux-gnu-library/3.6.2/DIADEM) under the doc/DIADEM.html path.
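Rather than guessing the installation directory, the doc/DIADEM.html file can be located from R with system.file(). This is a minimal sketch; `vignette_path` is a hypothetical helper name, and the message in the fallback branch is only a suggestion of what might have gone wrong.

```r
# hypothetical helper: path to an installed vignette HTML file, or "" if absent
vignette_path <- function(pkg, file = paste0(pkg, ".html")) {
  system.file("doc", file, package = pkg)
}

path <- vignette_path("DIADEM")
if (nzchar(path)) {
  browseURL(path)  # open the rendered vignette in the default browser
} else {
  message("Vignette not found; was the package installed with build_vignettes = TRUE?")
}
```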

Sample data

The DIADEM package contains sample Hi-C contact maps as a built-in R dataset.

They can be accessed as shown below:
- Hi-C contact maps in sparse format
  
```r
library("DIADEM")

# load the sample data and select the MSC-HindIII-1 dataset
# (the IMR90-MboI-1 dataset is also available)
data(sample_hic_data, package = "DIADEM")
msc.df <- sample_hic_data[["MSC-HindIII-1"]]

# prepare a temporary file in which to save the dense contact maps in npz format
mtx.fname.msc <- file.path(tempdir(), "MSC-HindIII-1_40kb-raw.npz")

# get chromosome sizes
chr.sizes <- sample_hic_data[["chromosome.sizes"]]

# convert each chromosome's contact map to dense matrix format
l <- lapply(names(msc.df), function(chromosome) {
  sparse2dense(msc.df[[chromosome]], N = chr.sizes[[chromosome]])
})
names(l) <- names(msc.df)

# save to npz file
save_npz(l, mtx.fname.msc)
```
- Reading Hi-C matrices from npz file
  
```r
# given the file name from the previous example, matrices in npz format
# can be read as follows
sparse.msc <- read_npz(mtx.fname.msc)

# or in dense format
dense.msc <- read_npz(mtx.fname.msc, sparse.format = FALSE)
```
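
For intuition about the sparse-to-dense step in the first example, here is a minimal base R sketch of such a conversion. It assumes a sparse contact map stored as a data frame of triplets with columns `i` and `j` (1-based bin indices) and `val` (contact count), holding each pair of bins once. These column names and the helper `triplets_to_dense` are assumptions for illustration and may not match the layout that DIADEM's sparse2dense expects.

```r
# hypothetical helper: expand (i, j, val) triplets into a symmetric N x N matrix
triplets_to_dense <- function(df, N) {
  m <- matrix(0, nrow = N, ncol = N)
  m[cbind(df$i, df$j)] <- df$val
  m[cbind(df$j, df$i)] <- df$val  # mirror, since contact maps are symmetric
  m
}

# toy contact map with 4 bins and 4 non-zero entries
toy <- data.frame(i = c(1, 1, 2, 3),
                  j = c(1, 3, 4, 3),
                  val = c(10, 2, 5, 7))
dense <- triplets_to_dense(toy, N = 4)
dense
```

The dense form costs memory proportional to the square of the number of bins, which is why the sparse representation is the natural one for storing whole-chromosome maps.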

rz6/DIADEM documentation built on Dec. 31, 2019, 3:51 a.m.
