Hi-C is a powerful technique for studying genome organization. One of the goals of Hi-C data analysis is to determine which loci in the genome tend to interact. Unfortunately, Hi-C data are biased, so they need to be normalized before significant interactions can be identified. Several tools have been developed to normalize Hi-C data, but each of them produces a different data structure from the raw Hi-C data, and many tools that identify significant interactions work with only one or two of these data structures. To solve this problem, we developed a GOTHiC[1]-based tool for identifying significant interactions. This tool reads data from different tools such as HiCUP[2,3], HiC-Pro[4,5] and HOMER[6] and creates a background model.
This tool can identify significant interactions in Hi-C data that have been normalized by different tools such as HiC-Pro. In this part we describe the tools and data structures that it accepts.
Our tool accepts HiC-Pro, HOMER and HiCUP outputs. The HiC-Pro output consists of a matrix file with three columns (the reads file) and a bed file with four columns (the digest file). The HiCUP output consists of a SAM or text file with four columns and a digest file that lists the chromosome, fragment start position and fragment end position. In the reads file, any two separate rows with the same id define an interaction. To create this structure, you can use hicup2gothic from the HiCUP package.
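As an illustration of the HiC-Pro pair of files, the sketch below reads a three-column matrix and a four-column bed file and attaches genomic coordinates to every interaction. The toy data, the column names (bin1, bin2, count, chr, start, end, bin) and the join itself are assumptions made for this example; they are not taken from the MHiC source code.

```r
# Toy HiC-Pro-style inputs, inlined as text so the example is self-contained.
# Column names are labels chosen for this sketch, not names used by MHiC.
matrix_txt <- "1\t2\t12
1\t3\t4
2\t3\t7"
bed_txt <- "chr1\t0\t1000000\t1
chr1\t1000000\t2000000\t2
chr2\t0\t1000000\t3"

# Matrix file: index of bin 1, index of bin 2, interaction count.
counts <- read.table(text = matrix_txt, sep = "\t",
                     col.names = c("bin1", "bin2", "count"))

# Bed file: chromosome, bin start, bin end, bin index.
bins <- read.table(text = bed_txt, sep = "\t",
                   col.names = c("chr", "start", "end", "bin"),
                   stringsAsFactors = FALSE)

# Look up the genomic coordinates of both bins of every interaction.
idx1 <- match(counts$bin1, bins$bin)
idx2 <- match(counts$bin2, bins$bin)
interactions <- data.frame(
  chr1 = bins$chr[idx1], start1 = bins$start[idx1],
  chr2 = bins$chr[idx2], start2 = bins$start[idx2],
  count = counts$count,
  stringsAsFactors = FALSE
)
print(interactions)
```

With real data, the two inlined strings would be replaced by the paths of the matrix and bed files.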
This tool also accepts the HOMER interaction matrix output with interaction counts. This output is a tab-delimited text file in which coordinates are written as "chr-position" (figure 1).
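The "chr-position" labels can be split into a chromosome and a numeric position with base R. This is a minimal sketch on made-up labels; the variable names are hypothetical and the code is not the parser used inside MHiC.

```r
# Hypothetical bin labels in the "chr-position" form described above.
labels <- c("chr1-0", "chr1-1000000", "chrX-2000000")

# Split at the last hyphen so that chromosome names containing "-" still work.
chr <- sub("-[0-9]+$", "", labels)           # drop the trailing "-position"
pos <- as.numeric(sub("^.*-", "", labels))   # keep only the digits after it

coords <- data.frame(chr = chr, pos = pos, stringsAsFactors = FALSE)
print(coords)
```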
We developed this tool based on the GOTHiC method. It uses cumulative binomial tests to identify interactions between distal genomic loci that have significantly more reads than expected by chance in Hi-C experiments.
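To make the idea of a cumulative binomial test concrete, the sketch below computes an upper-tail p-value for each interaction in a toy table. The background probability (twice the product of the relative coverages of the two bins) and the Benjamini-Hochberg correction are assumptions in the spirit of GOTHiC; the code is not extracted from MHiC and the data are invented.

```r
# Toy interaction table: observed read counts between pairs of bins.
interactions <- data.frame(
  bin1  = c("A", "A", "B"),
  bin2  = c("B", "C", "C"),
  count = c(30L, 5L, 2L),
  stringsAsFactors = FALSE
)

# Relative coverage of a bin = share of all read ends that fall in it.
ends <- c(rep(interactions$bin1, interactions$count),
          rep(interactions$bin2, interactions$count))
coverage <- table(ends)
rel_cov <- coverage / sum(coverage)

# Assumed background model: probability that a random read pair joins
# the two bins, given only their coverage.
p_expected <- 2 * as.numeric(rel_cov[interactions$bin1]) *
                  as.numeric(rel_cov[interactions$bin2])

# Cumulative binomial test: P(X >= observed count) out of all read pairs.
n_reads <- sum(interactions$count)
p_value <- pbinom(interactions$count - 1, size = n_reads,
                  prob = p_expected, lower.tail = FALSE)

# Multiple-testing correction (Benjamini-Hochberg, assumed here).
q_value <- p.adjust(p_value, method = "BH")

print(cbind(interactions, p_expected, p_value, q_value))
```

The call to pbinom with lower.tail = FALSE gives the same p-value as binom.test with alternative = "greater", but it is vectorized over all interactions.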
In this section we describe the main functions (main, binomial_function, MHiC, get_Hic_data) and the parameters used in this tool.
MHiC is the main function of the tool. It takes Hi-C data from the user and returns the significant interactions for a given bin size.
Usage
MHiC(reads_file, Digest_file = NULL, sample_name, tools_name = NULL, res = 1000000, cistrans = "all", parallel = FALSE, cores = NULL, removeDiagonal = TRUE)
Arguments
reads_file: Path of the file that contains the reads information.
Digest_file: Path of the digest file that was used to map the reads.
sample_name: A character string that is used to name the output of this tool.
tools_name: A character string giving the name of the tool that produced the input; it selects the import method used in get_Hic_data.
res: An integer giving the required bin size (resolution) of the contact map.
cistrans: A character string with three possible values. "all" runs the binomial test on all interactions, "cis" runs the binomial test only on intrachromosomal (cis) interactions, and "trans" runs the binomial test only on interchromosomal (trans) interactions.
parallel: Logical argument. If TRUE, the binomial test is performed in parallel on multiple cores. This option is used to improve the runtime of the tool.
cores: An integer specifying the number of cores used for parallel processing if parallel=TRUE.
removeDiagonal: Logical argument. If TRUE, diagonal interactions are removed and the binomial test is applied only to non-diagonal interactions.
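The effect of the res, cistrans and removeDiagonal arguments can be pictured with a small filter on a table of read pairs. The helper bin_and_filter below is hypothetical and exists only for this illustration; in particular, treating a "diagonal" interaction as one whose two ends fall in the same bin is an assumption, not a statement about how MHiC implements it.

```r
# Hypothetical read pairs: one row per interaction, with both mapped positions.
pairs <- data.frame(
  chr1 = c("chr1", "chr1", "chr1", "chr2"),
  pos1 = c(150000, 1200000, 300000, 500000),
  chr2 = c("chr1", "chr1", "chr2", "chr2"),
  pos2 = c(900000, 3400000, 2500000, 4100000),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of MHiC: bins positions and filters pairs.
bin_and_filter <- function(pairs, res = 1000000, cistrans = "all",
                           removeDiagonal = TRUE) {
  # Assign each position to the start coordinate of its bin.
  pairs$bin1 <- floor(pairs$pos1 / res) * res
  pairs$bin2 <- floor(pairs$pos2 / res) * res

  # cis = both ends on the same chromosome, trans = different chromosomes.
  is_cis <- pairs$chr1 == pairs$chr2
  keep <- switch(cistrans,
                 all   = rep(TRUE, nrow(pairs)),
                 cis   = is_cis,
                 trans = !is_cis,
                 stop("cistrans must be 'all', 'cis' or 'trans'"))

  # Assumed meaning of "diagonal": both ends fall in the same bin.
  if (removeDiagonal) {
    keep <- keep & !(is_cis & pairs$bin1 == pairs$bin2)
  }
  pairs[keep, ]
}

print(bin_and_filter(pairs))                    # drops the same-bin pair
print(bin_and_filter(pairs, cistrans = "cis"))  # cis pairs only
```

At a resolution of 1000000, the first toy pair has both ends in the same bin, so it is the only one removed by the default call.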
binomial_function is the main processing function. It takes the mapped reads, applies the binomial test to the interactions and returns the significant interactions for a given bin size.
Usage
binomial_function(reads_file, tools_name, cistrans, parallel, cores, removeDiagonal)
Arguments
reads_file: Path of the file that contains the reads information.
tools_name: A character string giving the name of the tool that produced the input; it selects the import method used in get_Hic_data.
cistrans: A character string with three possible values. "all" runs the binomial test on all interactions, "cis" runs the binomial test only on intrachromosomal (cis) interactions, and "trans" runs the binomial test only on interchromosomal (trans) interactions.
parallel: Logical argument. If TRUE, the binomial test is performed in parallel on multiple cores. This option is used to improve the runtime of the tool.
cores: An integer specifying the number of cores used for parallel processing if parallel=TRUE.
removeDiagonal: Logical argument. If TRUE, diagonal interactions are removed and the binomial test is applied only to non-diagonal interactions.
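The parallel and cores arguments suggest that the per-interaction tests are distributed over several workers. One way this could be done in R is with mclapply from the base parallel package, as sketched below on invented counts and probabilities. This is an assumption about a possible approach, not a description of the mechanism MHiC actually uses.

```r
library(parallel)

# Toy inputs: observed counts, expected probabilities and total read count.
counts  <- c(30, 5, 2, 12, 1, 8)
probs   <- c(0.40, 0.09, 0.08, 0.10, 0.05, 0.20)
n_reads <- 60

# Upper-tail binomial p-value for a single interaction.
test_one <- function(i) {
  pbinom(counts[i] - 1, size = n_reads, prob = probs[i], lower.tail = FALSE)
}

# Forking is unavailable on Windows, so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Run the tests on several cores; mclapply returns results in input order.
p_parallel <- unlist(mclapply(seq_along(counts), test_one, mc.cores = cores))

# Serial reference computed in one vectorized call.
p_serial <- pbinom(counts - 1, size = n_reads, prob = probs, lower.tail = FALSE)

print(all.equal(p_parallel, p_serial))
```

For a handful of interactions the vectorized serial call is faster; parallel execution only pays off when the number of tests is large.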
get_Hic_data is the main import function. It takes the reads file and the digest file together with the tool name (which determines how the Hi-C data are read) and returns a structured Hi-C data set that is used by binomial_function.
Usage
get_Hic_data(reads_file, Digest_file = NULL, tools_name)
Arguments
reads_file: Path of the file that contains the reads information.
Digest_file: Path of the digest file that was used to map the reads.
tools_name: A character string giving the name of the tool that produced the input; it selects the import method used in get_Hic_data.
The main function is a guide for using MHiC from the command-line interface. It takes the required arguments from the user on the command line and calls the MHiC function with these arguments.
Full functionality requires R (tested with version 3.4.2). The tool works on most operating systems.
MHiC is written in R. To install MHiC from GitHub:
1. Install the devtools package:
    install.packages("devtools")
2. Load the devtools package:
    library(devtools)
3. Use install_github("author/package"):
    install_github("Skhakmardan/MHiC")
3.3. Examples:
In this part we describe an example of using the MHiC tool. Download MHiC_sample_data and copy the data folder into the MHiC folder path.
1. Use the MHiC function:
    library(MHiC)
    dirPath <- system.file("data", "HiC-Pro", package="MHiC")
    fileName1 <- list.files(dirPath, full.names=TRUE)[1]
    fileName2 <- list.files(dirPath, full.names=TRUE)[2]
    output <- MHiC(fileName2, fileName1, "dixon_2M_100000", "HiC_PRO", res = 1000000, cistrans = "all", parallel=FALSE, cores=NULL, removeDiagonal=TRUE)
[1] Mifsud, Borbala, Inigo Martincorena, Elodie Darbo, Robert Sugar, Stefan Schoenfelder, Peter Fraser, and Nicholas M. Luscombe. "GOTHiC, a probabilistic model to resolve complex biases and to identify real interactions in Hi-C data." PloS one 12, no. 4 (2017): e0174744.
[2] Wingett, Steven, Philip Ewels, Mayra Furlan-Magaril, Takashi Nagano, Stefan Schoenfelder, Peter Fraser, and Simon Andrews. "HiCUP: pipeline for mapping and processing Hi-C data." F1000Research 4 (2015).
[3] https://www.bioinformatics.babraham.ac.uk/projects/hicup
[4] Servant, Nicolas, Nelle Varoquaux, Bryan R. Lajoie, Eric Viara, Chong-Jian Chen, Jean-Philippe Vert, Edith Heard, Job Dekker, and Emmanuel Barillot. "HiC-Pro: an optimized and flexible pipeline for Hi-C data processing." Genome biology 16 (2015).
[5] https://github.com/nservant/HiC-Pro
[6] http://homer.ucsd.edu/homer