This function takes the mapped positions of fragment pairs (Hi-C data) in a given format (supported formats are "nodup", "maq" or "sam") and the genome coordinates of all relevant regions (the segmentation), and writes pairwise contact maps for all chromosome pairs. A cell M[i,j] in the pairwise matrix generated for a pair of chromosomes, chromosome A and chromosome B, holds the number of interactions between region i in chromosome A and region j in chromosome B (A and B may be the same chromosome). To improve processing times, this function calls a Python executable, so users should verify that Python (> 2.6) is installed and added to their PATH.
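The definition of M[i,j] above can be illustrated with a minimal Python sketch. It is not the package's implementation: the function name `count_contacts` and the input tuple layout are hypothetical, and it assumes each fragment end has already been assigned a region index. Mirroring intra-chromosomal counts is also an assumption made here so that M[i,j] equals M[j,i] within one chromosome.

```python
from collections import defaultdict


def count_contacts(pairs):
    """Count interactions per chromosome pair and region pair.

    pairs: iterable of (chrom_a, i, chrom_b, j), where i and j are region
    indices already looked up in the segmentation (hypothetical input).
    Returns {(chrom_a, chrom_b): {(i, j): count}} with chrom_a <= chrom_b.
    """
    maps = defaultdict(lambda: defaultdict(int))
    for chrom_a, i, chrom_b, j in pairs:
        # Store each chromosome pair under one canonical key.
        if chrom_a > chrom_b:
            chrom_a, i, chrom_b, j = chrom_b, j, chrom_a, i
        maps[(chrom_a, chrom_b)][(i, j)] += 1
        # Assumption: within one chromosome the map is kept symmetric,
        # so an off-diagonal hit is mirrored.
        if chrom_a == chrom_b and i != j:
            maps[(chrom_a, chrom_b)][(j, i)] += 1
    return maps
```

A sparse dictionary is used here only to keep the sketch short; a dense matrix per chromosome pair would serve equally well for small segmentations.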
HiCFile | The name of the Hi-C file. See for example: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM455134
segFile | The name of the segmentation file. The file provides the genomic coordinates of each region. It should be a tab-delimited file with the following columns: chromosome, start, end, giving the chromosome name and the start and end positions of each region.
format | The format of the Hi-C file, taking one of the following values: "nodup", "maq" or "sam".
outputPrefix | The prefix of the output files generated by this function. The names of the two chromosomes that correspond to the output contact map are appended to each file name.
resolution | An integer value specifying the resolution of the given segmentation, if applicable. Specifically, if the segmentation file defines regions of the same size (for example: 1000000), this variable should be set accordingly. Otherwise it should be set to -1. Note that specifying the resolution greatly improves processing times.
header | Optional: a boolean specifying whether the segmentation file includes a header. Set to FALSE by default.
inclusive | Optional: a boolean specifying whether the segmentation is inclusive (i.e. whether the end position of one region overlaps with the start position of the next region). Set to FALSE by default.
verbose | Optional: a boolean specifying whether to report on the progress of the CIM build. Set to TRUE by default.
combineToSingle | Optional: a boolean specifying whether to also combine all the pairwise matrices into a single matrix and write it to a file. If set to TRUE, an additional file will be written, depending on available memory. Set to TRUE by default.
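To see why a known resolution can speed things up, the following Python sketch reads a segmentation file in the layout described for segFile and looks up the region containing a position. It is an illustration only, not the package's code: the names `read_segmentation` and `find_region` are hypothetical, and it assumes non-overlapping closed intervals [start, end] (the default, non-inclusive case). With a fixed resolution the index comes from a single integer division; with -1 a binary search over the region starts is needed.

```python
import bisect
import csv


def read_segmentation(path, header=False):
    """Read a tab-delimited file with columns: chromosome, start, end.

    Returns {chromosome: (starts, ends)} with both lists sorted by start.
    """
    regions = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        if header:
            next(reader, None)  # skip the header row
        for row in reader:
            if len(row) < 3:
                continue  # skip blank or malformed lines
            regions.setdefault(row[0], []).append((int(row[1]), int(row[2])))
    result = {}
    for chrom, spans in regions.items():
        spans.sort()
        result[chrom] = ([s for s, _ in spans], [e for _, e in spans])
    return result


def find_region(starts, ends, position, resolution=-1):
    """Return the index of the region containing position, or None."""
    if not starts:
        return None
    if resolution > 0:
        # Equal-sized regions: one integer division gives the index.
        idx = (position - starts[0]) // resolution
    else:
        # Variable-sized regions: binary search over the sorted starts.
        idx = bisect.bisect_right(starts, position) - 1
    if 0 <= idx < len(starts) and starts[idx] <= position <= ends[idx]:
        return idx
    return None
```

For example, with regions 0-999999 and 1000000-1999999 on one chromosome, position 1500000 resolves to index 1 under either lookup path.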
This function generates a file for every pairwise chromosomal interaction map from the given input. No value is returned.
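When combineToSingle is TRUE, the pairwise maps are additionally merged into one matrix. The Python sketch below shows one plausible way to picture that step as block assembly. The layout of the package's combined file is not documented here, so the chromosome ordering, the function name `combine_maps`, and the in-memory list-of-rows format are all assumptions.

```python
def combine_maps(pairwise, sizes):
    """Assemble pairwise matrices into one genome-wide block matrix.

    pairwise: {(chrom_a, chrom_b): matrix}, each matrix a list of rows
              with sizes[chrom_a] rows and sizes[chrom_b] columns.
    sizes:    {chrom: number_of_regions}; its order fixes the block order.
    """
    # Row/column offset of each chromosome's block in the combined matrix.
    offsets = {}
    total = 0
    for chrom, n_regions in sizes.items():
        offsets[chrom] = total
        total += n_regions

    combined = [[0] * total for _ in range(total)]
    for (chrom_a, chrom_b), matrix in pairwise.items():
        for i, row in enumerate(matrix):
            for j, value in enumerate(row):
                combined[offsets[chrom_a] + i][offsets[chrom_b] + j] = value
                if chrom_a != chrom_b:
                    # Fill the transposed block for the reverse pair.
                    combined[offsets[chrom_b] + j][offsets[chrom_a] + i] = value
    return combined
```

The combined matrix has one row and one column per region across all chromosomes, so its memory footprint grows quadratically with the number of regions, which is consistent with the note that the additional file depends on available memory.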
Users should note that for large Hi-C files (> 10 GB), the pre-processing time is typically long (30-60 minutes). To generate Hi-C mapped positions from raw fragment pairs, users should refer to related pipelines such as the HiCUP pipeline (http://www.bioinformatics.babraham.ac.uk/projects/hicup/). Additionally, different Hi-C data sets (raw fragment pairs and mapped positions) are publicly available from the Gene Expression Omnibus (GEO): http://www.ncbi.nlm.nih.gov/geo/
Yoli Shavit
http://www.cl.cam.ac.uk/~ys388/chromoR/
See also: correctCIM, correctPairCIM