MatToVec: Convert the Hi-C matrix format to vector format


View source: R/MatToVec.R

Description

The matrix format is the standard input for the HiCRep reproducibility analysis. It has dimension N*(3+N): the first three columns give the chromosome name and the mid-point coordinates of two contacting bins, and the remaining N columns hold the N*N contact matrix. The converted (vector) format has three columns: the first two are the mid-point coordinates of the two contacting bins, and the third is the number of reads for that pair of bins.

Usage

MatToVec(dat)

Arguments

dat

a Hi-C intra-chromosome matrix in N*N format (without the chromosome name and coordinate columns).
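
As a rough illustration of preparing such an N*N input, the sketch below builds a small toy table in the N*(3+N) layout and strips the first three columns. The data, the column names (`chr`, `start`, `end`) and the variable names are invented for this illustration. It assumes, as in the package example, that the third column minus half the resolution gives the bin mid-point.

```r
# Toy data for illustration only; not taken from the hicrep package.
resol <- 1000000   # assumed bin size (1 Mb)
n_bins <- 3

# Symmetric 3 x 3 contact matrix with made-up read counts
counts <- matrix(c(5, 2, 0,
                   2, 7, 1,
                   0, 1, 4), nrow = n_bins, byrow = TRUE)

# N*(3+N) layout: three annotation columns followed by the N*N counts
full <- data.frame(
  chr   = rep("chr1", n_bins),
  start = (seq_len(n_bins) - 1) * resol,
  end   = seq_len(n_bins) * resol,
  counts
)

# Keep only the N*N part and label rows/columns with bin mid-points
dat <- full[, -c(1, 2, 3)]
rownames(dat) <- colnames(dat) <- full[, 3] - resol / 2

dim(dat)
```

An object shaped like `dat` (N rows, N columns, mid-point coordinates as row and column names) is what the `dat` argument is described as expecting.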

Value

Vectorized Hi-C data with three columns. The first two columns are the mid-point coordinates of the two contacting bins. The third column is the number of reads for each contact.
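
To make the three-column layout concrete, here is a minimal base-R sketch of one way such a conversion could be written. `mat_to_vec_sketch` is a hypothetical helper written for this illustration, not the package's implementation. It assumes that the row and column names of the input hold the bin mid-points and that the first output column varies slowest, as in the example output on this page.

```r
# Hypothetical helper for illustration; not the hicrep implementation.
mat_to_vec_sketch <- function(mat) {
  mat <- as.matrix(mat)
  n_row <- nrow(mat)
  n_col <- ncol(mat)
  cbind(
    # column coordinate, repeated once per row (varies slowest)
    rep(as.numeric(colnames(mat)), each = n_row),
    # row coordinate, cycled once per column (varies fastest)
    rep(as.numeric(rownames(mat)), times = n_col),
    # counts read out in column-major order, matching the two columns above
    as.vector(mat)
  )
}

# Toy 3 x 3 contact matrix with made-up counts and 1 Mb bins
toy <- matrix(c(5, 2, 0,
                2, 7, 1,
                0, 1, 4), nrow = 3, byrow = TRUE)
rownames(toy) <- colnames(toy) <- c(500000, 1500000, 2500000)

vec <- mat_to_vec_sketch(toy)
dim(vec)
```

For an N*N input the result has N*N rows, one per bin pair, including pairs with zero reads.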

References

HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient. Tao Yang, Feipeng Zhang, Galip Gurkan Yardimci, Ross C Hardison, William Stafford Noble, Feng Yue, Qunhua Li. bioRxiv 101386; doi: https://doi.org/10.1101/101386.

Examples

data(HiCR1)

# Drop the three annotation columns, then label rows and columns
# with the bin mid-point coordinates
resol <- 1000000
ref_Rep1 <- HiCR1[, -c(1, 2, 3)]
rownames(ref_Rep1) <- colnames(ref_Rep1) <- HiCR1[, 3] - resol / 2

vec_HiC_R1 <- MatToVec(ref_Rep1)
head(vec_HiC_R1)

Example output

      [,1]    [,2] [,3]
[1,] 5e+05  500000    0
[2,] 5e+05 1500000    0
[3,] 5e+05 2500000    0
[4,] 5e+05 3500000    0
[5,] 5e+05 4500000    0
[6,] 5e+05 5500000    0

hicrep documentation built on April 28, 2020, 7:51 p.m.