modelBest10Avg: Default parameters for logistic regression model in sevenC.

Description

This dataset contains the term names and estimates of a logistic regression model that predicts chromatin looping interactions. Each estimate is the average over the 10 best-performing models out of 124 models, one trained on each of 124 transcription factor ChIP-seq data sets from ENCODE.

Usage

modelBest10Avg

Format

An object of class data.frame with 7 rows and 2 columns holding the term name and estimate.

(Intercept)

The intercept of the logistic regression model.

dist

The genomic distance between the centers of motifs in base pairs (bp).

strandOrientationdivergent

Orientation of the motif pair: 1 if divergent, 0 otherwise.

strandOrientationforward

Orientation of the motif pair: 1 if forward, 0 otherwise.

strandOrientationreverse

Orientation of the motif pair: 1 if reverse, 0 otherwise.

score_min

Minimum of the motif hit scores of the two motifs in a pair. The motif score is defined as -log_10(p), where p is the p-value of the motif hit as reported by the JASPAR motif tracks.

cor

Pearson correlation coefficient of ChIP-seq signals across +/- 500 bp around CTCF motif centers.
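
The terms above combine in the usual logistic regression form: the linear predictor is the sum of each estimate times its feature value, and the predicted looping probability is the logistic function of that sum. The following is a minimal sketch of that arithmetic in Python, independent of the R package. The coefficient values are made-up placeholders, not the estimates stored in modelBest10Avg, and the function names are hypothetical. The sketch also assumes that a pair with none of the three orientation indicators set falls into the model's baseline orientation category.

```python
import math

# Placeholder estimates for illustration only. These are NOT the values
# stored in modelBest10Avg; only the term names are taken from the Format list.
HYPOTHETICAL_ESTIMATES = {
    "(Intercept)": -4.0,
    "dist": -1e-6,
    "strandOrientationdivergent": -2.0,
    "strandOrientationforward": -1.0,
    "strandOrientationreverse": -1.0,
    "score_min": 0.2,
    "cor": 3.0,
}


def motif_score(p_value):
    """Motif hit score as -log10 of the motif hit p-value."""
    return -math.log10(p_value)


def pearson(x, y):
    """Pearson correlation coefficient of two equally long signal vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def loop_probability(estimates, dist, orientation, score_min, cor):
    """Logistic regression prediction from term estimates and pair features."""
    features = {
        "(Intercept)": 1.0,
        "dist": float(dist),
        # Dummy coding: any other orientation leaves all three indicators at 0.
        "strandOrientationdivergent": 1.0 if orientation == "divergent" else 0.0,
        "strandOrientationforward": 1.0 if orientation == "forward" else 0.0,
        "strandOrientationreverse": 1.0 if orientation == "reverse" else 0.0,
        "score_min": float(score_min),
        "cor": float(cor),
    }
    eta = sum(estimates[term] * features[term] for term in estimates)
    # Numerically stable logistic function.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    return math.exp(eta) / (1.0 + math.exp(eta))


# Toy motif pair: two short signal vectors and two motif hit p-values.
cor = pearson([1.0, 2.0, 4.0, 3.0], [2.0, 4.0, 8.0, 6.0])
score_min = min(motif_score(1e-6), motif_score(1e-5))
prob = loop_probability(HYPOTHETICAL_ESTIMATES, dist=100000,
                        orientation="forward", score_min=score_min, cor=cor)
print(prob)
```

With the real estimates, the same arithmetic applies once the two columns of modelBest10Avg are read into a mapping from term name to estimate.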

Details

Each of 124 transcription factor (TF) ChIP-seq data sets from ENCODE in GM12878 cells was used to train a separate logistic regression model. All pairs of CTCF motifs in motif.hg19.CTCF within a distance of 1 Mb were used as candidates. A pair was labeled as a true loop interaction if it was supported by Hi-C loops in human GM12878 cells from Rao et al. 2014 or by ChIA-PET loops in the same cell type from Tang et al. 2015. The 10 best-performing models were selected by their average area under the precision-recall curve in 10-fold cross-validation. The parameters were then averaged across these 10 models.
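
The selection and averaging step described above can be sketched as follows. This is an illustration in Python, not the code used to build the dataset: the function name, the input layout, and the toy numbers are all hypothetical, and it assumes that "averaged" means the plain arithmetic mean of each term's estimate.

```python
from statistics import mean


def average_best_models(models, n_best=10):
    """Rank models by mean cross-validation auPRC and average the top ones.

    `models` maps a model name to a dict with two keys (hypothetical layout):
      "auprc_folds": list of per-fold areas under the precision-recall curve
      "estimates":   dict mapping term name to its estimate
    Returns the names of the selected models and the averaged estimates.
    """
    ranked = sorted(
        models,
        key=lambda name: mean(models[name]["auprc_folds"]),
        reverse=True,
    )
    best = ranked[:n_best]
    terms = list(models[best[0]]["estimates"])
    averaged = {
        term: mean(models[name]["estimates"][term] for name in best)
        for term in terms
    }
    return best, averaged


# Toy input with three made-up models and two folds each.
toy_models = {
    "TF_A": {"auprc_folds": [0.30, 0.34],
             "estimates": {"(Intercept)": -4.0, "cor": 2.0}},
    "TF_B": {"auprc_folds": [0.10, 0.12],
             "estimates": {"(Intercept)": -6.0, "cor": 0.5}},
    "TF_C": {"auprc_folds": [0.28, 0.30],
             "estimates": {"(Intercept)": -5.0, "cor": 3.0}},
}
best, averaged = average_best_models(toy_models, n_best=2)
print(best, averaged)
```

In the setting described here, the input would hold 124 models with 10 folds each, and n_best would stay at its default of 10.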

References

Suhas S.P. Rao, Miriam H. Huntley, Neva C. Durand, Elena K. Stamenova, Ivan D. Bochkov, James T. Robinson, Adrian L. Sanborn, Ido Machol, Arina D. Omer, Eric S. Lander, Erez Lieberman Aiden, A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping, Cell, Volume 159, Issue 7, 18 December 2014, Pages 1665-1680, ISSN 0092-8674, https://doi.org/10.1016/j.cell.2014.11.021.

Zhonghui Tang, Oscar Junhong Luo, Xingwang Li, Meizhen Zheng, Jacqueline Jufen Zhu, Przemyslaw Szalaj, Pawel Trzaskoma, Adriana Magalska, Jakub Wlodarczyk, Blazej Ruszczycki, Paul Michalski, Emaly Piecuch, Ping Wang, Danjuan Wang, Simon Zhongyuan Tian, May Penrad-Mobayed, Laurent M. Sachs, Xiaoan Ruan, Chia-Lin Wei, Edison T. Liu, Grzegorz M. Wilczynski, Dariusz Plewczynski, Guoliang Li, Yijun Ruan, CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription, Cell, Volume 163, Issue 7, 17 December 2015, Pages 1611-1627, ISSN 0092-8674, https://doi.org/10.1016/j.cell.2015.11.024.

See Also

cutoffBest10 and TFspecificModels


sevenC documentation built on Nov. 8, 2020, 5:35 p.m.