library(GenomicInteractionNodes)
library(rtracklayer)
library(TxDb.Hsapiens.UCSC.hg19.knownGene) 
library(org.Hs.eg.db)
knitr::opts_chunk$set(warning=FALSE, message=FALSE)

Introduction

The profiles of genome interactions can be categorized into three levels: loops, domains and compartments. Cis-regulatory elements (cREs) regulate the distal target gene expression through three-dimensional chromatin looping. Most of these loops occur within the boundaries of topologically associating domains (TADs). The TAD boundaries are enriched for insulator binding proteins. HiC analysis revealed the TADs can be segregated into an active (A) and inactive (B) compartment. TADs are megabase-sized chromosomal regions where the loop sizes may very from thousand bases to mega bases. We know that not all the loops within one TADs are involved in single binding platform. There are hundreds or even thousands binding platforms even in a single TAD. Batut et al. report the 'tethering element' help to establish long-range enhancer-promoter interactions. Tethering elements are the one kind of the binding platform which bring together enhancers and promoters for rapid gene activation. For each binding platform, multiple loops may work together to work as a single function, such as repression or promotion one or more distal target gene expression. That means multiple promoters and enhancers may bind together to perform as super enhancer or regulatory elements.

We defined such kind of binding platform as genomic interaction node or interaction hot spot. The interaction node can be a tethering element working as node, or a super-enhancer (or super regulatory element region). It is a level of genome organization higher than loops but smaller than TADs. It is a kind of tethering element with multiple interaction loops.

The interaction node has two attributes: 1. it must contain multiple interaction loops, 2. it regulates one or more target genes.

To help users to define the interaction nodes, we developed the Bioconductor package: GenomicInteractionNodes. The GenomicInteractionNodes package will define the interaction node by testing the involved loops within one connected component by Poisson distribution. The annotated loops will be used for enrichment analysis.

Installation

You can install the package via devtools::install_github from github.

library(devtools)
install_github("jianhong/GenomicInteractionNodes")

You can also try to install it via BiocManager::install when it is ready in Bioconductor.

library(BiocManager)
install("GenomicInteractionNodes")

Quick start

Here is an example to use GenomicInteractionNodes to define interaction nodes and do downstream enrichment analysis.

There are three steps,

  1. detectNodes, define the nodes.

  2. annotateNodes, annotate the nodes for downstream analysis.

  3. enrichmentAnalysis, Gene Ontology enrichment analysis.

## load library
library(GenomicInteractionNodes)
library(rtracklayer)
library(TxDb.Hsapiens.UCSC.hg19.knownGene) ## for human hg19
library(org.Hs.eg.db) ## used to convert gene_id to gene_symbol
library(GO.db) ## used for enrichment analysis

## import the interactions from bedpe file
p <- system.file("extdata", "WT.2.bedpe",
                 package = "GenomicInteractionNodes")
#### please try to replace the connection to your file path
interactions <- import(con=p, format="bedpe")

## define the nodes
nodes <- detectNodes(interactions)
names(nodes)
lapply(nodes, head, n=2)

## annotate the nodes by gene promoters
node_regions <- nodes$node_regions
node_regions <- annotateNodes(node_regions,
                        txdb=TxDb.Hsapiens.UCSC.hg19.knownGene,
                        orgDb=org.Hs.eg.db,
                        upstream=2000, downstream=500)
head(node_regions, n=2)

## Gene Ontology enrichment analysis
enrich <- enrichmentAnalysis(node_regions, orgDb=org.Hs.eg.db)
names(enrich$enriched)
names(enrich$enriched_in_compound)
res <- enrich$enriched$BP
res <- res[order(res$fdr), ]
head(res, n=2)

Session Info

sessionInfo()


jianhong/GenomicInteractionHub documentation built on April 14, 2022, 10:49 a.m.