Genome Organisation Through HiC

Share:

Description

GOTHiC performs a cumulative binomial test to detect interactions between distal genomic loci that have significantly more reads than expected by chance in Hi-C experiments. It takes mapped paired NGS reads as input and gives back the list of significant interactions for a given bin size in the genome.

Usage

1
2
3
4
5
GOTHiC(fileName1, fileName2, sampleName, res,
BSgenomeName='BSgenome.Hsapiens.UCSC.hg19',
genome=BSgenome.Hsapiens.UCSC.hg19, restrictionSite='A^AGCTT',
enzyme='HindIII',cistrans='all',filterdist=10000,
DUPLICATETHRESHOLD=1, fileType='BAM', parallel=FALSE, cores=NULL)

Arguments

fileName1

File containing the mapped reads of the first fragment ends (BAM or Bowtie format)

fileName2

File containing the mapped reads of the second fragment ends (BAM or Bowtie format)

sampleName

A character string that will be used to name the exported BedGraph file containing the coverage, R object files with paired and mapped reads, and the final data frame with the results from the binomial test. They will be saved in the current directory.

res

An integer that gives the required bin size or resolution of the contact map e.g. 1000000.

BSgenomeName

A character string of the BSgenome package required to make the restriction fragment file containing information for both the organism the experiment was made in, and the genome version the reads were mapped to. The default is the current human genome build 'BSgenome.Hsapiens.UCSC.hg19'.

genome

The BSgenome package required to make the restriction fragment file containing information for both the organism the experiment was made in, and the genome version the reads were mapped to. The default is the current human genome build BSgenome.Hsapiens.UCSC.hg19.

restrictionSite

A character string that specifies the enzymes recognition site, ^ indicating where the enzyme actually cuts. The default is the HindIII restriction site: 'A^AGCTT'.

enzyme

A character string containing the name of the enzyme used during the Hi-C experiment (i.e. "HindIII", "NcoI"). The default is "HindIII".

cistrans

A character string with three possibilities. "all" runs the binomial test on all interactions, "cis" runs the binomial test only on intrachromosomal/cis interactions, "trans" runs the binomial test only on interchromosomal/trans interactions.

filterdist

An integer specifying the distance between the midpoint of fragments under which interactions are filtered out in order to filter for those read-pairs where the digestion was incomplete. The default is 10000.

DUPLICATETHRESHOLD

An integer specifying the maximum amount of duplicated paired-end reads allowed, over that value it is expected to be PCR bias. The default is 1.

fileType

A character string specifying the format of the aligned reads. The default is 'BAM'. Other accepted format is 'Bowtie'.

parallel

Logical argument. If TRUE the mapping and the binomial test will be performed faster using multiple cores. The default is FALSE.

cores

An integer specifying the number of cores used in the parallel processing if parellel=TRUE. The default is NULL.

Value

A data.frame containing elements

chr1 / chr2

chromosome(s) containing interacting regions 1 and 2

locus1 / locus2

start positions of the interacting regions 1 and 2 in the corresponding chromosome(s)

relCoverage1 / relCoverage2

relative coverage corresponding to regions 1 and 2

probability

expected frequency

expected

expected number of reads

readCount

observed reads number

pvalue

binomial p-value

qvalue

binomial p-value corrected for multi-testing with Benjamini-Hochberg

logObservedOverExpected

observed/expected read numbers log ratio

Author(s)

Borbala Mifsud and Robert Sugar

See Also

binom.test, pairReads, mapReadsToRestrictionSites

Examples

1
2
3
4
5
6
7
8
library(GOTHiC)
dirPath <- system.file("extdata", package="HiCDataLymphoblast")
fileName1 <- list.files(dirPath, full.names=TRUE)[1]
fileName2 <- list.files(dirPath, full.names=TRUE)[2]
binom=GOTHiC(fileName1, fileName2, sampleName='lymphoid_chr20', res=1000000, 
BSgenomeName='BSgenome.Hsapiens.UCSC.hg18', genome=BSgenome.Hsapiens.UCSC.hg18, 
restrictionSite='A^AGCTT', enzyme='HindIII',cistrans='all', filterdist=10000, 
DUPLICATETHRESHOLD=1, fileType='Table', parallel=FALSE, cores=NULL)