footprintsScanner: Scan ATAC-seq footprints to infer factor occupancy genome-wide

View source: R/footprintsScanner.R

footprintsScanner    R Documentation

Scan ATAC-seq footprints to infer factor occupancy genome-wide

Description

Aggregate ATAC-seq footprints for a set of motifs over their candidate binding sites across the genome.

Usage

footprintsScanner(
  bamExp,
  bamCtl,
  indexExp = bamExp,
  indexCtl = bamCtl,
  bindingSitesList,
  seqlev = paste0("chr", c(1:25, "X", "Y")),
  proximal = 40L,
  distal = proximal,
  gap = 10L,
  maximalBindingWidth = NA,
  cutoffLogFC = log2(1.5),
  cutoffPValue = 0.05,
  correlatedFactorCutoff = 3/4
)

prepareBindingSitesList(
  pfms,
  genome,
  seqlev = paste0("chr", c(1:22, "X", "Y")),
  expSiteNum = 5000
)

Arguments

bamExp

A character vector giving the file names of the experiment BAM files. Each BAM file must contain shifted reads.

bamCtl

A character vector giving the file names of the control BAM files. Each BAM file must contain shifted reads.

indexExp, indexCtl

The names of the index files for the BAM files being processed, given without the '.bai' extension.

bindingSitesList

A GRangesList object of candidate binding sites (e.g., the output of FIMO).

seqlev

A character vector of sequence levels.

proximal, distal

numeric(1) or integer(1). Width in base pairs of the open region flanking the binding sites (proximal) and of the extended region used as background (distal) when aggregating the ATAC-seq footprint.

gap

numeric(1) or integer(1). Width in base pairs of the gaps between the binding sites, the proximal region, and the distal region. The default is 10L.

maximalBindingWidth

numeric(1) or integer(1). Maximal binding site width for all motifs. If set, all motif binding sites are resized to this value.

cutoffLogFC, cutoffPValue

numeric(1). Cutoff values (log2 fold change and p-value) for differential binding.

correlatedFactorCutoff

numeric(1). Cutoff value for correlated factors. If the proportion of binding sites that overlap within 100 bp exceeds this cutoff, the TFs are treated as correlated factors.

pfms

A list of position frequency matrices, each represented as a numeric matrix with row names A, C, G and T.

genome

An object of BSgenome.

expSiteNum

numeric(1). Expected number of predicted binding sites. If more binding sites are predicted than this number, only the top expSiteNum binding sites are used.
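
The pfms and expSiteNum arguments can be illustrated with a small base R sketch. The motif name toyMotif, the counts, and the site scores are made up for illustration, and ranking sites by a match score is an assumption; the criterion that prepareBindingSitesList actually uses to pick the top sites is not stated here.

```r
# Hypothetical 4 x 3 position frequency matrix; the counts are made up.
# Rows must be named A, C, G and T, as required for 'pfms'.
pfm <- matrix(
  c(8, 1, 0,   # A
    1, 7, 1,   # C
    0, 1, 9,   # G
    1, 1, 0),  # T
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "C", "G", "T"), NULL)
)
pfms <- list(toyMotif = pfm)

# Assumed illustration of the expSiteNum idea: when more sites are
# predicted than expSiteNum, keep only the top-ranked ones.
# The scores below are invented; the real ranking may differ.
scores <- c(site1 = 12.1, site2 = 7.4, site3 = 15.0, site4 = 9.9)
expSiteNum <- 2
top <- head(sort(scores, decreasing = TRUE), expSiteNum)
```

A list built this way has the shape described for pfms; whether a toy matrix like this yields any predicted sites on a real genome is a separate question.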

Value

A list with the following elements:

- bindingSites: GRanges of binding sites with read hits
- data: a list with the test result for each binding site
- results: a data.frame with the open score and enrichment score of the motifs
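
As a rough sketch of how the cutoffLogFC and cutoffPValue defaults could be applied to a table of test results, the base R snippet below filters a toy data.frame. The column names motif, logFC and pvalue are hypothetical and are not taken from the actual return value, and the use of an absolute fold change is an assumption.

```r
# Default cutoffs from the Usage section.
cutoffLogFC  <- log2(1.5)
cutoffPValue <- 0.05

# Toy results table; column names and values are hypothetical.
toy <- data.frame(
  motif  = c("m1", "m2", "m3"),
  logFC  = c(1.20, 0.30, -0.90),
  pvalue = c(0.001, 0.010, 0.200)
)

# Assumed rule: keep motifs whose absolute log2 fold change reaches the
# cutoff and whose p-value is below the p-value cutoff.
keep <- abs(toy$logFC) >= cutoffLogFC & toy$pvalue < cutoffPValue
selected <- toy$motif[keep]
```

Here only m1 passes both cutoffs: m2 fails the fold-change cutoff and m3 fails the p-value cutoff.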

Author(s)

Jianhong Ou

Examples


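# Example BAM file shipped with ATACseqQC
bamfile <- system.file("extdata", "GL1.bam",
                       package="ATACseqQC")
# Precomputed candidate binding sites shipped with ATACseqQC
bsl <- system.file("extdata", "jolma2013.motifs.bindingList.95.rds",
                  package="ATACseqQC")
bindingSitesList <- readRDS(bsl)
# Scan footprints on chr1 only
footprintsScanner(bamfile, seqlev="chr1", bindingSitesList=bindingSitesList)

# Alternatively, a binding sites list can be prepared from MotifDb motifs
library(MotifDb)
motifs <- query(MotifDb, c("Hsapiens"))
motifs <- as.list(motifs)
library(BSgenome.Hsapiens.UCSC.hg19)
# The call below is left commented out, as in the original example
#bindingSitesList <- prepareBindingSitesList(motifs, genome=Hsapiens)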

jianhong/ATACseqQC documentation built on Nov. 2, 2024, 12:08 a.m.