TFEA: Transcription factor enrichment analysis

View source: R/TFEA.R

TFEAR Documentation

Transcription factor enrichment analysis

Description

Transcription factor enrichment analysis for ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing). We treat all the binding sites for one TF as a TF set and all the open regions as features for random walking.

Usage

TFEA(
  bamExp,
  bamCtl,
  indexExp = bamExp,
  indexCtl = bamCtl,
  positive = 4L,
  negative = 5L,
  bindingSites,
  proximal = 40L,
  distal = proximal,
  gap = 10L,
  filter = "proximalRegion>0",
  openscoreZcutoff = 0,
  bindingScoreLog2FCcutoff = 0,
  bindingScorePvalCutoff = 1
)

Arguments

bamExp

A vector of characters indicates the file names of experiment bams. The bam file must be the one with shifted reads.

bamCtl

A vector of characters indicates the file names of control bams. The bam file must be the one with shifted reads.

indexExp, indexCtl

The names of the index file of the 'BAM' file being processed; This is given without the '.bai' extension.

positive, negative

integer(1). the size to be shift for positive/negative strand. If the bam file is 5'end shifed files, please set the parameter to 0.

bindingSites

A object of GenomicRanges indicates candidate binding sites. The prepareBindingSites function is a helper function to generate the binding sites. Users can also use other software for example fimo to generate the list.

proximal, distal

numeric(1) or integer(1). bases for open region from binding sites (proximal) and extended region for background (distal) of the binding region for aggregate ATAC-seq footprint.

gap

numeric(1) or integer(1). bases for gaps among binding sites, proximal, and distal. default is 10L.

filter

An expression which, when evaluated in the context of assays(se), is a logical vector indicating elements or rows to keep. The expression results for each assay will be combined and use 'or' operator to filter the counts assays.

openscoreZcutoff

Open score Z value cutoff value. Default is 0. Open score is calculated by the count ratio of proximal site and distal site.

bindingScorePvalCutoff, bindingScoreLog2FCcutoff

Binding score cutoff values. Default is 1 and 0. Binding score is calculated by the count ratio of proximal site and binding site. The cutoff values are used to decrease the total number of binding site for ranking. Increasing the 'log2FCcutoff' value and decreasing the P-value cutoff value can greatly decrease the memory cost and computing time by decreasing the total binding sites.

Value

A TFEAresults object.

Author(s)

Jianhong Ou

Examples


bamExp <- system.file("extdata",
                      c("KD.shift.rep1.bam",
                        "KD.shift.rep2.bam"),
                      package="ATACseqTFEA")
bamCtl <- system.file("extdata",
                      c("WT.shift.rep1.bam",
                        "WT.shift.rep2.bam"),
                      package="ATACseqTFEA")
bsl <- system.file("extdata", "bindingSites.rds",
                   package="ATACseqTFEA")
bindingSites <- readRDS(bsl)
res <- TFEA(bamExp, bamCtl, bindingSites=bindingSites,
            positive=0, negative=0)
res

jianhong/ATACseqTFEA documentation built on Feb. 9, 2024, 4:09 a.m.