calculateBsBackground: Compute background coverage for binding sites per gene

View source: R/differentialFunctions.R

calculateBsBackground R Documentation

Compute background coverage for binding sites per gene

Description

This function computes the background coverage used for the differential binding analysis to correct for transcript level changes. Essentially, the crosslink signal on each gene is split into crosslinks that can be attributed to the binding sites and all other signal that can be attributed to the background.

Usage

calculateBsBackground(
  object,
  anno.annoDB = NULL,
  anno.genes = NULL,
  blacklist = NULL,
  use.offset = TRUE,
  ranges.offset = NULL,
  match.geneID.gene = "gene_id",
  match.geneID.bs = "geneID",
  match.geneID.blacklist = "geneID",
  generate.geneID.bs = FALSE,
  generate.geneID.blacklist = FALSE,
  uniqueID.gene = "gene_id",
  uniqueID.bs = "bsID",
  uniqueID.blacklist = "bsID",
  force.unequalSites = FALSE,
  quiet = FALSE,
  veryQuiet = TRUE,
  ...
)

Arguments

object

a BSFDataSet object with two conditions

anno.annoDB

an object of class OrganismDbi that contains the gene annotation.

anno.genes

an object of class GenomicRanges that represents the gene ranges directly

blacklist

GRanges; genomic ranges where the signal should be excluded from the background

use.offset

logical; whether an offset region around each binding site should be used, within which the signal is excluded from the background

ranges.offset

numeric; width in nucleotides of the offset window around each binding site (if NULL, the default, half of the binding site width is used)

match.geneID.gene

character; the name of the column with the gene ID in the genes meta columns used for matching binding sites to genes

match.geneID.bs

character; the name of the column with the gene ID in the binding sites meta columns used for matching binding sites to genes

match.geneID.blacklist

character; the name of the column with the gene ID in the blacklist meta columns used for matching the blacklist regions with the genes

generate.geneID.bs

logical; whether binding sites should be matched to genes if no matching gene ID is provided

generate.geneID.blacklist

logical; whether blacklist regions should be matched to genes if no matching gene ID is provided

uniqueID.gene

character; column name of a unique ID for all genes

uniqueID.bs

character; column name of a unique ID for all binding sites

uniqueID.blacklist

character; column name of a unique ID for all blacklist regions

force.unequalSites

logical; whether binding sites of unequal width should be allowed

quiet

logical; whether to print messages or not

veryQuiet

logical; whether to print messages or not

...

additional arguments; passed to assignToGenes

Details

To keep crosslinks from binding sites from contaminating the background counts, a protective region can be spanned around each binding site with use.offset. The default width of the offset region is half of the binding site width, but it can be changed with the ranges.offset parameter.
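The split into binding site, offset and background signal can be illustrated with a small base R sketch. This is a conceptual toy only, not the package's implementation: the crosslink vector, the coordinates and all variable names are hypothetical, and rounding half of the binding site width down is an assumption made here for illustration.

```r
# Hypothetical crosslink counts along a 30-nt gene (toy data)
crosslinks <- c(0, 1, 0, 2, 0, 0, 1, 0, 3, 5, 9, 12, 8, 4, 2,
                1, 0, 0, 1, 0, 0, 2, 0, 1, 0, 0, 0, 1, 0, 0)

# One 7-nt binding site on that gene (hypothetical coordinates)
bs_start <- 10
bs_end   <- 16

# Offset of half the binding site width; rounding down is an assumption
offset <- floor((bs_end - bs_start + 1) / 2)

gene_pos     <- seq_along(crosslinks)
in_bs        <- gene_pos >= bs_start & gene_pos <= bs_end
in_protected <- gene_pos >= (bs_start - offset) & gene_pos <= (bs_end + offset)

# Signal inside the binding site
counts_bs <- sum(crosslinks[in_bs])
# Signal in the offset flanks: counted neither as binding site nor as background
counts_offset <- sum(crosslinks[in_protected & !in_bs])
# Everything outside the protected region is background
counts_bg <- sum(crosslinks[!in_protected])
# Total signal on the gene
counts_total <- sum(crosslinks)
```

In this toy case the flanks hold crosslinks that would otherwise inflate the background, which is the situation the offset region is meant to guard against.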

Additional regions that should not contribute to the background signal can be supplied as GRanges objects through the blacklist option.

It is expected that binding sites are assigned to hosting genes prior to running this function (see BSFind). This means a unique gene ID is present in the meta columns of each binding site range. If this is not the case, the binding site to gene assignment can be invoked with generate.geneID.bs. The same is true for the blacklist regions with option generate.geneID.blacklist.

It is expected that all binding sites are of the same size (see BSFind on how to achieve this). If this is not the case and binding sites of different width should be kept, the option force.unequalSites can be used.

This function is intended to be used for the generation of the count matrix used for the differential binding analysis. It is usually preceded by combineBSF and followed by filterBsBackground.

Value

an object of class BSFDataSet with counts for binding sites, background and the total gene added to the meta columns of the ranges

See Also

combineBSF, filterBsBackground

Examples

# load clip data
files <- system.file("extdata", package="BindingSiteFinder")
load(list.files(files, pattern = ".rda$", full.names = TRUE))
load(list.files(files, pattern = ".rds$", full.names = TRUE)[1])

# make binding sites
bds = makeBindingSites(bds, bsSize = 7)
bds = assignToGenes(bds, anno.genes = gns)

# change meta data
m = getMeta(bds)
m$condition = factor(c("WT", "WT", "KO", "KO"), levels = c("WT", "KO"))
bds = setMeta(bds, m)

# change signal
s = getSignal(bds)
names(s$signalPlus) = paste0(m$id, "_", m$condition)
names(s$signalMinus) = paste0(m$id, "_", m$condition)
bds = setSignal(bds, s)

# make example blacklist region
myBlacklist = getRanges(bds)
set.seed(1234)
myBlacklist = sample(myBlacklist, size = 500) + 4

# make background
bds.b1 = calculateBsBackground(bds, anno.genes = gns)

# make background - no offset
bds.b2 = calculateBsBackground(bds, anno.genes = gns, use.offset = FALSE)

# make background - use blacklist
bds.b3 = calculateBsBackground(bds, anno.genes = gns, blacklist = myBlacklist)


ZarnackGroup/BindingSiteFinder documentation built on May 31, 2024, 3:29 a.m.