filterBsBackground: Filter for genes not suitable for differential testing

filterBsBackground R Documentation

Filter for genes not suitable for differential testing

Description

This function removes genes to which the differential testing protocol cannot be applied. It uses count coverage information on the binding sites and background regions per gene and proceeds through the following steps:

  1. Remove genes with too few crosslinks overall: minCounts

  2. Remove genes with a disproportion of counts in binding sites vs. the background: balanceBackground

  3. Remove genes where the expression between conditions is too far out of balance: balanceCondition

Usage

filterBsBackground(
  object,
  minCounts = TRUE,
  minCounts.cutoff = 100,
  balanceBackground = TRUE,
  balanceBackground.cutoff.bs = 0.3,
  balanceBackground.cutoff.bg = 0.7,
  balanceCondition = TRUE,
  balanceCondition.cutoff = 0.05,
  match.geneID = "geneID",
  flag = FALSE,
  quiet = FALSE,
  veryQuiet = FALSE
)

Arguments

object

a BSFDataSet object with computed count data for binding sites and background regions

minCounts

logical; whether to use the minimum count filter

minCounts.cutoff

numeric; the minimum number of crosslinks per gene, summed over all samples, for the gene to be retained (default = 100)

balanceBackground

logical; whether to use the counts balancing filter between binding sites and background

balanceBackground.cutoff.bs

numeric; the maximum fraction of the total signal per gene that can be within binding sites (default = 0.3)

balanceBackground.cutoff.bg

numeric; the minimum fraction of the total signal per gene that must be within the background (default = 0.7)

balanceCondition

logical; whether to use the counts balancing filter between conditions

balanceCondition.cutoff

numeric; the maximum fraction of the total signal that can be attributed to only one condition

match.geneID

character; the name of the column with the gene ID in the binding sites meta columns used for matching binding sites to genes

flag

logical; whether to flag (TRUE) rather than remove (FALSE) binding sites from genes that fail any of the applied filters

quiet

logical; whether to print messages or not

veryQuiet

logical; whether to print messages or not

Details

To remove genes with too few crosslinks overall (minCounts), all counts are summed up per gene across all samples and compared to the minimum count threshold (minCounts.cutoff).
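
The minimum count filter can be illustrated with a small base R sketch. This is not the package's implementation: the toy data frame, its column names, and the use of `>=` for the comparison are assumptions made only for illustration.

```r
# Hypothetical per-gene, per-sample crosslink counts (not package data)
counts <- data.frame(
  geneID = c("g1", "g1", "g2", "g2", "g3", "g3"),
  sample = c("s1", "s2", "s1", "s2", "s1", "s2"),
  counts = c(80, 40, 30, 20, 500, 450)
)
minCounts.cutoff <- 100

# Sum all counts per gene across all samples
total <- tapply(counts$counts, counts$geneID, sum)

# Keep genes whose summed count reaches the cutoff
# (assumption: the comparison is inclusive)
keep <- names(total)[total >= minCounts.cutoff]
keep
```

In this toy example g2 has only 50 crosslinks in total and would be dropped, while g1 (120) and g3 (950) would be retained.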

To remove genes with a count disproportion between binding sites and background regions, crosslinks are summed up for binding sites and for background per gene. These sums are combined in a ratio. Genes where, e.g., 50% of all counts are within binding sites would be removed (see balanceBackground.cutoff.bs and balanceBackground.cutoff.bg).
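
A minimal sketch of the binding site vs. background balance check, using toy numbers in base R. The vectors `bs_counts` and `bg_counts` are hypothetical, and requiring both cutoffs to hold at once is an assumption about how the two thresholds are combined, not a statement about the package internals.

```r
# Hypothetical summed counts per gene (not package data)
bs_counts <- c(g1 = 20, g2 = 50, g3 = 10)  # counts within binding sites
bg_counts <- c(g1 = 80, g2 = 50, g3 = 90)  # counts within background

cutoff.bs <- 0.3  # maximum fraction allowed within binding sites
cutoff.bg <- 0.7  # minimum fraction required within background

# Fraction of the total per-gene signal in each compartment
frac_bs <- bs_counts / (bs_counts + bg_counts)
frac_bg <- bg_counts / (bs_counts + bg_counts)

# Assumption: a gene passes only if both conditions hold
pass <- frac_bs <= cutoff.bs & frac_bg >= cutoff.bg
pass
```

Here g2 carries 50% of its signal in binding sites and fails, matching the example in the text. Because the two fractions sum to one, the two conditions coincide whenever the cutoffs also sum to one.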

To remove genes with very large expression differences between conditions, crosslinks are summed up per gene for each condition. If, e.g., 98% of the total number of crosslinks is in one condition and only 2% of the combined signal is in the second condition, expression levels are too different for a reliable comparison (see balanceCondition.cutoff).
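
The condition balance check can be sketched in the same way. The condition names and counts below are hypothetical, and the rule used here (the smaller condition's share must be at least the cutoff) is one reading that is consistent with the 98% vs. 2% example and a cutoff of 0.05; the package may implement the comparison differently.

```r
# Hypothetical summed counts per gene and condition (not package data)
cond_counts <- rbind(
  g1 = c(ctrl = 980, treat = 20),
  g2 = c(ctrl = 600, treat = 400)
)
balanceCondition.cutoff <- 0.05

# Share of each gene's total signal that falls into each condition
frac <- cond_counts / rowSums(cond_counts)

# Assumption: a gene passes if its smaller share reaches the cutoff
pass <- apply(frac, 1, min) >= balanceCondition.cutoff
pass
```

In this toy example g1 has only 2% of its signal in the second condition and fails, whereas g2 (60% vs. 40%) passes.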

This function is intended to be used right after a call to calculateBsBackground.

Value

an object of class BSFDataSet with binding sites filtered or flagged according to the filter options above

See Also

calculateBsBackground, plotBsBackgroundFilter

Examples

# load clip data
files <- system.file("extdata", package="BindingSiteFinder")
load(list.files(files, pattern = ".rda$", full.names = TRUE))
load(list.files(files, pattern = ".rds$", full.names = TRUE)[1])

# make testset
bds = makeBindingSites(bds, bsSize = 7)
bds = assignToGenes(bds, anno.genes = gns)
bds = imputeBsDifferencesForTestdata(bds)
bds = calculateBsBackground(bds, anno.genes = gns, use.offset = FALSE)

# use all filters and remove binding sites that fail (default settings)
f0 = filterBsBackground(bds)

# do not use the condition balancing filter
f1 = filterBsBackground(bds, balanceCondition = FALSE)

# use only the minimum count filter and flag binding sites instead of
# removing them
f3 = filterBsBackground(bds, flag = TRUE, balanceCondition = FALSE,
 balanceBackground = FALSE)


ZarnackGroup/BindingSiteFinder documentation built on May 31, 2024, 3:29 a.m.